MySQL Bugs: #118467: MySQL user accounts lost after restoring backup on NDB data nodes

Bug #118467	MySQL user accounts lost after restoring backup on NDB data nodes
Submitted:	17 Jun 8:38	Modified:	18 Jun 10:39
Reporter:	ZHAO SONG
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	mysql-8.0.41	OS:	Any
Assigned to:	MySQL Verification Team	CPU Architecture:	Any

Description:
When restoring a backup on NDB data nodes, MySQL user accounts are lost.

I created a user in MySQL with the NDB_STORED_USER privilege, performed a backup, and then restored it. After the restore, the user I had created was missing. I investigated the source code and found several issues that cause this behavior. I managed to fix them locally, and I’d like to report my findings here:

Issues in ndb_restore:

1. There is a --include-stored-grants option in ndb_restore that is intended to restore stored grants, but its implementation is effectively missing—it has no impact in practice.

2. In the function is_included_sys_table() (Version 8.0.41, located at restore_main.cpp:1765), the check only matches the pattern "mysql/def/ndb_sql_metadata" but fails to include the internal table for the primary index (sys/def/XX/PRIMARY). As a result, the primary index is mistakenly excluded.

3. In the function do_restore() (restore_main.cpp:2413), the table mysql/def/ndb_sql_metadata is handled in a way that causes its primary index to be skipped during the rebuild process.

4. In the function BackupRestore::createSystable() (consumer_restore.cpp:2652), both mysql/def/ndb_sql_metadata and its corresponding internal primary index table (sys/def/XX/PRIMARY) are skipped, preventing the table from being created at all.

5. If we add these two tables back into BackupRestore::createSystable(), the subsequent call to dissect_table_name() (consumer_restore.cpp:2659) fails, as it cannot properly parse the name of sys/def/XX/PRIMARY.
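
Issues 2 and 5 both come down to how the two object names are recognized. The Python sketch below is only a model of that name handling, not the ndb_restore source (which is C++). The helper names is_stored_grants_object and split_table_name are hypothetical, the table ID 42 is made up, and the sketch assumes the parse failure comes from feeding a four-component name to a parser that expects three components, which the report does not spell out.

```python
from typing import Tuple

SQL_METADATA = "mysql/def/ndb_sql_metadata"


def primary_index_name(table_id: int) -> str:
    # Internal name of the primary index table, in the form quoted above.
    return f"sys/def/{table_id}/PRIMARY"


def is_stored_grants_object(name: str, table_id: int) -> bool:
    """True for the grants table itself and for its internal primary index."""
    return name in (SQL_METADATA, primary_index_name(table_id))


def split_table_name(name: str) -> Tuple[str, str, str]:
    """Naive parser that expects exactly database/schema/table."""
    parts = name.split("/")
    if len(parts) != 3:
        raise ValueError(
            f"cannot parse {name!r}: expected 3 components, got {len(parts)}"
        )
    return parts[0], parts[1], parts[2]


# The table name parses, the index name does not.
print(split_table_name(SQL_METADATA))
try:
    split_table_name(primary_index_name(42))
except ValueError as err:
    print(err)
```

A check that matches only the first name (as in issue 2) leaves the index out, and a parser like split_table_name (as in issue 5) has to special-case the index name before it can be used on both objects.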

After addressing the five issues above, the mysql.ndb_sql_metadata table and its records are correctly restored to the data nodes, which means the MySQL user information is recovered. However, two more issues remain on the NDB plugin side:

Issues in the NDB plugin:

1. If MySQLd is shut down before the restore, it fails to start up afterwards. This appears to be related to a previously reported issue: MySQL Bug #118337 (https://bugs.mysql.com/bug.php?id=118337). I applied a temporary workaround locally to bypass this. Once MySQLd starts, it correctly reads the restored user information, confirming that the fixes above are effective.

2. If MySQLd remains running during the restore, the situation becomes more complex. After the data nodes are restarted with '--initial', the Ndb_binlog_thread detects that the Ndb_schema_dist_table is missing and begins reinitialization. If Ndb_stored_grants::setup() is triggered before ndb_restore finishes restoring mysql.ndb_sql_metadata, the NDB plugin on MySQLd compares the local data dictionary (DD) with the (still empty) NDB DD and overwrites the restored user data. I worked around this by forcing the Ndb_binlog_thread to wait until ndb_restore finishes, which prevents the overwrite and allows MySQLd to retain the restored user information.
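
The workaround in item 2 is essentially an ordering gate: stored-grants setup must not run until the restore has finished. The Python sketch below models only that idea. It is not the NDB plugin code (which is C++), and wait_for_restore, setup_after_restore, restore_done and setup_stored_grants are hypothetical names. How the server would actually learn that ndb_restore has finished is left open here.

```python
import time
from typing import Callable


def wait_for_restore(restore_done: Callable[[], bool],
                     timeout_s: float = 600.0,
                     poll_s: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll restore_done() until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while not restore_done():
        if clock() >= deadline:
            return False  # gave up: the restore is still in progress
        sleep(poll_s)
    return True


def setup_after_restore(restore_done: Callable[[], bool],
                        setup_stored_grants: Callable[[], None],
                        **wait_opts) -> bool:
    """Run setup_stored_grants() only once the restore is reported complete."""
    if not wait_for_restore(restore_done, **wait_opts):
        return False  # skip setup rather than overwrite restored grants
    setup_stored_grants()
    return True
```

Waiting on a completion signal with a timeout, rather than sleeping for a fixed period, avoids guessing how long the restore will take.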

How to repeat:
1. Create a user in MySQL.

2. Grant the NDB_STORED_USER privilege to this user.

3. Take a backup using ndb_mgm on the management node.

4. Shut down MySQLd (or optionally leave it running).

5. Restart and reset all data nodes using the '--initial' parameter.

6. Restore the backup to the data nodes in the following order:

  a. Restore metadata

  b. Disable indexes

  c. Restore data

  d. Rebuild indexes

7. If MySQLd was shut down in step 4, restart it. It will hang during startup due to Bug #118337. If you apply a temporary workaround to let MySQLd start, you’ll find that the user created in step 1 is missing.
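
The restore order in step 6 can be sketched as a list of ndb_restore invocations. This is a minimal sketch under stated assumptions, not the exact commands used in the report: the backup ID, backup path, node IDs and connect string are placeholders, running the metadata, disable-indexes and rebuild-indexes steps against the first node only is an assumption, and so is passing --include-stored-grants on both the metadata and data steps. The helper restore_commands is hypothetical; the ndb_restore options themselves are standard ones.

```python
from typing import List, Sequence


def restore_commands(backup_id: int, backup_path: str,
                     node_ids: Sequence[int],
                     connectstring: str) -> List[List[str]]:
    """Build ndb_restore argument lists in the order given in step 6."""
    if not node_ids:
        raise ValueError("at least one data node ID is required")

    def base(node_id: int) -> List[str]:
        return ["ndb_restore",
                f"--ndb-connectstring={connectstring}",
                f"--nodeid={node_id}",
                f"--backupid={backup_id}",
                f"--backup-path={backup_path}"]

    first = node_ids[0]
    cmds = [
        # a. Restore metadata (once, from the first node's backup files)
        base(first) + ["--restore-meta", "--include-stored-grants"],
        # b. Disable indexes
        base(first) + ["--disable-indexes"],
    ]
    # c. Restore data, once per data node that took part in the backup
    cmds += [base(n) + ["--restore-data", "--include-stored-grants"]
             for n in node_ids]
    # d. Rebuild indexes
    cmds.append(base(first) + ["--rebuild-indexes"])
    return cmds


# Placeholder values; print the commands instead of running them.
for argv in restore_commands(1, "/path/to/BACKUP-1", [1, 2], "mgmhost:1186"):
    print(" ".join(argv))
```

Each argument list can be passed to subprocess.run(argv, check=True) to execute the steps in sequence and stop at the first failure.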

Suggested fix:
My local fix addresses all the issues I mentioned above. However, based on what I’ve seen in the source code, ndb_restore is only designed to restore two system tables: mysql.ndb_schema and mysql.ndb_apply_status. It seems that restoring other system tables, such as mysql.ndb_sql_metadata, was never part of its original design. So this might require a design-level change rather than just code fixes.

I’d be happy to share my local patches if you think they’d be helpful for reference.

By the way, I couldn’t find any official documentation that clearly explains how to properly perform a backup and restore. It's quite confusing—for example, users don’t even know whether MySQLd should be shut down during the restore process, and either choice currently leads to issues in this case.

In scenario 2 ("If MySQLd remains running during the restore, ..."), is the restore started before the MySQL servers have completed their initialization?
Does the problem remain if one first waits for the MySQL servers to complete initialization and then starts the restore?

A1:
In both scenarios I tested, I did not re-initialize MySQL. From what I can tell after reading the code, there doesn’t seem to be any strict requirement to re-initialize MySQL when restoring data nodes. Is there a specific reason or documented case where re-initialization is required?

A2:
Regarding Bug #118337, re-initializing MySQL can serve as a workaround:

  a. Reset the data nodes (--initial)

  b. Re-initialize MySQL (e.g., remove datadir and run --initialize-insecure)

  c. Start MySQL so it can create its internal tables (e.g., ndb_index_stat)

  d. Shut down MySQL

  e. Restore data on the data nodes

In this approach, steps (b) and (c) ensure that the ndb_index_stat_head table and related events are properly initialized, which avoids the hang when MySQL is started again after the restore in step (e).

However, this workaround does not help with the user loss issue. The user data in mysql.ndb_sql_metadata will still be missing after the restore unless the five issues in the ndb_restore code I mentioned earlier are fixed.

I see that the following bug is fixed in 8.0.42: Bug#117230, ndb_restore "--include-stored-grants" not restoring "ndb_sql_metadata" table.

Does your problem also persist with 8.0.42?

Just did a quick review of the 8.0.42 code. It appears to include a fix similar to my local one — all 5 code locations I mentioned above have been updated. So, the issue on the data node side seems to be resolved.

However, I haven’t seen any changes related to the MySQL hang issue (https://bugs.mysql.com/bug.php?id=118337). I suspect that in 8.0.42, MySQL may still hang during restart after restoring the data nodes.

Update on testing MySQL 8.0.42:

a. Scenario 1: Restore while MySQL is down

  ndb_restore is now able to restore data from mysql.ndb_sql_metadata. However, MySQL still hangs on restart due to Bug #118337. I applied my local workaround (modifying Ndb_util_table::create() and Util_table_creator::create_or_upgrade_in_NDB()), and with that, MySQL restarts successfully and retrieves the user information.

  So I think once bug 118337 is fixed upstream, this scenario should work as expected.

b. Scenario 2: Restore while MySQL is running

  Even though ndb_restore is restoring data on the data nodes, the Ndb_binlog_thread detects a cluster restart and triggers a re-setup of Ndb_stored_grants, wiping out user data in the local data dictionary.

  Quick workaround: Introduce a delay in Ndb_binlog_setup after detecting "initial system restart" to give ndb_restore time to complete.

Conclusion:
  8.0.42 resolves part of the issue (roughly 1 out of 3 key areas)

Closing the report, as this is fixed in 8.0.42.