Description:
In a 4-node NDB Cluster deployment (1 mgmd + 4 combined SQL/data nodes), mysql.ndb_schema is missing on all MySQL Server nodes after a cluster instability/restart window (ERROR 1146: Table 'mysql.ndb_schema' doesn't exist). Only mysql.ndb_apply_status and mysql.ndb_binlog_index are present.
However, the mysqld error log for the same time window shows a “normal system restart” followed by an attempt to reinstall and create mysql.ndb_schema, including a message that the replication event already exists. This appears to leave the system in an inconsistent state: the schema distribution metadata actions are logged, but the table is not actually present or visible to SQL.
Evidence (log excerpts):
mgmd log (/tmp/mgmd_0949_0955.log), showing instability and mgmd stopping:
Lines 11–12: ... Arbitration error ... Node 4 is not responding
Line 2022: *** Received SIGTERM. Performing stop. ***
mysqld log on ndb1 (/tmp/ndb1_0949_0955.log), showing restart + util tables problems + reinstall attempts:
Lines 201–204 (cluster failure and util tables lost):
... cluster failure at epoch: ...
... Ndb kernel state: ...
... InnoDB: Could not recover util tables from NDB
... InnoDB: util tables lost, they need to be recreated
Lines 220–228 (normal restart + reinstall/create ndb_schema + existing event):
Detected a normal system restart
Table 'mysql.ndb_schema' need reinstall in DD
Creating table 'mysql.ndb_schema'
Event 'REPL$mysql/ndb_schema' for table 'mysql.ndb_schema' already exists
Binlog: logging ./mysql/ndb_schema (UPDATED,USE_WRITE)
SQL verification on all nodes (ndb1–ndb4), after the incident window:
Command executed:
for n in ndb1 ndb2 ndb3 ndb4; do
  docker exec -i $n bash -lc "mysql -uroot -e \"
    SELECT @@hostname, @@version, @@ndb_version;
    SHOW TABLES FROM mysql LIKE 'ndb_%';
    SHOW CREATE TABLE mysql.ndb_schema\G
    SELECT COUNT(*) AS ndb_schema_rows FROM mysql.ndb_schema;
  \""
done
Observed output (consistent on all nodes):
SHOW TABLES FROM mysql LIKE 'ndb_%'; returns only:
ndb_apply_status
ndb_binlog_index
SHOW CREATE TABLE mysql.ndb_schema fails with:
ERROR 1146 (42S02): Table 'mysql.ndb_schema' doesn't exist
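Note that the mysql client stops at the first SQL error when executing a batch, so in the verification loop above the final SELECT COUNT(*) is never reached once SHOW CREATE TABLE fails. A possible variant that continues past errors is sketched below. It is not the command that produced the output in this report. The function names are made up for illustration, and it assumes the same container names, passwordless root access, and that mysql is on the PATH inside each container.

```shell
#!/usr/bin/env bash
# Sketch only: helper names (verify_sql, check_node, check_all_nodes) are hypothetical.

# Emit the same verification statements used in this report.
verify_sql() {
  cat <<'SQL'
SELECT @@hostname, @@version, @@ndb_version;
SHOW TABLES FROM mysql LIKE 'ndb_%';
SHOW CREATE TABLE mysql.ndb_schema\G
SELECT COUNT(*) AS ndb_schema_rows FROM mysql.ndb_schema;
SQL
}

# Run the statements on one node, feeding the SQL on stdin to avoid nested quoting.
check_node() {
  local node="$1"
  # --force: keep executing after an SQL error
  # --table: tabular output even though input comes from a pipe
  # 2>&1:    capture error messages together with the results
  verify_sql | docker exec -i "$node" mysql -uroot --force --table 2>&1
}

# Run the check on every node given as an argument.
check_all_nodes() {
  local n
  for n in "$@"; do
    echo "== $n =="
    check_node "$n"
  done
}
```

Usage under the assumptions above: check_all_nodes ndb1 ndb2 ndb3 ndb4. With --force, both the SHOW CREATE TABLE and the SELECT COUNT(*) statements should each report their own error if the table is missing.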
How to repeat:
Deploy MySQL NDB Cluster with 1 mgmd and 4 nodes running both ndbd + mysqld.
Create and use several NDB tables (e.g., under schema mytest) and run concurrent DDL/DML workload.
Trigger or encounter a cluster restart/instability (e.g., data node non-response/arbitration issues; mgmd stop/start).
After restart, run the SQL verification loop above, which checks every mysqld node.
Observe mysql.ndb_schema missing (1146) on all nodes, while mysqld logs show reinstall/create attempts for mysql.ndb_schema.
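To check the log side of the last step, a small helper along these lines could count how often each message of the reinstall sequence appears in a mysqld error log. This is a rough sketch: the function name is hypothetical, and the message strings are copied from the excerpts in this report, so they may differ in other server versions.

```shell
#!/usr/bin/env bash
# Sketch only: scan_ndb_schema_log is a hypothetical helper name.
# Prints "<count><TAB><message>" for each message of the reinstall sequence.
scan_ndb_schema_log() {
  local log="$1" pattern count
  for pattern in \
    "Detected a normal system restart" \
    "Table 'mysql.ndb_schema' need reinstall in DD" \
    "Creating table 'mysql.ndb_schema'" \
    "Event 'REPL\$mysql/ndb_schema' for table 'mysql.ndb_schema' already exists"
  do
    # -F: match the message as a fixed string; -c: count matching lines.
    # grep exits non-zero when the count is 0, hence the "|| true".
    count=$(grep -cF -- "$pattern" "$log" || true)
    printf '%s\t%s\n' "$count" "$pattern"
  done
}
```

For example, scan_ndb_schema_log /tmp/ndb1_0949_0955.log on the log cited above. Non-zero counts for all four messages combined with ERROR 1146 from SQL would match the inconsistency described in this report.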
Suggested fix:
Investigate schema distribution / DD recovery logic after cluster restart, especially around util tables recovery and the reinstall flow for mysql.ndb_schema. Ensure that:
mysql.ndb_schema creation is durable and results in a visible table object in MySQL DD,
recovery handles the case where REPL$mysql/ndb_schema event already exists without leaving partial state,
util tables recovery does not silently leave the system without required schema distribution tables.