Bug #112774 Metadata: Failed to submit table 'mysql.ndb_apply_status' for synchronization
Submitted: 19 Oct 2023 11:43 Modified: 9 Jan 2024 10:18
Reporter: Marek Kretkiewicz
Status: Duplicate
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 8.0.34
OS: Red Hat
CPU Architecture: x86

[19 Oct 2023 11:43] Marek Kretkiewicz
Description:
It looks like a harmless issue; however, the DB quite frequently writes log entries like this:

Metadata: Failed to submit table 'mysql.ndb_apply_status' for synchronization.

The table exists and is being updated with proper data. We restored the DB from backup to a specific point in time and switched replication channels with this table; no problem with data or replication was noticed.

However, we see ~1440 such lines per day (roughly one per minute), so it's a lot.

The table uses the ndb engine; maybe the issue is that mysqld tries to submit a new table with the innodb engine but fails? However, the ndb engine is allowed for this table according to the documentation.
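
A quick way to confirm which engine the table actually uses (illustrative check on one of the SQL nodes):

SHOW CREATE TABLE mysql.ndb_apply_status;

For us this shows ENGINE=ndbcluster, as expected.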

How to repeat:
Please deploy the ndb engine from the mysql-cluster-8.0.34-linux-glibc2.28-x86_64.tar.gz package. We have 2 servers with ndbd, and 2 servers with mysqld and ndb_mgmd on each, so there are 2 mysqld and 2 ndb_mgmd processes in the cluster.
[20 Oct 2023 3:05] MySQL Verification Team
Hi,

Have you configured your ndb-cluster for replication? The ndb_apply_status table exists on cluster replicas, and since you say you have only one cluster (so no replicas) I do not see where you got this table from. Did you have cluster replication at some point?

Maybe you restored this by mistake? If you look at the documentation:
https://dev.mysql.com/doc/mysql-cluster-excerpt/8.0/en/mysql-cluster-replication-schema.ht...

[quote]
ndb_apply_status is used to record which epoch transactions have been replicated and applied to a replica cluster from an upstream source. This information is captured in an NDB online backup, but (by design) it is not restored by ndb_restore. In some cases, it can be helpful to restore this information for use in new setups; beginning with NDB 8.0.29, you can do this by invoking ndb_restore with the --with-apply-status option. See the description of the option for more information. 
[/quote]

Maybe you restored it with --with-apply-status?
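
If you are not sure, you can simply check what the table currently contains on each cluster (illustrative query; on a freshly created cluster that has never applied replication or restored a backup it should normally be empty):

SELECT * FROM mysql.ndb_apply_status;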
[20 Oct 2023 3:08] MySQL Verification Team
If you look at: https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-replication-issues.html

you can change the engine of this table to MyISAM.
[20 Oct 2023 8:03] Marek Kretkiewicz
Hi.
Sorry, I wasn't fully clear.

Of course, we do have 2 clusters in a master-master configuration.
Replication works fine without any problems; I can see changes in ndb_apply_status after replication activity, but despite that this message still appears.
I will check your 2nd update and the similar case.
[20 Oct 2023 8:13] Marek Kretkiewicz
Just noticed that it wasn't a case but a link to the documentation, which I've already read.

May I ask why you recommend changing to MyISAM? As mentioned before, I do have master-master replication, and in the documentation it says:

"
This can be done by issuing a statement such as ALTER TABLE mysql.ndb_apply_status ENGINE=MyISAM on the replica. It is safe to do this when using a storage engine other than NDB on the replica, since you do not need to worry about keeping multiple replicas synchronized.
"

But I do have ndb engine. 
And also:

"Change the storage engine used for the mysql.ndb_apply_status table.  Causing this table to use an engine that does not handle its own binary logging can also eliminate the conflict. This can be done by issuing a statement such as ALTER TABLE mysql.ndb_apply_status ENGINE=MyISAM on the replica. It is safe to do this when using a storage engine other than NDB on the replica, since you do not need to worry about keeping multiple replicas synchronized."

But our replica is also a master, so it is processing its own binary logging.
[20 Oct 2023 8:22] Marek Kretkiewicz
And one important piece of information:
this message has been displayed since the first start of the cluster - I do not remember whether it appeared just after I set up replication, or even before that.

I set up this environment with an Ansible playbook, and it automatically set the ndb engine as the default storage engine for mysqld before mysqld was initialized. Initialization was done later, when this parameter was already set.
[20 Oct 2023 18:22] MySQL Verification Team
Hi,

> Of course we do have 2 clusters in master-master configuration.

That now makes sense; you said you have 2 data nodes, 2 mgm nodes and 2 sql nodes, but you did not mention the other cluster. Can you share the full config from both?

Do you see the same warnings on both clusters or only on one?

Did you recently do an ndb_restore, after which this started filling the logs?
[20 Oct 2023 18:24] MySQL Verification Team
Hi,

> May I ask why you recommend changing to MyISAM?

I do not!

It does make sense on a replica (slave) that does not use ndbcluster, where you replicate the cluster into InnoDB tables. Or it could solve the issue of a log filling up on a non-replication system - and you did not say you have replication between clusters :D you only said you have one cluster. Since you have cluster replication (2 clusters), no, you should not MyISAM the ndb_... tables :)
[20 Oct 2023 18:28] MySQL Verification Team
> I set up this environment with an Ansible playbook, and it automatically set the ndb engine as the default storage engine for mysqld before mysqld was initialized. Initialization was done later, when this parameter was already set.

I do not have experience with Ansible, but it might have messed something up in the configuration. Does it use MCM, or does it run the cluster "manually"? How does it set up replication: does it take a backup and then run ndb_restore on the second server, or is it always two virgin servers that you simply configure as master-master while empty and then start using? ndb_restore can mess up this table, especially in master-master.

Please upload your full config from both clusters so I can check. Also, as I wrote a few minutes ago, it is important to know whether this is an issue on both clusters or only on one.

all best
[30 Oct 2023 10:11] Marek Kretkiewicz
The cluster is the Community edition without MCM.
Replication was set up manually, on an empty cluster without any data at the beginning, and later with a CHANGE MASTER statement like this one:

CHANGE MASTER TO
MASTER_HOST='cluster_2',
MASTER_USER='replication_user',
MASTER_PASSWORD='replication_pass',
MASTER_LOG_FILE='binlog.000001',
MASTER_LOG_POS=123;

Of course, users and proper grants were created at the beginning.
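
For reference, the replication user was created with statements along these lines (illustrative sketch; the exact host part and privileges may have differed slightly):

CREATE USER 'replication_user'@'%' IDENTIFIED BY 'replication_pass';
GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%';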

So no ndb_restore was performed. We set up new clusters without any data inside, and just after that this entry was seen in the logs.

What is interesting is that I've set up 4 different clusters, and on all of them we have the same situation. I have used the same procedure many times before and never had such a warning.

Below are config.ini and my.cnf from both clusters. If you want anything more, I will share it with you.

[root@cluster_1 ~ ] # cat /data/mysql/mysql-cluster/config.ini
[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2    # Number of replicas
DataMemory=12G    # How much memory to allocate for data storage
IndexMemory=4G
MaxNoOfConcurrentOperations=256000
MaxNoOfOrderedIndexes=1024
MaxNoOfAttributes=50000
RedoBuffer=64M
MaxNoOfConcurrentScans=500
MaxNoOfTables=512
MaxNoOfTriggers=2048
MaxNoOfUniqueHashIndexes=2048
FragmentLogFileSize=256M
BackupDataDir=/data/backup/mysql/ndbd

[ndb_mgmd]
# Management process options:
HostName=192.168.1.16         # Hostname or IP address of MGM node
NodeId=49
DataDir=/data/mysql/mysql-cluster  # Directory for MGM node log files

[ndb_mgmd]
# Management process options:
HostName=192.168.1.17          # Hostname or IP address of MGM node
NodeId=50
DataDir=/data/mysql/mysql-cluster  # Directory for MGM node log files

[ndbd]
# Options for data node "A":
                                # (one [ndbd] section per data node)
HostName=192.168.1.18          # Hostname or IP address
NodeId=1                       # Node ID for this data node
DataDir=/data/mysql/data        # Directory for this data node's data files

[ndbd]
# Options for data node "B":
HostName=192.168.1.19         # Hostname or IP address
NodeId=2                        # Node ID for this data node
DataDir=/data/mysql/data        # Directory for this data node's data files

[mysqld]
# SQL node options:
HostName=192.168.1.16          # Hostname or IP address
NodeId=51

[mysqld]
# SQL node options:
HostName=192.168.1.17        # Hostname or IP address
NodeId=52
                                # (additional mysqld connections can be
                                # specified for this node for various
                                # purposes such as running ndb_restore)

[api]
NodeId=53

[api]
NodeId=54

[api]
NodeId=55

[api]
NodeId=56

[api]
NodeId=57

[api]
NodeId=58

[api]
NodeId=59

[jtendo@cluster_1 ~ ] $ cat /etc/my.cnf
[mysqld]
# Options for mysqld process:
ndbcluster                      # run NDB storage engine
basedir=/data/mysql/mysql
datadir=/data/mysql/data
default-storage-engine=ndbcluster
ndb-log-bin=ON
slave-skip-errors=13119
max-heap-table-size=1677721600
tmp-table-size=1677721600
server_id=51
auto_increment_increment=10
auto_increment_offset=5
#binlog-ignore-db=mysql
skip-replica-start=ON

[mysql_cluster]
# Options for NDB Cluster processes:
ndb-connectstring=192.168.1.16,192.168.1.17   # location of management server

[root@cluster_2 ~ ] # cat /data/mysql/mysql-cluster/config.ini
[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2    # Number of replicas
DataMemory=12G    # How much memory to allocate for data storage
IndexMemory=4G
MaxNoOfConcurrentOperations=256000
MaxNoOfOrderedIndexes=1024
MaxNoOfAttributes=50000
RedoBuffer=64M
MaxNoOfConcurrentScans=500
MaxNoOfTables=512
MaxNoOfTriggers=2048
MaxNoOfUniqueHashIndexes=2048
FragmentLogFileSize=256M
BackupDataDir=/data/backup/mysql/ndbd

[ndb_mgmd]
# Management process options:
HostName=192.168.1.1         # Hostname or IP address of MGM node
NodeId=49
DataDir=/data/mysql/mysql-cluster  # Directory for MGM node log files

[ndb_mgmd]
# Management process options:
HostName=192.168.1.2          # Hostname or IP address of MGM node
NodeId=50
DataDir=/data/mysql/mysql-cluster  # Directory for MGM node log files

[ndbd]
# Options for data node "A":
                                # (one [ndbd] section per data node)
HostName=192.168.1.3          # Hostname or IP address
NodeId=1                       # Node ID for this data node
DataDir=/data/mysql/data        # Directory for this data node's data files

[ndbd]
# Options for data node "B":
HostName=192.168.1.4         # Hostname or IP address
NodeId=2                        # Node ID for this data node
DataDir=/data/mysql/data        # Directory for this data node's data files

[mysqld]
# SQL node options:
HostName=192.168.1.1          # Hostname or IP address
NodeId=51

[mysqld]
# SQL node options:
HostName=192.168.1.2        # Hostname or IP address
NodeId=52
                                # (additional mysqld connections can be
                                # specified for this node for various
                                # purposes such as running ndb_restore)

[api]
NodeId=53

[api]
NodeId=54

[api]
NodeId=55

[api]
NodeId=56

[api]
NodeId=57

[api]
NodeId=58

[api]
NodeId=59

[jtendo@cluster_2~ ] $ cat /etc/my.cnf
[mysqld]
# Options for mysqld process:
ndbcluster                      # run NDB storage engine
basedir=/data/mysql/mysql
datadir=/data/mysql/data
default-storage-engine=ndbcluster
ndb-log-bin=ON
slave-skip-errors=13119
max-heap-table-size=1677721600
tmp-table-size=1677721600
server_id=61
auto_increment_increment=10
auto_increment_offset=6
#binlog-ignore-db=mysql
skip-replica-start=ON

[mysql_cluster]
# Options for NDB Cluster processes:
ndb-connectstring=192.168.1.1,192.168.1.2     # location of management server
[31 Oct 2023 0:30] MySQL Verification Team
Hi,

This is weird, I cannot reproduce this?! I set up two blank clusters, set up replication, started it, started adding data... everything works OK, and none of this appears in the logs.

Not sure I understood the part about four clusters properly. You said you set up four clusters. Did you have the same problem in the logs on all four of them, or only on this one while the other three logs are OK?

Thanks
[31 Oct 2023 10:15] Marek Kretkiewicz
On all of them. This is why I have created this case, as in my opinion it's something in the code.
The procedure I used looks like this:
- pull tar with binaries
package_name: "mysql-cluster-8.0.34-linux-glibc2.28-x86_64.tar.gz"

- untar it on all hosts with ndbd, ndb_mgmd and mysqld (ndb_mgmd and mysqld on the same hosts).
- copy config.ini for ndb_mgmd with proper values
- set systemd file for ndb_mgmd
- start ndb_mgmd
- prepare my.cnf for ndbd
- create systemd file for ndbd
- start ndbd
- copy my.cnf for mysqld
- create systemd for mysqld (service + env.conf)
- /data/mysql/mysql/bin/mysqld --initialize
- start mysqld + create user with grants for replication

Later, replication was set up manually with the commands provided above.
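
Since skip-replica-start=ON is set in my.cnf, the replication threads were started by hand after the CHANGE MASTER step, roughly like this (illustrative; on 8.0 either the old or the new syntax works):

START SLAVE;           -- or: START REPLICA;
SHOW SLAVE STATUS\G    -- or: SHOW REPLICA STATUS\G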
[31 Oct 2023 16:39] MySQL Verification Team
Hi,

I was using some automated installation, lemme try fully manual and I'll get back

thanks
[6 Nov 2023 16:59] Przemysław Ołtarzewski
Hello,

I'm experiencing the same problem with MySQL Cluster 8.0.35, installed from official RPM packages.

I couldn't find any detailed description of this synchronization problem in the docs, so I've ended up downloading sources and doing some code analysis to diagnose this issue.

Since detailed results turned out to be too long to fit in a single comment, I've attached full text as `001-ndb_apply_status-sync_failure.txt`.

It looks like `Ndb_metadata_change_monitor::detect_table_changes_in_schema` incorrectly detects that the `ndb_apply_status` table is present in NDB but not in the DD (or the other way around).

Then it attempts to synchronize this using `ndbcluster_binlog_check_table_async`, but its implementation explicitly returns `false` for the `ndb_apply_status` table. There's even a comment in the source stating that `ndb_apply_status` is a special case.

It does look like there's some inconsistency in that general area of code.
[6 Nov 2023 16:59] Przemysław Ołtarzewski
Detailed code analysis for ndb_apply_status table sync errors in logs.

Attachment: 001-ndb_apply_status-sync_failure.txt (text/plain), 7.64 KiB.

[6 Nov 2023 18:49] Przemysław Ołtarzewski
After digging some more into this, I think I may have found the inconsistency.

The `Ndb_metadata_change_monitor::detect_table_changes_in_schema` implementation builds two lists of tables in a given schema:
- `ndb_tables_in_NDB` using `storage/ndb/plugin/ndb_ndbapi_util.cc:254 :: bool ndb_get_table_names_in_schema`
- `ndb_tables_in_DD` using `storage/ndb/plugin/ndb_dd_client.cc:996 :: bool Ndb_dd_client::get_table_names_in_schema`

Looking at both implementations, `ndb_get_table_names_in_schema` contains a condition that explicitly filters out `ndb_apply_status` table:
```
...
    if (schema_name == "mysql" &&
        (strcmp(elmt.name, "ndb_schema") == 0 ||
         strcmp(elmt.name, "ndb_schema_result") == 0 ||
         strcmp(elmt.name, "ndb_apply_status") == 0 ||
         strcmp(elmt.name, "ndb_sql_metadata") == 0 ||
         strcmp(elmt.name, "ndb_index_stat_head") == 0 ||
         strcmp(elmt.name, "ndb_index_stat_sample") == 0)) {
      // Skip NDB utility tables. They are marked as hidden in the DD and are
      // specially handled by the binlog thread.
      continue;
    }
...
```

However, `Ndb_dd_client::get_table_names_in_schema` doesn't seem to perform this kind of filtering - it just retrieves tables from schema by engine name:
```
  // Fetch NDB table names
  std::vector<dd::String_type> ndb_table_names;
  if (m_client->fetch_schema_table_names_by_engine(schema, "ndbcluster",
                                                   &ndb_table_names)) {
    return false;
  }
    [... lock table and transform name to lower case ...]
    ndb_tables->insert(table_name);
  }
```

This creates a situation where, for the `mysql` schema, the `ndb_tables_in_NDB` list *will not* contain `ndb_apply_status`, but `ndb_tables_in_DD` *will*.

Subsequent logic in `Ndb_metadata_change_monitor::detect_table_changes_in_schema` subtracts the former list from the latter, leaving `ndb_apply_status` in the `ndb_tables_in_DD` list when it reaches the last `for` block. Then `ndbcluster_binlog_check_table_async` is invoked on `ndb_apply_status`, which always returns `false` by design and causes an error to be logged.

Provided that the binlog thread handles the `ndb_apply_status` table internally, including any DD synchronization, this error should be harmless. However, it reveals that parts of the code assume that `ndb_apply_status` is hidden in the DD, while other parts assume the exact opposite.

This should be fixed to prevent the inconsistency from propagating to more sensitive parts of the code, potentially leading to more severe bugs in future releases.
[6 Nov 2023 19:08] Przemysław Ołtarzewski
Oh, and just to point this out explicitly: in practice, at least for me after installing from the official RPM packages, `ndb_apply_status` is not hidden in the DD.

It is reported when running `show tables` on the `mysql` schema, it can at least be selected from, and a corresponding row is present in the `information_schema.tables` view.

This is unlike the other hidden tables described in the MySQL docs sections related to the DD.
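
For example (illustrative check on one of the SQL nodes; if the table were actually hidden in the DD, it should not show up here at all):

```
SELECT TABLE_NAME, ENGINE
  FROM information_schema.TABLES
 WHERE TABLE_SCHEMA = 'mysql' AND TABLE_NAME = 'ndb_apply_status';
```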
[6 Nov 2023 20:57] MySQL Verification Team
Hi,

Thanks for the code analysis.

Weirdly, I did reproduce this with the RPM build but not with the TGZ one.

Thank you
[8 Nov 2023 10:44] Magnus Blåudd
Posted by developer:
 
The fix for this problem has been pushed as BUG#35925503 to 8.0.36. Closing as a duplicate.
[8 Nov 2023 10:49] Magnus Blåudd
Sorry for the confusion with MyISAM. The ndb_apply_status table should be in NDB and contains information regarding what has been replicated into that cluster.

https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-replication-schema.html#ndb-replicat...