Bug #112774 | Metadata: Failed to submit table 'mysql.ndb_apply_status' for synchronization | | |
---|---|---|---|
Submitted: | 19 Oct 2023 11:43 | Modified: | 9 Jan 2024 10:18 |
Reporter: | Marek Kretkiewicz | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
Version: | 8.0.34 | OS: | Red Hat |
Assigned to: | | CPU Architecture: | x86
[19 Oct 2023 11:43]
Marek Kretkiewicz
[20 Oct 2023 3:05]
MySQL Verification Team
Hi,

Have you configured your NDB Cluster for replication? The ndb_apply_status table exists on cluster replicas, and since you say you have only one cluster (so no replicas) I do not see where this table came from. Did you have cluster replication at some point? Maybe you restored this table by mistake?

If you look at the documentation: https://dev.mysql.com/doc/mysql-cluster-excerpt/8.0/en/mysql-cluster-replication-schema.ht...

[quote] ndb_apply_status is used to record which epoch transactions have been replicated and applied to a replica cluster from an upstream source. This information is captured in an NDB online backup, but (by design) it is not restored by ndb_restore. In some cases, it can be helpful to restore this information for use in new setups; beginning with NDB 8.0.29, you can do this by invoking ndb_restore with the --with-apply-status option. See the description of the option for more information. [/quote]

Maybe you restored it with --with-apply-status?
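For context, the contents of this table can be inspected directly on an SQL node; the query below is an illustrative sketch assuming the column layout documented for mysql.ndb_apply_status, not a command taken from this report:

```
-- Each row records the most recent replicated-and-applied epoch from one
-- upstream source (identified by its server_id).
SELECT server_id, epoch, log_name, start_pos, end_pos
  FROM mysql.ndb_apply_status;
```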
[20 Oct 2023 3:08]
MySQL Verification Team
If you look at https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-replication-issues.html, you can change the engine of this table to MyISAM.
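The statement that page describes (and which is quoted later in this thread) is along these lines; per the documentation it is only meant for replicas that do not themselves use NDB for this table:

```
-- Documented workaround for non-NDB replicas only: move the table off NDB
-- so that it does not handle its own binary logging.
ALTER TABLE mysql.ndb_apply_status ENGINE=MyISAM;
```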
[20 Oct 2023 8:03]
Marek Kretkiewicz
Hi. Sorry, I wasn't fully clear. Of course we do have 2 clusters in a master-master configuration. Replication works fine without any problems; I can see changes in ndb_apply_status after replication activity, but despite that this message can still be seen. I will check your second update and the similar case.
[20 Oct 2023 8:13]
Marek Kretkiewicz
I just noticed that it wasn't a similar case, but a link to documentation which I've already read. May I ask why you recommend changing to MyISAM? As mentioned before, I do have master-master replication, and the documentation says:

"This can be done by issuing a statement such as ALTER TABLE mysql.ndb_apply_status ENGINE=MyISAM on the replica. It is safe to do this when using a storage engine other than NDB on the replica, since you do not need to worry about keeping multiple replicas synchronized."

But I do have the NDB engine. And also:

"Change the storage engine used for the mysql.ndb_apply_status table. Causing this table to use an engine that does not handle its own binary logging can also eliminate the conflict. This can be done by issuing a statement such as ALTER TABLE mysql.ndb_apply_status ENGINE=MyISAM on the replica. It is safe to do this when using a storage engine other than NDB on the replica, since you do not need to worry about keeping multiple replicas synchronized."

But our replica is also a master, so we are processing its own binary logging.
[20 Oct 2023 8:22]
Marek Kretkiewicz
And one important piece of information: this message has been displayed since the first start of the cluster - I do not remember whether it was just after I set up replication, or even before that. I set up this environment with an Ansible playbook, which automatically set the NDB engine as the default for mysqld before mysqld was initialized. Initialization was done later, when this parameter was already set.
[20 Oct 2023 18:22]
MySQL Verification Team
Hi,

> Of course we do have 2 clusters in master-master configuration.

That now makes sense; you said you have 2 data nodes, 2 mgm nodes and 2 sql nodes, but you did not mention the other cluster. Can you share the full config from both? Do you see the same warnings on both clusters or only on one? Did you recently do an ndb_restore, after which this started filling the logs?
[20 Oct 2023 18:24]
MySQL Verification Team
Hi,

> May I ask, why you recommend to change to MyISAM?

I do not! It makes sense on a replica (slave) that does not use ndbcluster, where you replicate the cluster into InnoDB tables. Or it could solve the issue of filling the log on a non-replicating system - and you did not say you have replication between clusters :D you only said you have one cluster. Since you have cluster replication (2 clusters), no, you should not switch the ndb_... tables to MyISAM :)
[20 Oct 2023 18:28]
MySQL Verification Team
> I set up this environment with ansible playbook, and automatically put ndbengine as default for mysqld, before mysqld was initialized. Initialization was done later, when this parameter was already set.

I do not have experience with Ansible, but it might have messed something up in the configuration. Does it use MCM, or does it run the cluster "manually"? How does it set up replication: does it take a backup and then run ndb_restore on the second server, or is it always two virgin servers that you configure for master-master while empty and then start using? ndb_restore can mess up this table, especially in master-master.

Please upload your full config from both clusters so I can check. Also, as I wrote a few minutes ago, it is important to know whether this is an issue on both clusters or only on one.

All best
[30 Oct 2023 10:11]
Marek Kretkiewicz
The cluster is the community edition, without MCM. Replication was set up manually, on empty clusters without any data at the beginning, and later with a CHANGE MASTER like this one:

CHANGE MASTER TO
  MASTER_HOST='cluster_2',
  MASTER_USER='replication_user',
  MASTER_PASSWORD='replication_pass',
  MASTER_LOG_FILE='binlog.000001',
  MASTER_LOG_POS=123;

Of course users and proper grants were created at the beginning. So no ndb_restore was done. We set up new clusters, without any data inside, and just after that this entry was seen in the logs. What is interesting: I've set up 4 different clusters, and on all of them we have the same situation. I have used the same procedure many times before and never had such a warning. Below you have config.ini and my.cnf. If you want something more, I will share it with you.

[root@cluster_1 ~ ] # cat /data/mysql/mysql-cluster/config.ini

[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2              # Number of replicas
DataMemory=12G              # How much memory to allocate for data storage
IndexMemory=4G
MaxNoOfConcurrentOperations=256000
MaxNoOfOrderedIndexes=1024
MaxNoOfAttributes=50000
RedoBuffer=64M
MaxNoOfConcurrentScans=500
MaxNoOfTables=512
MaxNoOfTriggers=2048
MaxNoOfUniqueHashIndexes=2048
FragmentLogFileSize=256M
BackupDataDir=/data/backup/mysql/ndbd

[ndb_mgmd]
# Management process options:
HostName=192.168.1.16       # Hostname or IP address of MGM node
NodeId=49
DataDir=/data/mysql/mysql-cluster   # Directory for MGM node log files

[ndb_mgmd]
# Management process options:
HostName=192.168.1.17       # Hostname or IP address of MGM node
NodeId=50
DataDir=/data/mysql/mysql-cluster   # Directory for MGM node log files

[ndbd]
# Options for data node "A":
# (one [ndbd] section per data node)
HostName=192.168.1.18       # Hostname or IP address
NodeId=1                    # Node ID for this data node
DataDir=/data/mysql/data    # Directory for this data node's data files

[ndbd]
# Options for data node "B":
HostName=192.168.1.19       # Hostname or IP address
NodeId=2                    # Node ID for this data node
DataDir=/data/mysql/data    # Directory for this data node's data files

[mysqld]
# SQL node options:
HostName=192.168.1.16       # Hostname or IP address
NodeId=51

[mysqld]
# SQL node options:
HostName=192.168.1.17       # Hostname or IP address
NodeId=52

# (additional mysqld connections can be
# specified for this node for various
# purposes such as running ndb_restore)
[api]
NodeId=53
[api]
NodeId=54
[api]
NodeId=55
[api]
NodeId=56
[api]
NodeId=57
[api]
NodeId=58
[api]
NodeId=59

[jtendo@cluster_1 ~ ] $ cat /etc/my.cnf

[mysqld]
# Options for mysqld process:
ndbcluster                  # run NDB storage engine
basedir=/data/mysql/mysql
datadir=/data/mysql/data
default-storage-engine=ndbcluster
ndb-log-bin=ON
slave-skip-errors=13119
max-heap-table-size=1677721600
tmp-table-size=1677721600
server_id=51
auto_increment_increment=10
auto_increment_offset=5
#binlog-ignore-db=mysql
skip-replica-start=ON

[mysql_cluster]
# Options for NDB Cluster processes:
ndb-connectstring=192.168.1.16,192.168.1.17   # location of management server

[root@cluster_2 ~ ] # cat /data/mysql/mysql-cluster/config.ini

[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2              # Number of replicas
DataMemory=12G              # How much memory to allocate for data storage
IndexMemory=4G
MaxNoOfConcurrentOperations=256000
MaxNoOfOrderedIndexes=1024
MaxNoOfAttributes=50000
RedoBuffer=64M
MaxNoOfConcurrentScans=500
MaxNoOfTables=512
MaxNoOfTriggers=2048
MaxNoOfUniqueHashIndexes=2048
FragmentLogFileSize=256M
BackupDataDir=/data/backup/mysql/ndbd

[ndb_mgmd]
# Management process options:
HostName=192.168.1.1        # Hostname or IP address of MGM node
NodeId=49
DataDir=/data/mysql/mysql-cluster   # Directory for MGM node log files

[ndb_mgmd]
# Management process options:
HostName=192.168.1.2        # Hostname or IP address of MGM node
NodeId=50
DataDir=/data/mysql/mysql-cluster   # Directory for MGM node log files

[ndbd]
# Options for data node "A":
# (one [ndbd] section per data node)
HostName=192.168.1.3        # Hostname or IP address
NodeId=1                    # Node ID for this data node
DataDir=/data/mysql/data    # Directory for this data node's data files

[ndbd]
# Options for data node "B":
HostName=192.168.1.4        # Hostname or IP address
NodeId=2                    # Node ID for this data node
DataDir=/data/mysql/data    # Directory for this data node's data files

[mysqld]
# SQL node options:
HostName=192.168.1.1        # Hostname or IP address
NodeId=51

[mysqld]
# SQL node options:
HostName=192.168.1.2        # Hostname or IP address
NodeId=52

# (additional mysqld connections can be
# specified for this node for various
# purposes such as running ndb_restore)
[api]
NodeId=53
[api]
NodeId=54
[api]
NodeId=55
[api]
NodeId=56
[api]
NodeId=57
[api]
NodeId=58
[api]
NodeId=59

[jtendo@cluster_2 ~ ] $ cat /etc/my.cnf

[mysqld]
# Options for mysqld process:
ndbcluster                  # run NDB storage engine
basedir=/data/mysql/mysql
datadir=/data/mysql/data
default-storage-engine=ndbcluster
ndb-log-bin=ON
slave-skip-errors=13119
max-heap-table-size=1677721600
tmp-table-size=1677721600
server_id=61
auto_increment_increment=10
auto_increment_offset=6
#binlog-ignore-db=mysql
skip-replica-start=ON

[mysql_cluster]
# Options for NDB Cluster processes:
ndb-connectstring=192.168.1.1,192.168.1.2   # location of management server
[31 Oct 2023 0:30]
MySQL Verification Team
Hi,

This is weird, I cannot reproduce this?! I set up two blank clusters, set up replication, started it, started adding data... everything works OK, and none of this appears in the logs.

I am not sure I understood the part about four clusters properly. You said you set up four clusters. Did you have the same problem in the logs on all four of them, or only on this one while the logs on the other three are OK?

Thanks
[31 Oct 2023 10:15]
Marek Kretkiewicz
On all of them - this is why I have created this case, as in my opinion it's something in the code. The procedure I use looks like this:

- pull the tar with the binaries: package_name: "mysql-cluster-8.0.34-linux-glibc2.28-x86_64.tar.gz"
- untar it on all hosts with ndbd, ndb_mgm and mysqld (ndb_mgm and mysqld on the same hosts)
- copy config.ini for ndb_mgmd with proper values
- set up the systemd file for ndb_mgmd
- start ndb_mgmd
- prepare my.cnf for ndbd
- create the systemd file for ndbd
- start ndbd
- copy my.cnf for mysqld
- create systemd for mysqld (service + env.conf)
- /data/mysql/mysql/bin/mysqld --initialize
- start mysqld + create the user with grants for replication

Later I manually set up replication with the commands provided earlier.
[31 Oct 2023 16:39]
MySQL Verification Team
Hi, I was using an automated installation; let me try a fully manual one and I'll get back. Thanks
[6 Nov 2023 16:59]
Przemysław Ołtarzewski
Hello, I'm experiencing the same problem with MySQL Cluster 8.0.35, installed from the official RPM packages. I couldn't find any detailed description of this synchronization problem in the docs, so I ended up downloading the sources and doing some code analysis to diagnose the issue. Since the detailed results turned out to be too long to fit in a single comment, I've attached the full text as `001-ndb_apply_status-sync_failure.txt`. It looks like `Ndb_metadata_change_monitor::detect_table_changes_in_schema` incorrectly detects that the `ndb_apply_status` table is present in NDB but not in the DD (or the other way around). It then attempts to synchronize this using `ndbcluster_binlog_check_table_async`, but that function's implementation explicitly returns `false` for the `ndb_apply_status` table. There's even a comment in the source stating that `ndb_apply_status` is a special case. It does look like there's some inconsistency in that general area of the code.
[6 Nov 2023 16:59]
Przemysław Ołtarzewski
Detailed code analysis for ndb_apply_status table sync errors in logs.
Attachment: 001-ndb_apply_status-sync_failure.txt (text/plain), 7.64 KiB.
[6 Nov 2023 18:49]
Przemysław Ołtarzewski
After digging some more into this, I think I may have found the inconsistency.

The `Ndb_metadata_change_monitor::detect_table_changes_in_schema` implementation makes two lists of tables in a given schema:

- `ndb_tables_in_NDB` using `storage/ndb/plugin/ndb_ndbapi_util.cc:254 :: bool ndb_get_table_names_in_schema`
- `ndb_tables_in_DD` using `storage/ndb/plugin/ndb_dd_client.cc:996 :: bool Ndb_dd_client::get_table_names_in_schema`

Looking at both implementations, `ndb_get_table_names_in_schema` contains a condition that explicitly filters out the `ndb_apply_status` table:

```
...
if (schema_name == "mysql" &&
    (strcmp(elmt.name, "ndb_schema") == 0 ||
     strcmp(elmt.name, "ndb_schema_result") == 0 ||
     strcmp(elmt.name, "ndb_apply_status") == 0 ||
     strcmp(elmt.name, "ndb_sql_metadata") == 0 ||
     strcmp(elmt.name, "ndb_index_stat_head") == 0 ||
     strcmp(elmt.name, "ndb_index_stat_sample") == 0)) {
  // Skip NDB utility tables. They are marked as hidden in the DD and are
  // specially handled by the binlog thread.
  continue;
}
...
```

However, `Ndb_dd_client::get_table_names_in_schema` doesn't seem to perform this kind of filtering - it just retrieves the schema's tables by engine name:

```
// Fetch NDB table names
std::vector<dd::String_type> ndb_table_names;
if (m_client->fetch_schema_table_names_by_engine(schema, "ndbcluster",
                                                 &ndb_table_names)) {
  return false;
}
[... lock table and transform name to lower case ...]
  ndb_tables->insert(table_name);
}
```

This creates a situation where, for the `mysql` schema, the `ndb_tables_in_NDB` list *will not* contain `ndb_apply_status`, but `ndb_tables_in_DD` *will*. The subsequent logic in `Ndb_metadata_change_monitor::detect_table_changes_in_schema` subtracts the former list from the latter, leaving `ndb_apply_status` in the `ndb_tables_in_DD` list when the last `for` block is reached. `ndbcluster_binlog_check_table_async` is then invoked on `ndb_apply_status`, which always returns `false` by design and causes an error to be logged.

Provided that the binlog thread handles the `ndb_apply_status` table internally, including any DD synchronization, this error should be harmless. However, it reveals that some parts of the code assume that `ndb_apply_status` is hidden in the DD, and other parts assume the exact opposite. This should be fixed to prevent the inconsistency from propagating to more sensitive parts of the code, potentially leading to more severe bugs in future releases.
[6 Nov 2023 19:08]
Przemysław Ołtarzewski
Oh, and just to point this out explicitly: in practice, at least for me after installing from the official RPM packages, `ndb_apply_status` is not hidden in the DD. It is reported when running `SHOW TABLES` on the `mysql` schema, it can at least be selected from, and a corresponding row is present in the `information_schema.tables` view - unlike the other hidden tables described in the MySQL docs sections related to the DD.
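As a rough illustration of the visibility described above, queries like the following (my own sketch, not commands taken from this report) would run against an SQL node:

```
-- The table is listed like an ordinary, user-visible table
SHOW TABLES FROM mysql LIKE 'ndb_apply_status';

-- ...and it can be selected from directly
SELECT * FROM mysql.ndb_apply_status LIMIT 1;
```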
[6 Nov 2023 20:57]
MySQL Verification Team
Hi, Thanks for the code analysis. Weirdly, I did reproduce this with the RPM build but not with the TGZ one. Thank you
[8 Nov 2023 10:44]
Magnus Blåudd
Posted by developer: Fix for this problem has been pushed as BUG#35925503 to 8.0.36. Closing as duplicate.
[8 Nov 2023 10:49]
Magnus Blåudd
Sorry for the confusion with MyISAM. The ndb_apply_status table should be in NDB and contains information regarding what has been replicated into that cluster. https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-replication-schema.html#ndb-replicat...
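A quick way to confirm which engine the table uses on an SQL node (an illustrative query, not part of this report; ndbcluster is the expected value per the comment above):

```
-- The row in information_schema.tables should report ENGINE = ndbcluster
SELECT table_name, engine
  FROM information_schema.tables
 WHERE table_schema = 'mysql'
   AND table_name = 'ndb_apply_status';
```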