MySQL Bugs: #42232: Cluster to cluster replication is not setting auto

Bug #42232	Cluster to cluster replication is not setting auto_increment on slave cluster
Submitted:	21 Jan 2009 5:48	Modified:	13 Apr 2009 16:35
Reporter:	Tim Lokot	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Cluster: Replication	Severity:	S1 (Critical)
Version:	6.3.20	OS:	Linux (CentOS 5.2 64 bit)
Assigned to:	Martin Skold	CPU Architecture:	Any
Tags:	auto_increment, cluster, replication

Description:
I have a 3 node cluster (1 x management node, 2 x data/query node) setup with a geographically replicated system setup exactly the same way with the same software/hardware versions etc.

In normal operation, one system is the production system, and the other is our DR system.  The two clusters use row based replication between one data node at each site to keep them in sync.

This all works perfectly until a new row is inserted into cluster one, which causes the row to be replicated to cluster two, BUT, it doesn't increase the internal auto_increment counter on the table in cluster 2 even though the record data is replicated correctly.  Cluster 1 is operating as expected.

This causes issues because when we switch the roles of master/slave clusters, we cant insert any data into the new master cluster because the auto_increment counters aren't starting from max(id)+1, they are effectively counting from 1 on a fresh cluster sync.

PS. I'm aware that we shouldn't be inserting data into the slave cluster, but still behaves that way after you take it out of slave mode.

How to repeat:
1. Setup two separate MySQL Cluster installs and set one to replicate to the other as per the instructions located here: http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-replication-starting.html

2. Create a table on cluster 1 and watch it replicate to cluster 2

CREATE TABLE `loguser` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `userAccountId` int(4) unsigned NOT NULL DEFAULT '0',
  `authActionId` int(4) unsigned NOT NULL DEFAULT '0',
  `authDomainId` int(4) unsigned NOT NULL DEFAULT '0',
  `active` tinyint(1) NOT NULL DEFAULT '1',
  `created` datetime NOT NULL DEFAULT '0001-01-01 00:00:00',
  `createdById` int(4) unsigned NOT NULL DEFAULT '0',
  `updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `updatedById` int(4) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `userAccountId` (`userAccountId`),
  KEY `authActionId` (`authActionId`),
  KEY `authDomainId` (`authDomainId`),
  KEY `createdById` (`createdById`),
  KEY `updatedById` (`updatedById`)
) ENGINE=ndbcluster DEFAULT CHARSET=utf8

3. Insert a few records into the table in cluster 1 (in my setup I have about 189 records at the moment) - the data should replicate to cluster 1

4. On cluster 2 type "select max(id) from loguser;" and then "show table status where Name='loguser';" and it should show a difference between the auto_increment value and the max(id) value from the first statement

5. If you try and insert a record into the loguser table on cluster 2, it will throw an error about a duplicate primary key because that key has already been taken when the data was replicated

Suggested fix:
Either set the auto_increment table property to max(id)+1 as the data is replicated, or when data is inserted into a table, it should bump the auto_increment counter up (like it does in MyISAM tables) to the next available number if the current number is less than the highest number available.

End of the day all I want is two clusters that I can switch between master/slave without having to worry about inconsistencies in the meta data or the table records.

Verified, the tables auto_increment attribute doesn't seem to get updated in cluster when receiving row based log events.

See also bug #33365

Smaller test case, using MySQL Cluster 6.3.21:

MySQL Master:

mysql_m> CREATE TABLE t1 (id INT NOT NULL AUTO_INCREMENT KEY) ENGINE=ndb;
mysql_m> INSERT INTO t1 () VALUES (),(),();

shell_m> ndb_select_all SYSTAB_0 -d sys | egrep -v "^4"
SYSKEY_0        NEXTID
8       4

8 is the table ID for the index, gathered with ndb_show_tables

MySQL Slave:
mysql_s> SELECT * FROM t1;
+----+
| id |
+----+
|  3 |
|  1 |
|  2 |
+----+

shell_s> ndb_select_all SYSTAB_0 -d sys | egrep -v "^4294"
SYSKEY_0        NEXTID
8       1

So it is indeed not updating on the Slave side.

Silly workaround, on slave, to get it going:
mysql_s> SELECT @nextid:=MAX(id)+1 FROM t1; INSERT INTO t1 (id) VALUES (@nextid); DELETE FROM t1 WHERE id = @nextid;

BTW, the auto_inc doesn't have to be on the primary key.

Well, apparently, this got fixed in MySQL Cluster 6.3.22 (from bzr), tested using a build done on Jan 21, and today.

Bug #33365 is proably a duplicate of this one?

Correct version for my small test case above is 6.3.20. I made a typo there, in 6.3.21 it actually works OK.

So upgrading both clusters to 6.3.21 will fix the issue?  Any idea when that version will be made available for download?  6.3.20 is still the latest publically available release on mysql.com.  My timeframe for getting this in place is getting to the point of critical (ie. less than one week) and it would help with our release planning if I had an estimate to report back to my boss.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65338

2812 Martin Skold	2009-02-05
       Bug #42232 Cluster to cluster replication is not setting auto_increment
      on slave cluster: Added test case

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65775

2836 Martin Skold	2009-02-10 [merge]
      Merge

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65920

3252 Martin Skold	2009-02-11 [merge]
      Merge

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65957

3257 Martin Skold	2009-02-11 [merge]
      Merge

Pushed into 5.1.31-ndb-6.4.3 (revid:martin.skold@mysql.com-20090211204523-03nx13fjekybwez2) (version source revid:martin.skold@mysql.com-20090211204523-03nx13fjekybwez2) (merge vers: 5.1.31-ndb-6.4.3) (pib:6)

Pushed into 5.1.31-ndb-6.2.17 (revid:martin.skold@mysql.com-20090210162313-g4ou3ovzoyoljaej) (version source revid:martin.skold@mysql.com-20090205132746-yyafji0e5f5tymoa) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)

Pushed into 5.1.31-ndb-6.3.23 (revid:martin.skold@mysql.com-20090210163054-bcupy4stktagp5rt) (version source revid:martin.skold@mysql.com-20090210163054-bcupy4stktagp5rt) (merge vers: 5.1.31-ndb-6.3.23) (pib:6)

Documented bugfix in the NDB-6.2.17, 6.3.23, and 6.4.3 changelogs as follows:

        When replicating between MySQL Clusters,
        auto_increment was not set properly on the
        slave cluster.