Bug #42232 Cluster to cluster replication is not setting auto_increment on slave cluster
Submitted: 21 Jan 6:48 Modified: 13 Apr 18:35
Reporter: Tim Lokot
Status: Closed
Category:Server: ClusterRep Severity:S1 (Critical)
Version:6.3.20 OS:Linux (CentOS 5.2 64 bit)
Assigned to: Martin Skold Target Version:
Tags: auto_increment, replication, cluster
Triage: Needs Triage: D2 (Serious)

[21 Jan 6:48] Tim Lokot
Description:
I have a 3 node cluster (1 x management node, 2 x data/query node) setup with a
geographically replicated system setup exactly the same way with the same
software/hardware versions etc.

In normal operation, one system is the production system, and the other is our DR system.
 The two clusters use row based replication between one data node at each site to keep
them in sync.

This all works perfectly until a new row is inserted into cluster one, which causes the
row to be replicated to cluster two, BUT, it doesn't increase the internal auto_increment
counter on the table in cluster 2 even though the record data is replicated correctly. 
Cluster 1 is operating as expected.

This causes issues because when we switch the roles of master/slave clusters, we cant
insert any data into the new master cluster because the auto_increment counters aren't
starting from max(id)+1, they are effectively counting from 1 on a fresh cluster sync.

PS. I'm aware that we shouldn't be inserting data into the slave cluster, but still
behaves that way after you take it out of slave mode.

How to repeat:
1. Setup two separate MySQL Cluster installs and set one to replicate to the other as per
the instructions located here:
http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-replication-starting.html

2. Create a table on cluster 1 and watch it replicate to cluster 2

CREATE TABLE `loguser` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `userAccountId` int(4) unsigned NOT NULL DEFAULT '0',
  `authActionId` int(4) unsigned NOT NULL DEFAULT '0',
  `authDomainId` int(4) unsigned NOT NULL DEFAULT '0',
  `active` tinyint(1) NOT NULL DEFAULT '1',
  `created` datetime NOT NULL DEFAULT '0001-01-01 00:00:00',
  `createdById` int(4) unsigned NOT NULL DEFAULT '0',
  `updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `updatedById` int(4) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `userAccountId` (`userAccountId`),
  KEY `authActionId` (`authActionId`),
  KEY `authDomainId` (`authDomainId`),
  KEY `createdById` (`createdById`),
  KEY `updatedById` (`updatedById`)
) ENGINE=ndbcluster DEFAULT CHARSET=utf8

3. Insert a few records into the table in cluster 1 (in my setup I have about 189 records
at the moment) - the data should replicate to cluster 1

4. On cluster 2 type "select max(id) from loguser;" and then "show table status where
Name='loguser';" and it should show a difference between the auto_increment value and the
max(id) value from the first statement

5. If you try and insert a record into the loguser table on cluster 2, it will throw an
error about a duplicate primary key because that key has already been taken when the data
was replicated

Suggested fix:
Either set the auto_increment table property to max(id)+1 as the data is replicated, or
when data is inserted into a table, it should bump the auto_increment counter up (like it
does in MyISAM tables) to the next available number if the current number is less than the
highest number available.

End of the day all I want is two clusters that I can switch between master/slave without
having to worry about inconsistencies in the meta data or the table records.
[22 Jan 12:19] Hartmut Holzgraefe
Verified, the tables auto_increment attribute doesn't seem to get updated in cluster when
receiving row based log events.

See also bug #33365
[2 Feb 17:55] Geert Vanderkelen
Smaller test case, using MySQL Cluster 6.3.21:

MySQL Master:

mysql_m> CREATE TABLE t1 (id INT NOT NULL AUTO_INCREMENT KEY) ENGINE=ndb;
mysql_m> INSERT INTO t1 () VALUES (),(),();

shell_m> ndb_select_all SYSTAB_0 -d sys | egrep -v "^4"
SYSKEY_0        NEXTID
8       4

8 is the table ID for the index, gathered with ndb_show_tables

MySQL Slave:
mysql_s> SELECT * FROM t1;
+----+
| id |
+----+
|  3 |
|  1 |
|  2 |
+----+

shell_s> ndb_select_all SYSTAB_0 -d sys | egrep -v "^4294"
SYSKEY_0        NEXTID
8       1

So it is indeed not updating on the Slave side.

Silly workaround, on slave, to get it going:
mysql_s> SELECT @nextid:=MAX(id)+1 FROM t1; INSERT INTO t1 (id) VALUES (@nextid); DELETE
FROM t1 WHERE id = @nextid;

BTW, the auto_inc doesn't have to be on the primary key.
[4 Feb 13:09] Geert Vanderkelen
Well, apparently, this got fixed in MySQL Cluster 6.3.22 (from bzr), tested using a build
done on Jan 21, and today.
[4 Feb 13:15] Hartmut Holzgraefe
Bug #33365 is proably a duplicate of this one?
[4 Feb 13:30] Geert Vanderkelen
Correct version for my small test case above is 6.3.20. I made a typo there, in 6.3.21 it
actually works OK.
[4 Feb 23:30] Tim Lokot
So upgrading both clusters to 6.3.21 will fix the issue?  Any idea when that version will
be made available for download?  6.3.20 is still the latest publically available release
on mysql.com.  My timeframe for getting this in place is getting to the point of critical
(ie. less than one week) and it would help with our release planning if I had an estimate
to report back to my boss.
[5 Feb 14:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65338

2812 Martin Skold	2009-02-05
       Bug #42232 Cluster to cluster replication is not setting auto_increment
      on slave cluster: Added test case
[10 Feb 17:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65775

2836 Martin Skold	2009-02-10 [merge]
      Merge
[11 Feb 16:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65920

3252 Martin Skold	2009-02-11 [merge]
      Merge
[11 Feb 21:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65957

3257 Martin Skold	2009-02-11 [merge]
      Merge
[11 Feb 22:04] Bugs System
Pushed into 5.1.31-ndb-6.4.3
(revid:martin.skold@mysql.com-20090211204523-03nx13fjekybwez2) (version source
revid:martin.skold@mysql.com-20090211204523-03nx13fjekybwez2) (merge vers:
5.1.31-ndb-6.4.3) (pib:6)
[11 Feb 22:06] Bugs System
Pushed into 5.1.31-ndb-6.2.17
(revid:martin.skold@mysql.com-20090210162313-g4ou3ovzoyoljaej) (version source
revid:martin.skold@mysql.com-20090205132746-yyafji0e5f5tymoa) (merge vers:
5.1.31-ndb-6.2.17) (pib:6)
[11 Feb 22:07] Bugs System
Pushed into 5.1.31-ndb-6.3.23
(revid:martin.skold@mysql.com-20090210163054-bcupy4stktagp5rt) (version source
revid:martin.skold@mysql.com-20090210163054-bcupy4stktagp5rt) (merge vers:
5.1.31-ndb-6.3.23) (pib:6)
[13 Apr 18:35] Jon Stephens
Documented bugfix in the NDB-6.2.17, 6.3.23, and 6.4.3 changelogs as follows:

        When replicating between MySQL Clusters,
        auto_increment was not set properly on the
        slave cluster.