Bug #18271 Replication of auto_increment when master or slave table is using "NDB" engine
Submitted: 16 Mar 2006 7:56 Modified: 27 Nov 2006 16:24
Reporter: Magnus Blåudd Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.1.8 OS:Any (all)
Assigned to: CPU Architecture:Any

[16 Mar 2006 7:56] Magnus Blåudd
Description:
When a table with auto_incerement key is replicated with statement based replication to a NDB table on the slave, the auto_increment values does not become the same as on the master.

How to repeat:
CREATE TABLE t1 (
 a int unsigned not null auto_increment primary key,
 b int unsigned
) ENGINE=NDB;

CREATE TABLE t2 (
 a int unsigned not null auto_increment primary key,
 b int unsigned
) ENGINE=NDB;

INSERT INTO t1 VALUES (NULL, 0);
INSERT INTO t1 SELECT NULL, 0 FROM t1;

INSERT INTO t2 VALUES (NULL, 0), (NULL,1);

connection slave;
select * from t1;

Suggested fix:
Make the autoincrement values generated on the master become generated in a deterministic way so it can be deduced what should be inserted on the slave from the information in the binlog.

An INSERT statement with one or several rows should always start with allocating enough autoincrement's from NDB in 'start_bulk_insert' or the first time it calls 'update_auto_increment' for the INSERT. This will make all auto_increment's for the whole statement start with a known value and increase with "auto_increment_increment". It should also be possible to take "auto_increment_increment" into account when allocating the needed autoinc values  from NDB. So if we want to insert 3 rows and auto_increment_increment is 10, just allocate 30 numbers.

An "INSERT .. SELECT * FROM" is more complicated as it is not known in 'start_bulk_insert' or the first 'update_auto_increment', so it's more complicated and might need some additional information written into the binlog. Unless we in some way can know how many values we are about to insert we might have to write additional information into the log. Needs to be investigated.

An INSERT with wome values defined and some that are give autoincerement shouldn't be a problem as long as the same approach as above is used. And as long as it can be inserted without pk violation on the server it shouldn't be any problem to insert it on the slave. If there is a problem inserting on the slave - well someone has modified the slave database and that is wronig in this master->slave replication.

Ex:
Autoincrement counter in NDB is set to 3.
INSERT INTO t1 (NULL), (17), (NULL);

SELECT * FROM t1;
3
4
17

Auto_increment counter in NDB set to 18 or the highest value "anyone" else has set.

The binlog should look like:
Auto increment start: 3
INSERT INTO t1 (NULL), (17), (NULL);
[24 May 2006 9:14] Guilhem Bichot
assigning it to me, as could be fixed as part of WL#3146.
[31 May 2006 11:21] Guilhem Bichot
More details on what the testcase produces (btw there needs to be a sync_slave_with_master in it):
the INSERT into t2 fails on slave with duplicate entry "1" for primary key.
This is because the first row does use the value set by the SET INSERT_ID=1 present in binlog, but, as NDB resets thd->next_insert_id to 0 immediately, the 2nd row calls the NDB engine (get_auto_increment()) which returns 1.
When my work on WL#3146 is finished, thd->next_insert_id will not be reset to 0 and NDB will not be contacted, so INSERT_ID and INSERT_ID+1 will be inserted.