Bug #33365 Cluster PITR does not set AUTO_INCREMENT values correctly during PITR
Submitted: 19 Dec 2007 12:19 Modified: 4 Apr 2009 12:16
Reporter: Oli Sennhauser Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Cluster: Replication Severity:S2 (Serious)
Version:mysql-5.1.22-ndb-6.3.6 OS:Linux (n.a.)
Assigned to: CPU Architecture:Any

[19 Dec 2007 12:19] Oli Sennhauser
Description:
During the Cluster Point in Time Recovery (PITR) the values for AUTO_INCREMENT are not set correctly. Thus following inserts will fail with PK violation...

How to repeat:
mysql> 
CREATE TABLE auto_inc_bug (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
  , ts TIMESTAMP
) ENGINE = ndb
;

mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';

+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    3 |              4 |
+--------------+------------+------+----------------+

ndb_mgm> START BACKUP
Connected to Management Server at: localhost:1186
Waiting for completed, this may take several minutes
Node 12: Backup 1 started from node 1
Node 12: Backup 1 started from node 1 completed
 StartGCP: 27852 StopGCP: 27855
 #Records: 44133 #LogRecords: 0
 Data: 1758632 bytes Log: 0 bytes

mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';
+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    6 |              7 |
+--------------+------------+------+----------------+
1 row in set (0.05 sec)

mysql>
SELECT SUBSTRING_INDEX(File, '/', -1), Position
  FROM mysql.ndb_binlog_index
 WHERE epoch > 27855
 ORDER BY epoch ASC LIMIT 1;

+--------------------------------+----------+
| SUBSTRING_INDEX(File, '/', -1) | Position |
+--------------------------------+----------+
| binary_log.000020              |     1540 |
+--------------------------------+----------+

===> Up to here everything is fine!

Now:

* Cluster crash
* Cluster restore

# ndb_mgmd
# ndbd --initial
# ndbd --initial
# ndbd --initial
# ndbd --initial

# ndb_restore -n 12 -b 1 -m
# ndb_restore -n 12 -b 1 -r
# ndb_restore -n 13 -b 1 -r
# ndb_restore -n 14 -b 1 -r
# ndb_restore -n 15 -b 1 -r
# ndb_restore -n 15 -b 1 -e

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';
+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    3 |              4 |
+--------------+------------+------+----------------+

===> This is still OK!!!

# mysqlbinlog --start-pos=1540 binary_log.000020 > out
# mysql -u root < out

SELECT * FROM auto_inc_bug;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  3 | 2007-12-19 12:38:49 |
|  6 | 2007-12-19 12:49:02 |
|  2 | 2007-12-19 12:38:46 |
|  4 | 2007-12-19 12:49:01 |
|  5 | 2007-12-19 12:49:02 |
|  1 | 2007-12-19 12:38:43 |
+----+---------------------+

===> This is still OK!!! But:

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';
+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    6 |              4 |
+--------------+------------+------+----------------+

==> This is wrong!!! Prove:

mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'

Suggested fix:
Auto_increment values must be updated also with row based replication!!!
[20 Dec 2007 14:45] Hartmut Holzgraefe
As far as i can tell this is simply due to the fact that a new "epoch" hasn't yet been started, and so the last transactions (with auto-commit here so each INSERT is a transaction of its own) simply haven't been injected into the binlog yet.

When checking this on our test system i performed the following steps:

- start cluster with binlog enabled
- create table
- insert 3 rows
- do cluster backup
- FLUSH LOGS to rotate the binlog
- insert 4 more rows
- FLUSH LOGS again
- mysqldump 
- shut down mysqld and cluster
- restart cluster initial
- restore cluster backup
- replay 2nd binlog file that covers the time between the 1st and 2nd FLUSH LOGS, including the additional INSERTs
- mysqldump
- replay 3rd binlog which covers the time from 2nd FLUSH LOGS to server shutdown
- yet another mysqldump

comparing the first two mysqldump files the extra rows are missing, so they were not logged into the 2nd binlog file, the 3rd mysqldump has all 7 rows again so the extra 4 rows were only written to that binlog file on mysqld shutdown

conclusion: even with point in time recovery only rows up to the end of the last epoch are covered and everything collected for the currently ongoing epoch is still lost. And even a FLUSH LOGS does not trigger a new epoch.

So not a bug but expected behavior as far as i can tell? 

Setting to "verified" though to get a 2nd opinion from development.
[29 Mar 2008 14:40] Geert Vanderkelen
Workaround: After restoring cluster backups do:

 ndb_mgm> ALL DUMP 7099

Forces a LCP, seems to help! (using CGE 6.3.10)
[29 Mar 2008 15:21] Jonas Oreland
Hartmut,

I dont know that the semantics of flush logs is,
but I guess we can hook-in and have it mean
wait for "current epoch" in cluster...

/jonas
[31 Mar 2008 12:30] Hartmut Holzgraefe
Looking at this again i seem to totally have missed to reproduce the actual auto_increment aspect of the bug ... taking it back on me for now for re-verification ...
[22 Jan 2009 11:23] Hartmut Holzgraefe
Probably a duplicate of bug #42232
[3 Apr 2009 12:35] Oli Sennhauser
Bug still in 6.3.20, now trying 6.3.23 (as soon as it has compiled...)
[3 Apr 2009 13:17] Oli Sennhauser
This bug did NOT reproduce on my 6.3.23 anymore. IMHO it is thus fixed.
[4 Apr 2009 12:16] Hartmut Holzgraefe
Duplicate of bug #42232