MySQL Bugs: #33365: Cluster PITR does not set AUTO_INCREMENT values correctly during PITR

Bug #33365	Cluster PITR does not set AUTO_INCREMENT values correctly during PITR
Submitted:	19 Dec 2007 12:19	Modified:	4 Apr 2009 12:16
Reporter:	Oli Sennhauser	Email Updates:
Status:	Duplicate	Impact on me:	None
Category:	MySQL Cluster: Replication	Severity:	S2 (Serious)
Version:	mysql-5.1.22-ndb-6.3.6	OS:	Linux (n.a.)
Assigned to:		CPU Architecture:	Any

Description:
During the Cluster Point in Time Recovery (PITR) the values for AUTO_INCREMENT are not set correctly. Thus following inserts will fail with PK violation...

How to repeat:
mysql> 
CREATE TABLE auto_inc_bug (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
  , ts TIMESTAMP
) ENGINE = ndb
;

mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';

+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    3 |              4 |
+--------------+------------+------+----------------+

ndb_mgm> START BACKUP
Connected to Management Server at: localhost:1186
Waiting for completed, this may take several minutes
Node 12: Backup 1 started from node 1
Node 12: Backup 1 started from node 1 completed
 StartGCP: 27852 StopGCP: 27855
 #Records: 44133 #LogRecords: 0
 Data: 1758632 bytes Log: 0 bytes

mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';
+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    6 |              7 |
+--------------+------------+------+----------------+
1 row in set (0.05 sec)

mysql>
SELECT SUBSTRING_INDEX(File, '/', -1), Position
  FROM mysql.ndb_binlog_index
 WHERE epoch > 27855
 ORDER BY epoch ASC LIMIT 1;

+--------------------------------+----------+
| SUBSTRING_INDEX(File, '/', -1) | Position |
+--------------------------------+----------+
| binary_log.000020              |     1540 |
+--------------------------------+----------+

===> Up to here everything is fine!

Now:

* Cluster crash
* Cluster restore

# ndb_mgmd
# ndbd --initial
# ndbd --initial
# ndbd --initial
# ndbd --initial

# ndb_restore -n 12 -b 1 -m
# ndb_restore -n 12 -b 1 -r
# ndb_restore -n 13 -b 1 -r
# ndb_restore -n 14 -b 1 -r
# ndb_restore -n 15 -b 1 -r
# ndb_restore -n 15 -b 1 -e

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';
+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    3 |              4 |
+--------------+------------+------+----------------+

===> This is still OK!!!

# mysqlbinlog --start-pos=1540 binary_log.000020 > out
# mysql -u root < out

SELECT * FROM auto_inc_bug;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  3 | 2007-12-19 12:38:49 |
|  6 | 2007-12-19 12:49:02 |
|  2 | 2007-12-19 12:38:46 |
|  4 | 2007-12-19 12:49:01 |
|  5 | 2007-12-19 12:49:02 |
|  1 | 2007-12-19 12:38:43 |
+----+---------------------+

===> This is still OK!!! But:

mysql> SHOW TABLE STATUS LIKE 'auto_inc_bug';
+--------------+------------+------+----------------+
| Name         | Engine     | Rows | Auto_increment |
+--------------+------------+------+----------------+
| auto_inc_bug | ndbcluster |    6 |              4 |
+--------------+------------+------+----------------+

==> This is wrong!!! Prove:

mysql> INSERT INTO auto_inc_bug VALUES (NULL, NOW());
ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'

Suggested fix:
Auto_increment values must be updated also with row based replication!!!

As far as i can tell this is simply due to the fact that a new "epoch" hasn't yet been started, and so the last transactions (with auto-commit here so each INSERT is a transaction of its own) simply haven't been injected into the binlog yet.

When checking this on our test system i performed the following steps:

- start cluster with binlog enabled
- create table
- insert 3 rows
- do cluster backup
- FLUSH LOGS to rotate the binlog
- insert 4 more rows
- FLUSH LOGS again
- mysqldump 
- shut down mysqld and cluster
- restart cluster initial
- restore cluster backup
- replay 2nd binlog file that covers the time between the 1st and 2nd FLUSH LOGS, including the additional INSERTs
- mysqldump
- replay 3rd binlog which covers the time from 2nd FLUSH LOGS to server shutdown
- yet another mysqldump

comparing the first two mysqldump files the extra rows are missing, so they were not logged into the 2nd binlog file, the 3rd mysqldump has all 7 rows again so the extra 4 rows were only written to that binlog file on mysqld shutdown

conclusion: even with point in time recovery only rows up to the end of the last epoch are covered and everything collected for the currently ongoing epoch is still lost. And even a FLUSH LOGS does not trigger a new epoch.

So not a bug but expected behavior as far as i can tell? 

Setting to "verified" though to get a 2nd opinion from development.

Workaround: After restoring cluster backups do:

 ndb_mgm> ALL DUMP 7099

Forces a LCP, seems to help! (using CGE 6.3.10)

Hartmut,

I dont know that the semantics of flush logs is,
but I guess we can hook-in and have it mean
wait for "current epoch" in cluster...

/jonas

Looking at this again i seem to totally have missed to reproduce the actual auto_increment aspect of the bug ... taking it back on me for now for re-verification ...

Probably a duplicate of bug #42232

Bug still in 6.3.20, now trying 6.3.23 (as soon as it has compiled...)

This bug did NOT reproduce on my 6.3.23 anymore. IMHO it is thus fixed.

Duplicate of bug #42232