Bug #70673 Restored backup is different from original
Submitted: 21 Oct 2013 12:17 Modified: 14 Nov 2013 19:48
Reporter: Kevin Rogers Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Command-line Clients Severity:S2 (Serious)
Version:5.0.95 OS:Linux (Red Hat Enterprise Linux Server 5.9)
Assigned to: CPU Architecture:Any
Tags: mysqldump
Triage: Needs Triage: D2 (Serious)

[21 Oct 2013 12:17] Kevin Rogers
Description:
We have a reasonably large database (~225G on disk) that we are in the process of migrating to a new server, by replicating to the new server during the migration and then switching all clients to the new server.

We are moving from a RHEL 5.9 based server to a RHEL 6.4 based server, using the relevant RHEL MySQL versions.

RHEL 5.9 - MySQL 5.0.95-log
RHEL 6.4 - MySQL 5.1.69-log using InnoDB plugin 

The MySQL server holds several databases, all InnoDB tables, so I used the following command to dump the database for the new server:

mysqldump --all-databases --order-by-primary --routines --triggers --opt --single-transaction --flush-privileges --master-data=1

Our server is quite busy all the time, so to avoid downtime, I used the "--single-transaction" option with the expectation this would give me output that would produce a slave consistent with the point in time the backup were initiated. i.e. the following quote from the mysqldump --single-transaction documentation states:

"It is useful only with transactional tables such as InnoDB, because then it dumps the consistent state of the database at the time when START TRANSACTION was issued without blocking any applications."

However, the new server now shows differences from the old relating to auto increment fields. The main differences are:

1) Auto increment fields are higher on the new server than the old
2) Caused by the above, some tables now have the rows that were replicated from the old server to the new, but have different primary key values.

The main problem is that mysqldump with the above options does not produce a snapshot that can be used for replication - If the database is busy enough

As best I can tell, it appears that the auto_increment value that is put into the dump file is not the value at the time the "START TRANSACTION" is issued, but the current value at the time the "SHOW CREATE TABLE" statement is issued. This causes problems if you have triggers that insert values (bug #45677) as the auto increment values used on the slave are different.

This problem means any snapshot produced with the mysqldump --single-transaction option on a busy server will not produce a copy of the database at the point in time the backup was started.

We also use a similar method for backing up our database, so our backups will not be consistent either.

How to repeat:
1) Create a test database:

-- Dumping structure for table test.a
CREATE TABLE IF NOT EXISTS `a` (
  `id` int(10) NOT NULL auto_increment,
  `data` varchar(32) default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Dumping structure for table test.b
CREATE TABLE IF NOT EXISTS `b` (
  `id` int(10) NOT NULL auto_increment,
  `data` varchar(32) default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

2) Insert large number of records into table a. I inserted 3,000,000 rows into table a, but more would have been better. The number must be large enough that a mysqldump will take longer than a few seconds. 

3) Start a mysqldump of the test database using the following command:

mysqldump --order-by-primary --routines --triggers --opt --single-transaction --flush-privileges test

4) While step (3) is executing, but before it has finished, load data into table b. I used the following PHP to insert 1,000,000 records one at a time:

#!/usr/bin/php
<?php

$db = new PDO( 'mysql:host=localhost;dbname=test', 'devteam', 'd3vt3@m' );

for ($i = 0; $i < 1000000; $i++) {
  $result = $db->exec( 'INSERT INTO b (data) VALUES (MD5(RAND()))' );
  if ( $result === 0 ) {
    print "failed\n";
    print $db->errorCode();
    print "\n";
    var_export( $db->errorInfo() );
    print "\n";
  }
}

5) Wait for the dump in step (3) to finish and look at the dump file. I've removed the header and footer comments and the inserts for the 3,000,000 records as they aren't relevant:

--
-- Table structure for table `a`
--

DROP TABLE IF EXISTS `a`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `a` (
  `id` int(10) NOT NULL auto_increment,
  `data` varchar(32) default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3000001 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `a`
--
-- ORDER BY:  `id`

LOCK TABLES `a` WRITE;
/*!40000 ALTER TABLE `a` DISABLE KEYS */;
-- 3,000,000 records were here...
/*!40000 ALTER TABLE `a` ENABLE KEYS */;
UNLOCK TABLES;

--
-- Table structure for table `b`
--

DROP TABLE IF EXISTS `b`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `b` (
  `id` int(10) NOT NULL auto_increment,
  `data` varchar(32) default NULL,
  PRIMARY KEY  (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=405 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `b`
--
-- ORDER BY:  `id`

LOCK TABLES `b` WRITE;
/*!40000 ALTER TABLE `b` DISABLE KEYS */;
/*!40000 ALTER TABLE `b` ENABLE KEYS */;
UNLOCK TABLES;

--
-- Dumping routines for database 'test'
--
DELIMITER ;;
DELIMITER ;

6) Note the SQL file in step (5) for table b has an auto_increment value of 405, but there is no data present in the dump file. This causes problems for the slave

Suggested fix:
The ideal fix would be to correct this behaviour, so that mysqldump with the --single-transaction option produces a consistent snapshot.

The less than ideal fix is to update the documentation for all relevant version (at least 5.0 and up) to state explicitly that --lock-all-tables is THE ONLY WAY to ensure the snapshot/backup is consistent.
[21 Oct 2013 13:05] Peter Laursen
I doubt very much that this is related to the --single-transaction option. 

I believe the dump is likely OK, and what goes wrong here, happens on the 5.1 target server (and not the 5.0 source server as it seems that you assume). 

First step to debug this would be to find out it what is in existence after restore is in conformance with the INSERTS in the dump/.sql-file or not. Does the unexpected behaviour occur when exporting from 5.0 or when importing to 5.1?

Also interesting questions are
1) what is the 5.1 server sql_mode? Probably it is 'empty'?
2) 'innodb_autoinc_lock_mode' variable is 0 or 1 on the 5.1 server where you import? Refer http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_autoinc_lock_m.... Default is "1" in MySQL 5.1 (and MySQL 5.0 behaves like setting was "0"). 

Setting 'innodb_autoinc_lock_mode' to "0" while importing the dump may resolve the problem if my assuption is right. Basically my assumption is that the correct iutoincrement value cannot be inserted because of the lock behaviour when innodb_autoinc_lock_mode = 1 and the server will then silently insert next possible value. 'silently' (= no errors returned) because of 'empty' sql_mode. Undoubtedly the dump uses BULK INSERTS? Anyway I may be wrong here and something else could be happening that I am not able to imagine.

Peter
(not a MySQL/Oracle person)
[21 Oct 2013 15:13] Kevin Rogers
Thanks for the reply Peter.

1) Yes, sql_mode is empty on both servers.
2) innodb_autoinc_lock_mode is 1 on the new server.

If you look at the the dump in step 5 of my steps to reproduce, you'll see that the CREATE TABLE statement for table b has an AUTO_INCREMENT value of 405, but no data is present in the data dump below.

This SQL dump implies that from the perspective of the mysqldump (step 3) the AUTO_INCREMENT value for table b was increasing even though the insert (step 4) was running in a different transaction.

It seems that by the time the mysqldump (step 3) got to exporting table b, the insert (step 4) had inserted 404 rows into table b. This is reflected in the AUTO_INCREMENT value, but there is no corresponding 404 rows of data in the export for table b.

I've been looking at this for days, as the new server in one table has different values in the auto increment column for records added since I turned on replication - I may be getting tied up in knots, but something really doesn't seem right. So far, the above steps to reproduce are the closest I've got to showing the problem.

I have a suspicion that this somehow interacts with bug #45677 which means that old versions of MySQL do not properly replicate auto increment values from INSERT statements in triggers - as the table showing differences in the auto increment column (and not just the auto increment value) is updated from a trigger...
[21 Oct 2013 16:44] Kevin Rogers
I've managed to narrow the problem a little further.

The problem affects multiple tables, but one table more significantly than the others.

The affected table, lets call is "t2", has a "hole" in it on the new server (the slave of the old server). Table "t2" has records inserted into it from an ON UPDATE TRIGGER on another table "t1".

On the slave, table t2 has a gap of 54,210 in it's auto-increment id column, but table t1 has contiguous id's. The data before the gap matches on both servers, but it's as if t2.id = t1.id + 54210 after the gap.

This appears to correspond exactly to the period of time while I was running the mysqldump that I initialised the new slave server with.

It's like the auto_increment id that was in the dump file carried on incrementing while the dump was running, and the value written to the dump file was not the value at the time the dump started, but the value at the time the dump got to the table t2. It took at least an hour for the mysqldump to get to table t2.

It then looks like replication started up ok, but when data was inserted into t2 it used the auto increment value from the create table statement.
[14 Nov 2013 19:47] Sveta Smirnova
Thank you for the report.

This is actually known problem: see bug #20786 for discussion and few patches for mysqldump which can solve this issue(especially "[28 Jun 19:54] Sveta Smirnova").

Your case is not easy to solve in 5.0 and 5.1, because this is not mysqldump bug, but the fact that table definition (*frm) is not transactional and InnoDB is transactional.

In version 5.5 and up, where transactional DDL was introduced, this issue does not exist.