Bug #40188 | Replication failure on RBR + Innodb + implicit commit + truncate/drop table | ||
---|---|---|---|
Submitted: | 20 Oct 2008 19:51 | Modified: | 21 May 2009 2:21 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Locking | Severity: | S2 (Serious) |
Version: | 6.0 | OS: | Any |
Assigned to: | Davi Arnaut | CPU Architecture: | Any |
[20 Oct 2008 19:51]
Philip Stoev
[20 Oct 2008 19:52]
Philip Stoev
YY file
Attachment: bug40188.yy (application/octet-stream, text), 773 bytes.
[20 Oct 2008 19:52]
Philip Stoev
ZZ file
Attachment: bug40188.zz (text/plain), 125 bytes.
[20 Oct 2008 20:00]
Philip Stoev
To reproduce with the Random Query Generator, clone the mysql-test-extra-6.0 tree and then execute: $ cd mysql-test-extra-6.0/mysql-test/gentest $ perl runall.pl \ --basedir=/path/to/mysql-5.1/ \ --rpl_mode=row \ --gendata=bug40188.zz \ --grammar=bug40188.yy \ --mysqld=--innodb_lock_wait_timeout=1 \ --queries=100000 \ --duration=1200 \ --engine=Innodb This will start a concurrent test and will abort as soon as the slave thread stops.
[20 Oct 2008 20:08]
Valeriy Kravchuk
Thank you for a problem report. How many threads should I start? I get different errors (Unknown table 'a' with 5 threads, Duplicate entry '5' for key PRIMARY with 2 threads etc).
[20 Oct 2008 20:11]
Valeriy Kravchuk
Verified just as described: openxs@suse:/home2/openxs/bzr/mysql-test-extra-6.0/mysql-test/gentest> perl runall.pl --basedir=/home2/openxs/dbs/5.1 --rpl_mode=row --gendata=/home2/openxs/bug40188.zz --grammar=/home2/openxs/bug40188.yy --mysqld=--innodb_lock_wait_timeout=1 --queries=1000000 --duration=1200 --engine=InnoDB # 03:17:48 Please see http://inside.mysql.com/wiki/Category:RandomQueryGenerator for more information on this test framework. # 03:17:48 Starting # 03:17:48 # runall.pl \ # 03:17:48 # --basedir=/home2/openxs/dbs/5.1 \ # 03:17:48 # --rpl_mode=row \ # 03:17:48 # --gendata=/home2/openxs/bug40188.zz \ # 03:17:48 # --grammar=/home2/openxs/bug40188.yy \ # 03:17:48 # --mysqld=--innodb_lock_wait_timeout=1 \ # 03:17:48 # --queries=1000000 \ # 03:17:48 # --duration=1200 \ # 03:17:48 # --engine=InnoDB # 03:17:48 Running perl mysql-test-run.pl --start-and-exit --mysqld=--innodb --mysqld="--innodb_lock_wait_timeout=1" --mysqld=--default-storage-engine=InnoDB --mysqld=--relay-log=master-relay-bin --mysqld=--loose-innodb --mysqld=--loose-falcon-debug-mask=2 --mysqld=--secure-file-priv= --mysqld=--max-allowed-packet=16Mb --master_port=19306 rpl_alter --slave_port=19308 . Logging: mysql-test-run.pl --start-and-exit --mysqld=--innodb --mysqld=--innodb_lock_wait_timeout=1 --mysqld=--default-storage-engine=InnoDB --mysqld=--relay-log=master-relay-bin --mysqld=--loose-innodb --mysqld=--loose-falcon-debug-mask=2 --mysqld=--secure-file-priv= --mysqld=--max-allowed-packet=16Mb --master_port=19306 rpl_alter --slave_port=19308 MySQL Version 5.1.30 Using dynamic switching of binlog format Using default engine 'InnoDB' Using ndbcluster when necessary, mysqld supports it Setting mysqld to support SSL connections Binaries are debug compiled Using MTR_BUILD_THREAD = 0 Using MASTER_MYPORT = 19306 Using MASTER_MYPORT1 = 19307 Using SLAVE_MYPORT = 19308 Using SLAVE_MYPORT1 = 19309 Using SLAVE_MYPORT2 = 19310 Using NDBCLUSTER_PORT = 9311 Using NDBCLUSTER_PORT_SLAVE = 9312 Using IM_PORT = 9313 Using IM_MYSQLD1_PORT = 9314 Using IM_MYSQLD2_PORT = 9315 - adding combinations Killing Possible Leftover Processes Removing Stale Files Creating Directories Installing Master Database Installing Slave1 Database ======================================================= TEST RESULT TIME (ms) ------------------------------------------------------- Servers started, exiting # 03:17:54 Establishing replication, mode row ... # 03:17:54 Starting # 03:17:54 # gentest.pl \ # 03:17:54 # --gendata=/home2/openxs/bug40188.zz \ # 03:17:54 # --engine=InnoDB \ # 03:17:54 # --rpl_mode=row \ # 03:17:54 # --threads=10 \ # 03:17:54 # --queries=1000000 \ # 03:17:54 # --duration=1200 \ # 03:17:54 # --dsn1=dbi:mysql:host=127.0.0.1:port=19306:user=root:database=test \ # 03:17:54 # --grammar=/home2/openxs/bug40188.yy # 03:17:54 Creating table table1_innodb_int_autoinc . # 03:17:55 Reporters: ErrorLog, Backtrace # 03:17:55 Validators: FalconErrors, ErrorMessageCorruption, ReplicationSlaveStatus # 03:17:55 Starting 10 processes, 1000000 queries each, duration 1200 seconds. # 03:17:55 Started periodic reporting process... # 03:17:57 Query: CREATE TABLE IF NOT EXISTS d SELECT * FROM `table1_innodb_int_autoinc` ; TRUNCATE TABLE x failed: 1205 Lock wait timeout exceeded; try restarting transaction # 03:18:01 Query: CREATE TABLE IF NOT EXISTS z SELECT * FROM `e` ; TRUNCATE TABLE n failed: 1205 Lock wait timeout exceeded; try restarting transaction # 03:18:02 Query: CREATE TABLE IF NOT EXISTS x SELECT * FROM `table1_innodb_int_autoinc` ; DROP TABLE IF EXISTS k failed: 1205 Lock wait timeout exceeded; try restarting transaction # 03:18:02 Query: CREATE TABLE IF NOT EXISTS v SELECT * FROM `table1_innodb_int_autoinc` ; TRUNCATE TABLE m failed: 1205 Lock wait timeout exceeded; try restarting transaction # 03:18:04 Query: CREATE TABLE IF NOT EXISTS z SELECT * FROM `table1_innodb_int_autoinc` ; TRUNCATE TABLE i failed: 1205 Lock wait timeout exceeded; try restarting transaction # 03:18:04 Query: CREATE TABLE IF NOT EXISTS q SELECT * FROM `table1_innodb_int_autoinc` ; TRUNCATE TABLE h failed: 1205 Lock wait timeout exceeded; try restarting transaction # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Killing periodic reporting process with pid 7319... # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:05 Slave thread has stopped with error: Error 'Table 'test.e' doesn't exist' on opening tables # 03:18:10 Killing child process with pid 7328... # 03:18:10 Killing child process with pid 7323... # 03:18:10 Killing child process with pid 7320... # 03:18:10 Killing child process with pid 7335... # 03:18:10 Killing child process with pid 7325... # 03:18:10 Killing child process with pid 7333... # 03:18:10 Killing child process with pid 7334... # 03:18:10 Killing child process with pid 7331... # 03:18:10 Killing child process with pid 7322... # 03:18:10 Test completed with failure status 103.
[20 Oct 2008 20:15]
Philip Stoev
The error from this bug report happened at 10 threads for me. Can you please paste the entire text of the other errors you are getting along with how many threads are needed to produce them?
[20 Apr 2009 16:58]
Philip Stoev
6.0-main is still affected by the original test case: # 19:58:06 Slave SQL thread has stopped with error: Error 'Table 'test.m' doesn't exist' on query. Default database: 'test'. Query: 'TRUNCATE TABLE m'
[4 May 2009 14:43]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/73313 2772 Davi Arnaut 2009-05-04 Bug#40188: Replication failure on RBR + Innodb + implicit commit + truncate/drop table The problem is that in some cases transaction-long metadata locks could be released before the transaction was committed. This could happen when a active transaction was ended by a "START TRANSACTION" or "BEGIN" statement, in which case the metadata locks would be released before the actual commit of the active transaction. The solution is to defer the release of metadata locks to after the transaction has been implicitly committed. No test case is provided as the effort to provide one is too disproportional to the size of the fix. @ sql/transaction.cc Commit before transaction-long metadata locks are released.
[6 May 2009 18:52]
Konstantin Osipov
The bug is tagged for 5.1, even though it's repeatable only against 6.0. Please re-triage.
[6 May 2009 20:40]
Davi Arnaut
Queued to 6.0-runtime
[16 May 2009 8:41]
Bugs System
Pushed into 6.0.12-alpha (revid:alik@sun.com-20090516083402-0avycdy7w6dnn0tv) (version source revid:serg@mysql.com-20090512164111-jz224w5781lhe7wd) (merge vers: 6.0.12-alpha) (pib:6)
[21 May 2009 2:21]
Paul DuBois
Noted in 6.0.12 changelog. If a transaction was implicitly committed by a START TRANSACTION or BEGIN statement, metadata locks held by the transaction incorrectly could be released before the commit actually occurred.