Bug #46682 Replication aborts after DROP table with concurrent active transaction and lock
Submitted: 12 Aug 2009 21:55 Modified: 1 Sep 2009 11:33
Reporter: Elena Stepanova Email Updates:
Status: Can't repeat Impact on me:
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:5.4.4-alpha, azalea OS:Any
Assigned to: Jon Olav Hauglid CPU Architecture:Any
Tags: mdl

[12 Aug 2009 21:55] Elena Stepanova
The problem, described in the bug#989 -- drop table, while performed in parallel with an active transaction, can be written in binlog before the transaction -- still appears with slightly different (more complicated and less reliable) scenario which involves the 3rd connection and, apparently, a race condition. 

The main flow represented by the test case in 'How to repeat' is as follows:

the first connection (master1) has an active transaction which inserted into a table, but did not commit yet;
the second connection (master2) locks the table using WRITE lock;
the third connection (master3) attempts to drop the table and starts waiting;
master1 runs another insert, and also starts waiting;
master2 unlocks the table...

=> after that, one of two things happens:

1) master1 successfully inserts into the table, while master3 keeps waiting. In this case, after master1 commits the transaction, master3 finally drops the table, binary log contains insert followed by drop table, and the cycle ends;
2) master3 drops the table, and insert in master1 fails with error 1146 (table does not exist). In this case, after master1 commits the transaction, binary log contains drop table followed by insert, which makes slave SQL thread abort.

How to repeat:
# t/rpl_drop_table.test

# In my case the problem usually happens after 1-3 attempts, 
# although it is not 100% reliable

let $attempts = 100;

--source include/master-slave.inc
--source include/have_binlog_format_mixed.inc
--source include/have_innodb.inc

connect (master2,,root,,test,$MASTER_MYPORT,);
connect (master3,,root,,test,$MASTER_MYPORT,);

while ($attempts)

	connection default;
	use test;
	purge binary logs before now();

	drop table if exists tbl_rpl_drop;
	create table tbl_rpl_drop ( i int ) engine = innodb;
	connection master1;
	set autocommit = 0;
	insert into tbl_rpl_drop values (1);
	connection master2;
	lock table tbl_rpl_drop WRITE; 
	connection master3;
	send drop table tbl_rpl_drop;
	connection master1;
	send insert into tbl_rpl_drop values (2);
	connection master2;
	unlock tables;
	connection default;
	select sleep(0.3);
	let $table_exists = `select count(*) from information_schema.tables 
		where table_schema='test' and table_name = 'tbl_rpl_drop'`;

	if (!$table_exists) 
		--echo # Table does not exist, which means it was dropped
		let $attempts = 0;
		connection master3;
		echo # This means connection which dropped the table 
		# returned success: $mysql_errno...;
		connection master1;
		echo # And connection which did insert 
		# returned error 1146: $mysql_errno...;
	if ($table_exists)
		dec $attempts;
		--echo # Table still exists, $attempts attempts left
		connection master1;
		echo # This means connection which did insert 
		# returned success: $mysql_errno...;
		connection master3;
		echo # And connection which did drop also 
		# returned success afterwards: $mysql_errno...
		select sleep(0.3);

--real_sleep 1
show binlog events;
connection slave;
let $sql_error = query_get_value( show slave status, 'Last_SQL_Error', 1 );
--echo Last SQL Error from show slave status: $sql_error
[13 Aug 2009 5:59] Sveta Smirnova
Thank you for the report.

Verified as described.