Bug #50517 mysqldump.test sometimes times out
Submitted: 21 Jan 2010 17:39 Modified: 2 Mar 2010 1:44
Reporter: Ingo Strüwing Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Backup Severity:S2 (Serious)
Version:6.0.14-alpha OS:Any
Assigned to: Ingo Strüwing CPU Architecture:Any

[21 Jan 2010 17:39] Ingo Strüwing
Description:
mysqldump.test sometimes times out:

main.mysqldump [ fail ] timeout after 900 seconds
        Test ended at 2010-01-07 16:35:15

Test case timeout after 900 seconds

== G:/pb2/test/sb_1-1159846-1262876299.98/mysql-5.6.0-beta-win-x86_64-test/mysql-test/var-n_mix/log/mysqldump.log == 

DROP TRIGGER `test trig`;
DROP TABLE `t1 test`;
DROP TABLE `t2 test`;
#
# Bug#12838 mysqldump -x with views exits with error
#
drop table if exists t1;
create table t1 (a int, b varchar(32), c varchar(32));
insert into t1 values (1, 'first value', 'xxxx');
insert into t1 values (2, 'second value', 'tttt');
insert into t1 values (3, 'third value', 'vvv vvv');
create view v1 as select * from t1;
create view v0 as select * from v1;
create view v2 as select * from v0;
select * from v2;
a	b	c
1	first value	xxxx
2	second value	tttt
3	third value	vvv vvv

 == G:/pb2/test/sb_1-1159846-1262876299.98/mysql-5.6.0-beta-win-x86_64-test/mysql-test/var-n_mix/tmp/analyze-timeout-mysqld.1.err ==
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
170	root	localhost:64093	NULL	Query	894	Waiting to get readlock	FLUSH TABLES WITH READ LOCK
171	root	localhost:64109	NULL	Query	0	NULL	SHOW PROCESSLIST

This happens almost only on backup trees, so set category to backup.
The hang happens when mysqldump is called with the -x option. It calls
FLUSH TABLES WITH READ LOCK. Core dump analyze shows that
'protect_against_global_read_lock' is 1. Since no thread is active, which
could hold the protection, some earlier statement must have ended without
resetting this variable.

'protect_against_global_read_lock' is a static variable in lock.cc.
Its value is set in wait_if_global_read_lock() and
start_waiting_global_read_lock(). In both cases it is correctly protected
by LOCK_global_read_lock. The code in lock.cc is correct.

The probable cause for this bug is a successful call to
wait_if_global_read_lock() without a following
start_waiting_global_read_lock().

How to repeat:
Use Pushbuild Xref2 to find failing sqldump test cases.

Suggested fix:
Find the place(s), where start_waiting_global_read_lock()
has been forgotten after a successful wait_if_global_read_lock().
Add the missing call(s).
[21 Jan 2010 17:44] Ingo Strüwing
I suspect a bug in global read lock handling, which could block all changes to the data and meta data. That is, modifying DML and DDL.
[21 Jan 2010 19:28] Ingo Strüwing
I found a possible bypass of start_waiting_global_read_lock() in ha_commit_trans(), where we extended a condition around wait_if_global_read_lock() with BCB_instance->get_shared_lock(). If wait_if_global_read_lock() succeeds and BCB_instance->get_shared_lock() fails, we end up with and incremented 'protect_against_global_read_lock', which is not decremented. Since this change is backup related, and the backup tree hasn't been merged into main for some time, it is a good explanation that the problem pops up in backup trees only.

I found another bypass in log table renaming. But this code exists since 2006. It is pretty unlikely to be the cause for recent timeouts of mysqldump.test.

I have code that fixes the above places. But I don't have a test case yet.
[22 Jan 2010 19:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/97919

3816 Ingo Struewing	2010-01-22
      Bug#50517 - mysqldump.test sometimes times out
      
      mysqldump timed out sporadically on backup trees.
      mysqldump -x blocked in FLUSH TABLES WITH READ LOCK.
      The global variable protect_against_global_read_lock
      had been set to 1 and no other thread was executing
      a statement.
      
      A successful wait_if_global_read_lock() increments
      protect_against_global_read_lock.
      start_waiting_global_read_lock() decrements it.
      
      There were execution paths where wait_if_global_read_lock()
      could be called and start_waiting_global_read_lock()
      skipped. This patch closes these gaps.
     @ mysql-test/suite/backup/r/backup_bml_transaction.result
        Bug#50517 - mysqldump.test sometimes times out
        New test result.
     @ mysql-test/suite/backup/t/backup_bml_transaction.test
        Bug#50517 - mysqldump.test sometimes times out
        New test case.
     @ sql/handler.cc
        Bug#50517 - mysqldump.test sometimes times out
        Added a call to start_waiting_global_read_lock() if
        wait_if_global_read_lock() succeeds and BCB_instance->get_shared_lock
        fails.
        Combined two branches that depend on 'rw_trans' into one.
        Added sync point 'before_ha_commit_trans_grl'.
     @ sql/sql_class.cc
        Bug#50517 - mysqldump.test sometimes times out
        Added a security check to avoid that a thread can end with
        a protection against global read lock left behind.
     @ sql/sql_parse.cc
        Bug#50517 - mysqldump.test sometimes times out
        Fixed macro to compile a sync point.
     @ sql/sql_rename.cc
        Bug#50517 - mysqldump.test sometimes times out
        Fixed error handling to call start_waiting_global_read_lock()
        before return.
[22 Jan 2010 19:39] Chuck Bell
Approved. Made one small suggestion.
[22 Jan 2010 21:34] Hema Sridharan
Patch approved
[22 Jan 2010 22:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/97948

3816 Ingo Struewing	2010-01-22
      Bug#50517 - mysqldump.test sometimes times out
      
      mysqldump timed out sporadically on backup trees.
      mysqldump -x blocked in FLUSH TABLES WITH READ LOCK.
      The global variable protect_against_global_read_lock
      had been set to 1 and no other thread was executing
      a statement.
      
      A successful wait_if_global_read_lock() increments
      protect_against_global_read_lock.
      start_waiting_global_read_lock() decrements it.
      
      There were execution paths where wait_if_global_read_lock()
      could be called and start_waiting_global_read_lock()
      skipped. This patch closes these gaps.
     @ mysql-test/suite/backup/r/backup_bml_transaction.result
        Bug#50517 - mysqldump.test sometimes times out
        New test result.
     @ mysql-test/suite/backup/t/backup_bml_transaction.test
        Bug#50517 - mysqldump.test sometimes times out
        New test case.
     @ sql/handler.cc
        Bug#50517 - mysqldump.test sometimes times out
        Added a call to start_waiting_global_read_lock() if
        wait_if_global_read_lock() succeeds and BCB_instance->get_shared_lock
        fails.
        Combined two branches that depend on 'rw_trans' into one.
        Added sync point 'before_ha_commit_trans_grl'.
     @ sql/sql_class.cc
        Bug#50517 - mysqldump.test sometimes times out
        Added a security check to avoid that a thread can end with
        a protection against global read lock left behind.
     @ sql/sql_parse.cc
        Bug#50517 - mysqldump.test sometimes times out
        Fixed macro to compile a sync point.
     @ sql/sql_rename.cc
        Bug#50517 - mysqldump.test sometimes times out
        Fixed error handling to call start_waiting_global_read_lock()
        before return.
[22 Jan 2010 23:24] Ingo Strüwing
Queued to mysql-6.0-backup-merge.
[20 Feb 2010 9:17] Bugs System
Pushed into 6.0.14-alpha (revid:ingo.struewing@sun.com-20100218152520-s4v1ld76bif06eqn) (version source revid:ingo.struewing@sun.com-20100122220629-sh8tnmtr3haz7dgl) (merge vers: 6.0.14-alpha) (pib:16)
[25 Feb 2010 19:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/101527

3119 Ingo Struewing	2010-02-25
      Backport of revid:ingo.struewing@sun.com-20100122220629-sh8tnmtr3haz7dgl
        Bug#50517 - mysqldump.test sometimes times out
        
        mysqldump timed out sporadically on backup trees.
        mysqldump -x blocked in FLUSH TABLES WITH READ LOCK.
        The global variable protect_against_global_read_lock
        had been set to 1 and no other thread was executing
        a statement.
        
        A successful wait_if_global_read_lock() increments
        protect_against_global_read_lock.
        start_waiting_global_read_lock() decrements it.
        
        There were execution paths where wait_if_global_read_lock()
        could be called and start_waiting_global_read_lock()
        skipped. This patch closes these gaps.
     @ mysql-test/suite/backup/r/backup_bml_transaction.result
        Backport of revid:ingo.struewing@sun.com-20100122220629-sh8tnmtr3haz7dgl
            Bug#50517 - mysqldump.test sometimes times out
            New test result.
     @ mysql-test/suite/backup/t/backup_bml_transaction.test
        Backport of revid:ingo.struewing@sun.com-20100122220629-sh8tnmtr3haz7dgl
            Bug#50517 - mysqldump.test sometimes times out
            New test case.
     @ sql/handler.cc
        Backport of revid:ingo.struewing@sun.com-20100122220629-sh8tnmtr3haz7dgl
            Bug#50517 - mysqldump.test sometimes times out
            Added a call to start_waiting_global_read_lock() if
            wait_if_global_read_lock() succeeds and BCB_instance->get_shared_lock
            fails.
            Combined two branches that depend on 'rw_trans' into one.
            Added sync point 'before_ha_commit_trans_grl'.
     @ sql/sql_rename.cc
        Backport of revid:ingo.struewing@sun.com-20100122220629-sh8tnmtr3haz7dgl
            Bug#50517 - mysqldump.test sometimes times out
            Fixed error handling to call start_waiting_global_read_lock()
            before return.
[2 Mar 2010 1:44] Paul DuBois
Noted in 6.0.14 changelog.

Global read lock handling was incorrect, which could lead to 
mysqldump timeouts.