Bug #47924 main.log_tables times out sporadically
Submitted: 8 Oct 2009 19:08 Modified: 30 Dec 2010 8:34
Reporter: Alexander Nozdrin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.1 OS:Any
Assigned to: Dmitry Shulga CPU Architecture:Any
Tags: experimental, pb2, sporadic, test failure, timeout

[8 Oct 2009 19:08] Alexander Nozdrin
Description:
main.log_tables times out sporadically in PB.

Symptoms:
main.log_tables                          w6 [ fail ]  timeout after 900 seconds
        Test ended at 2009-10-07 21:20:21

Test case timeout after 900 seconds

== /export/home/pb2/test/sb_3-801427-1254940098.43/mysql-5.1.40-linux-x86_64-test/mysql-test/var-n_mix/6/log/log_tables.log == 
begin
declare v1 int default 0;
declare ER_BAD_LOG_STATEMENT condition for 1575;
declare continue handler for ER_BAD_LOG_STATEMENT begin end;
while v1 < loops do
set @old_log_state = @@global.general_log;
set global general_log = 'OFF';
alter table mysql.general_log engine = CSV;
set global general_log = @old_log_state;
set v1 = v1 + 1;
end while;
end//
set @iterations=100;
"Serial test (proc25422_truncate_slow)"
call proc25422_truncate_slow(@iterations);
"Serial test (proc25422_truncate_general)"
call proc25422_truncate_general(@iterations);
"Serial test (proc25422_alter_slow)"
call proc25422_alter_slow(@iterations);
"Serial test (proc25422_alter_general)"

 == /export/home/pb2/test/sb_3-801427-1254940098.43/mysql-5.1.40-linux-x86_64-test/mysql-test/var-n_mix/6/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/export/home/pb2/test/sb_3-801427-1254940098.43/tmp/g2sp4wPyWd/6/mysqld.1.sock' (111)

How to repeat:
XRef:
http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2373477&...
[8 Oct 2009 19:09] Alexander Nozdrin
The test case is marked experimental starting from mysql-trunk-bugfixing.
[11 Oct 2009 6:08] Sveta Smirnova
Thank you for the report.

Verified as described. I used repeat=100 to repeat it. (Failed on second attempt)

Test case timeout after 900 seconds

== /users/ssmirnova/blade12/src/mysql-5.1/mysql-test/var/log/log_tables.log == 
`command_type` varchar(64) NOT NULL,
`argument` mediumtext NOT NULL
) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='General log';
CREATE TABLE `slow_log` (
`start_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP,
`user_host` mediumtext NOT NULL,
`query_time` time NOT NULL,
`lock_time` time NOT NULL,
`rows_sent` int(11) NOT NULL,
`rows_examined` int(11) NOT NULL,
`db` varchar(512) NOT NULL,
`last_insert_id` int(11) NOT NULL,
`insert_id` int(11) NOT NULL,
`server_id` int(10) unsigned NOT NULL,
`sql_text` mediumtext NOT NULL
) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='Slow log';
set global general_log='ON';
set global slow_query_log='ON';
use test;

 == /users/ssmirnova/blade12/src/mysql-5.1/mysql-test/var/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/users/ssmirnova/blade12/src/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock' (111)

 - saving '/users/ssmirnova/blade12/src/mysql-5.1/mysql-test/var/log/main.log_tables/' to '/users/ssmirnova/blade12/src/mysql-5.1/mysql-test/var/log/main.log_tables/'
 - found 'core.19585' (0/5)
Trying 'dbx' to get a backtrace

Trying 'gdb' to get a backtrace
Core generated by '/users/ssmirnova/blade12/src/mysql-5.1/sql/mysqld'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/users/ssmirnova/blade12/src/mysql-5.1/sql/mysqld --defaults-group-suffix=.1 --'.
Program terminated with signal 6, Aborted.
#0  0x0000003429e0b002 in pthread_kill () from /lib64/libpthread.so.0
#0  0x0000003429e0b002 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000b1b09d in my_write_core (sig=6) at stacktrace.c:310
#2  0x0000000000693d22 in handle_segfault (sig=6) at mysqld.cc:2570
#3  <signal handler called>
#4  0x00000034292c6952 in __select_nocancel () from /lib64/libc.so.6
#5  0x0000000000697b97 in handle_connections_sockets (arg=0x0) at mysqld.cc:5033
#6  0x0000000000696fbe in main (argc=10, argv=0x7fffddeadc28) at mysqld.cc:4517

Thread 3 (process 19587):
#0  0x0000003429e0da78 in do_sigwait () from /lib64/libpthread.so.0
#1  0x0000003429e0db1d in sigwait () from /lib64/libpthread.so.0
#2  0x0000000000694585 in signal_hand (arg=0x0) at mysqld.cc:2772
#3  0x0000003429e061b5 in start_thread () from /lib64/libpthread.so.0
#4  0x00000034292cd39d in clone () from /lib64/libc.so.6
#5  0x0000000000000000 in ?? ()
Thread 2 (process 19723):
#0  0x0000003429e0a376 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x0000000000b26962 in safe_cond_wait (cond=0x11738e0, mp=0x1172f60, file=0xc1b195 "lock.cc", line=1421) at thr_mutex.c:237
#2  0x000000000068d2e4 in lock_global_read_lock (thd=0x2aaaac007ef8) at lock.cc:1421
#3  0x00000000006b31d5 in reload_acl_and_cache (thd=0x2aaaac007ef8, options=16388, tables=0x0, write_to_binlog=0x40a8110f) at sql_parse.cc:6897
#4  0x00000000006ab2ca in mysql_execute_command (thd=0x2aaaac007ef8) at sql_parse.cc:3974
#5  0x00000000006b0f07 in mysql_parse (thd=0x2aaaac007ef8, inBuf=0x158c66e8 "flush tables with read lock", length=27, found_semicolon=0x40a81ec0) at sql_parse.cc:5963
#6  0x00000000006a348d in dispatch_command (command=COM_QUERY, thd=0x2aaaac007ef8, packet=0x2aaaac00b3a9 "flush tables with read lock", packet_length=27) at sql_parse.cc:1224
#7  0x00000000006a2488 in do_command (thd=0x2aaaac007ef8) at sql_parse.cc:865
#8  0x00000000006a07a1 in handle_one_connection (arg=0x2aaaac007ef8) at sql_connect.cc:1127
#9  0x0000003429e061b5 in start_thread () from /lib64/libpthread.so.0
#10 0x00000034292cd39d in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()

Thread 1 (process 19585):
#0  0x0000003429e0b002 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000b1b09d in my_write_core (sig=6) at stacktrace.c:310
#2  0x0000000000693d22 in handle_segfault (sig=6) at mysqld.cc:2570
#3  <signal handler called>
#4  0x00000034292c6952 in __select_nocancel () from /lib64/libc.so.6
#5  0x0000000000697b97 in handle_connections_sockets (arg=0x0) at mysqld.cc:5033
#6  0x0000000000696fbe in main (argc=10, argv=0x7fffddeadc28) at mysqld.cc:4517
[13 Oct 2010 11:46] Davi Arnaut
The server is 'leaking' the GRL protection on failure to rename table. That is, it sets a protection against a new global read lock before the rename operation but does not release the protection if the rename operation fails. The fix is to make it release the protection, otherwise any subsequent attempt to grab a global read lock will hang.
[2 Nov 2010 11:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/122517

3548 Dmitry Shulga	2010-11-02
      A fix and a test case for Bug#47924 -main.log_tables times out
      sporadically.
      
      The cause of the sporadic time out was a leaking protection
      against the global read lock, taken by the RENAME statement,
      and not released in case of an error occurred during RENAME.
      The leaking protection counter would lead to the value of
      protect_against_global_read never dropping to 0.
      Consequently FLUSH TABLES in all connections, including the
      one that leaked the protection, could not proceed.
       
      The fix is to ensure that all branchesin RENAME code properly
      release GRL protection.
     @ mysql-test/r/log_tables.result
        Added results for test for bug#47924.
     @ mysql-test/t/log_tables.test
        Added test for bug#47924.
     @ sql/sql_rename.cc
        mysql_rename_tables() modified: replaced return from function
        to goto to clean up code block in case of error.
[7 Nov 2010 17:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/123036

3554 Dmitry Shulga	2010-11-07
      A fix and a test case for Bug#47924 -main.log_tables times out
      sporadically.
      
      The cause of the sporadic time out was a leaking protection
      against the global read lock, taken by the RENAME statement,
      and not released in case of an error occurred during RENAME.
      The leaking protection counter would lead to the value of
      protect_against_global_read never dropping to 0.
      Consequently FLUSH TABLES in all connections, including the
      one that leaked the protection, could not proceed.
       
      The fix is to ensure that all branchesin RENAME code properly
      release GRL protection.
     @ mysql-test/r/log_tables.result
        Added results for test for bug#47924.
     @ mysql-test/t/log_tables.test
        Added test for bug#47924.
     @ sql/sql_rename.cc
        mysql_rename_tables() modified: replaced return from function
        to goto to clean up code block in case of error.
[7 Nov 2010 18:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/123038

3261 Dmitry Shulga	2010-11-08 [merge]
      Auto-merge from mysql-5.1-bugteam for bug#47924.
[7 Nov 2010 18:47] Dmitry Shulga
Pushed to mysql-5.1-bugteam and mysql-5.5-bugteam.
[8 Nov 2010 10:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/123060

3331 Dmitry Shulga	2010-11-08 [merge]
      Auto-merge from mysql-5.5-bugteam for bug#47924.
[13 Nov 2010 16:16] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:29] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:alexander.nozdrin@oracle.com-20101113152540-gxro4g0v29l27f5x) (pib:21)
[14 Dec 2010 23:42] Paul DuBois
Noted in 5.1.54, 5.5.8 changelogs.

A failed RENAME TABLE operation could prevent a FLUSH TABLES WITH
READ LOCK from completing.
[15 Dec 2010 5:51] Bugs System
Pushed into mysql-5.1 5.1.55 (revid:sunanda.menon@oracle.com-20101215054055-vgwki317xg1wphhh) (version source revid:sunanda.menon@oracle.com-20101215054055-vgwki317xg1wphhh) (merge vers: 5.1.55) (pib:23)
[16 Dec 2010 22:26] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)
[22 Dec 2010 7:38] Anitha Gopi
This test is still in experimental group. If the test is passing please move this out of default.experimental. Re opening the bug
[30 Dec 2010 8:34] Dmitry Shulga
Test main.log_tables was moved out from default.experimental.