Bug #43230 SELECT ... FOR UPDATE can hang with FLUSH TABLES WITH READ LOCK indefinitely
Submitted: 26 Feb 2009 15:15 Modified: 15 May 2009 2:05
Reporter: Johan Idrén Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:5.0-bzr, 5.1-bzr OS:Any
Assigned to: Davi Arnaut CPU Architecture:Any

[26 Feb 2009 15:15] Johan Idrén
Description:
If a SELECT ... FOR UPDATE is executed at the same time as FLUSH TABLES WITH READ LOCK (for example via mysqldump using --master-data), the two can lock endlessly, the SELECT ... FOR UPDATE "Waiting for release of readlock" and the FLUSH.. "Flushing tables", waiting until either is killed, SHOW TABLE STATUS is run or ANALYZE TABLE is run.

Affects both 5.0-bzr and 5.1-bzr, built from bzr 2009-02-25. Tested with both MyISAM and InnoDB.

How to repeat:
mysql> CREATE DATABASE TEST;
mysql> CREATE TABLE `buggy` (
                    `id` int NOT NULL,
                     KEY `idx1` (`id`)
                ) ENGINE=MyISAM;
mysql> INSERT INTO buggy VALUES(13);

In one terminal

shell> while true; do mysql -uroot -e "SELECT id FROM buggy WHERE id=13 FOR UPDATE" test ; done

In another

shell> while true ; do mysql -uroot -e "FLUSH TABLES WITH READ LOCK" test ; done

Wait for
...
| 101979 | root        | localhost | test | Query   |   85 | Waiting for release of readlock                                       | SELECT id FROM buggy WHERE id = 13 FOR UPDATE | 
| 101980 | root        | localhost | test | Query   |   85 | Flushing tables                                                       | FLUSH TABLES WITH READ LOCK                    | 
+--------+-------------+-----------+------+---------+------+-----------------------------------------------------------------------+------------------------------------------------+

It will hang until analyzing buggy or running show table status.

Suggested fix:
Find where it's hanging and make sure it can't.
[9 Mar 2009 10:15] Johan Idrén
For older versions of mysqld (tested on 5.0.22), killing one of the threads from the client works, while SHOW TABLE STATUS and ANALYZE TABLE does not.
[12 Mar 2009 10:05] Johan Idrén
mysqladmin debug output:

Status information:

Current dir: /usr/local/data/50/
Running threads: 4  Stack size: 262144
Current locks:
lock: 0x100866f40:

Key caches:
default
Buffer_size:      16777216
Block_size:           1024
Division_limit:        100
Age_limit:             300
blocks used:             0
not flushed:             0
w_requests:              0
writes:                  0
r_requests:              0
reads:                   0

handler status:
read_key:            0
read_next:           0
read_rnd             0
read_first:       7208
write:               0
delete               0
update:              0

Table status:
Opened tables:       5733
Open tables:            1
Open files:            10
Open streams:           0

Alarm status:
Active alarms:   1
Max used alarms: 5
Next alarm time: 28790
[16 Mar 2009 12:20] Davi Arnaut
There is a race between setting the global read lock and waiting for all open tables to be closed. The problem is a circular wait where a thread holding "old" open tables will wait for the global read lock to be released while the thread holding the global read lock will wait for all "old" open tables to be closed -- the flush part of flush tables with read lock. This problem is difficult to reproduce on 5.0 and 5.1 because there are spurious refresh broadcasts, but on 6.0 the race gap is much wider and consequently it is easier to reproduce the lock up there.

Test case:

set global general_log=off;
set global slow_query_log=off;
create table t1 (a int);
connect (con1,localhost,root,,);
connect (con2,localhost,root,,);
connection con1;
set debug_sync='sp1 SIGNAL b1 WAIT_FOR b2';
set debug_sync='sp2 SIGNAL b3 WAIT_FOR b4';
--send select * from t1 where a = 13 for update;
connection con2;
set debug_sync='now WAIT_FOR b1';
set debug_sync='sp3 SIGNAL b2 WAIT_FOR b3';
set debug_sync='sp4 SIGNAL b4';
flush tables with read lock;

Debug sync points:

=== modified file 'sql/lock.cc'
--- sql/lock.cc	2009-03-13 18:28:10 +0000
+++ sql/lock.cc	2009-03-16 11:56:27 +0000
@@ -265,6 +265,8 @@ MYSQL_LOCK *mysql_lock_tables(THD *thd, 
                                    &write_lock_used)))
       break;
 
+    DEBUG_SYNC(thd, "sp1");
+
     if (global_read_lock && write_lock_used &&
         ! (flags & MYSQL_LOCK_IGNORE_GLOBAL_READ_LOCK))
     {
@@ -1300,6 +1302,7 @@ bool wait_if_global_read_lock(THD *thd, 
 	   (!abort_on_refresh || thd->version == refresh_version))
     {
       DBUG_PRINT("signal", ("Waiting for COND_global_read_lock"));
+      DEBUG_SYNC(thd, "sp2");
       (void) pthread_cond_wait(&COND_global_read_lock, &LOCK_global_read_lock);
       DBUG_PRINT("signal", ("Got COND_global_read_lock"));
     }

=== modified file 'sql/sql_base.cc'
--- sql/sql_base.cc	2009-03-09 12:17:41 +0000
+++ sql/sql_base.cc	2009-03-16 11:49:40 +0000
@@ -904,6 +904,8 @@ bool close_cached_tables(THD *thd, TABLE
   DBUG_ENTER("close_cached_tables");
   DBUG_ASSERT(thd || (!wait_for_refresh && !tables));
 
+  DEBUG_SYNC(thd, "sp3");
+
   if (!have_lock)
     pthread_mutex_lock(&LOCK_open);
   if (!tables)
@@ -1027,6 +1029,7 @@ bool close_cached_tables(THD *thd, TABLE
     if (found)
     {
       DBUG_PRINT("signal", ("Waiting for COND_refresh"));
+      DEBUG_SYNC(thd, "sp4");
       pthread_cond_wait(&COND_refresh,&LOCK_open);
     }
[24 Mar 2009 12:28] Johan Idrén
Davi,

Any update on possibility to fix this? Time to fix? Any progress?
[27 Mar 2009 13:11] Davi Arnaut
Ran,

The deadlock is not actually a matter of releasing locks, the deadlock is due to a race condition where a thread waits for a event (condition) that was already signaled by a concurrent thread.

So to "resolve" the deadlock, one might artificially trigger the broadcast of the event by locking and unlocking a foobar (unrelated) table:

mysql> lock tables foobar read;
mysql> unlock tables;
[1 Apr 2009 23:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71150

2724 Davi Arnaut	2009-04-01
      Bug#43230: SELECT ... FOR UPDATE can hang with FLUSH TABLES WITH READ LOCK indefinitely
      
      The problem is that a SELECT .. FOR UPDATE statement might open
      a table and later wait for a impeding global read lock without
      noticing whether it is holding a table that is being waited upon
      the the flush phase of the process that took the global read
      lock.
      
      The solution is to make a SELECT .. FOR UPDATE statement wait
      for a impending global read lock before opening the tables.
      If there is no impending global read lock, the statement raises
      a temporary protection against global read locks and progresses
      smoothly towards completion.
     @ mysql-test/r/lock_multi.result
        Add test case result for Bug#43230
     @ mysql-test/t/lock_multi.test
        Add test case for Bug#43230
     @ sql/sql_lex.cc
        Initialize flag.
     @ sql/sql_lex.h
        Add a flag to the lexer.
     @ sql/sql_parse.cc
        Wait for the global read lock is a write lock is going to be
        taken. The wait is done before opening tables.
     @ sql/sql_yacc.yy
        Protect against the GRL if its a SELECT .. FOR UPDATE statement.
[3 Apr 2009 15:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71331

2726 Davi Arnaut	2009-04-03
      Bug#43230: SELECT ... FOR UPDATE can hang with FLUSH TABLES WITH READ LOCK indefinitely
      
      The problem is that a SELECT .. FOR UPDATE statement might open
      a table and later wait for a impeding global read lock without
      noticing whether it is holding a table that is being waited upon
      the the flush phase of the process that took the global read lock.
      
      The same problem also affected the following statements:
      
      LOCK TABLES .. WRITE UPDATE .. SET (update and multi-table update)
      LOAD DATA ..
      
      The solution is to make the above statements wait for a impending
      global read lock before opening the tables. If there is no
      impending global read lock, the statement raises a temporary
      protection against global read locks and progresses smoothly
      towards completion.
     @ mysql-test/r/lock_multi.result
        Add test case result for Bug#43230
     @ mysql-test/t/lock_multi.test
        Add test case for Bug#43230
     @ sql/sql_lex.cc
        Initialize flag.
     @ sql/sql_lex.h
        Add a flag to the lexer.
     @ sql/sql_parse.cc
        Wait for the global read lock is a write lock is going to be
        taken. The wait is done before opening tables.
     @ sql/sql_yacc.yy
        Protect against the GRL if its a SELECT .. FOR UPDATE or LOCK TABLES
        .. WRITE statement.
[3 Apr 2009 17:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71340

2726 Davi Arnaut	2009-04-03
      Bug#43230: SELECT ... FOR UPDATE can hang with FLUSH TABLES WITH
      READ LOCK indefinitely
      
      The problem is that a SELECT .. FOR UPDATE statement might open
      a table and later wait for a impeding global read lock without
      noticing whether it is holding a table that is being waited upon
      the the flush phase of the process that took the global read lock.
      
      The same problem also affected the following statements:
      
      LOCK TABLES .. WRITE
      UPDATE .. SET (update and multi-table update)
      LOAD DATA ..
      
      The solution is to make the above statements wait for a impending
      global read lock before opening the tables. If there is no
      impending global read lock, the statement raises a temporary
      protection against global read locks and progresses smoothly
      towards completion.
      
      Important notice: the patch does not try to address all possible
      cases, only those which are widespread and can be fixed unintrusively
      enough for 5.0.
     @ mysql-test/r/lock_multi.result
        Add test case result for Bug#43230
     @ mysql-test/t/lock_multi.test
        Add test case for Bug#43230
     @ sql/sql_delete.cc
        Wait for the global read lock before opening tables. Move from
        the caller because this function is also used by truncate table.
     @ sql/sql_lex.cc
        Initialize flag.
     @ sql/sql_lex.h
        Add a flag to the lexer.
     @ sql/sql_parse.cc
        Wait for the global read lock is a write lock is going to be
        taken. The wait is done before opening tables.
        
        Wait for GRL in DELETE is now done within mysql_delete.
     @ sql/sql_yacc.yy
        tect against the GRL if its a SELECT .. FOR UPDATE or LOCK TABLES
        .. WRITE statement.
[3 Apr 2009 18:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71342

2726 Davi Arnaut	2009-04-03
      Bug#43230: SELECT ... FOR UPDATE can hang with FLUSH TABLES WITH READ LOCK indefinitely
      
      The problem is that a SELECT .. FOR UPDATE statement might open
      a table and later wait for a impeding global read lock without
      noticing whether it is holding a table that is being waited upon
      the the flush phase of the process that took the global read lock.
      
      The same problem also affected the following statements:
      
      LOCK TABLES .. WRITE
      UPDATE .. SET (update and multi-table update)
      TRUNCATE TABLE ..
      LOAD DATA ..
      
      The solution is to make the above statements wait for a impending
      global read lock before opening the tables. If there is no
      impending global read lock, the statement raises a temporary
      protection against global read locks and progresses smoothly
      towards completion.
      
      Important notice: the patch does not try to address all possible
      cases, only those which are widespread and can be fixed unintrusively
      enough for 5.0.
     @ mysql-test/r/lock_multi.result
        Add test case result for Bug#43230
     @ mysql-test/t/lock_multi.test
        Add test case for Bug#43230
     @ sql/sql_lex.cc
        Initialize flag.
     @ sql/sql_lex.h
        Add a flag to the lexer.
     @ sql/sql_parse.cc
        Wait for the global read lock is a write lock is going to be
        taken. The wait is done before opening tables.
     @ sql/sql_yacc.yy
        Protect against the GRL if its a SELECT .. FOR UPDATE or LOCK TABLES
        .. WRITE statement.
[3 Apr 2009 19:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71348

2726 Davi Arnaut	2009-04-03
      Bug#43230: SELECT ... FOR UPDATE can hang with FLUSH TABLES WITH READ LOCK indefinitely
      
      The problem is that a SELECT .. FOR UPDATE statement might open
      a table and later wait for a impeding global read lock without
      noticing whether it is holding a table that is being waited upon
      the the flush phase of the process that took the global read
      lock.
      
      The same problem also affected the following statements:
      
      LOCK TABLES .. WRITE
      UPDATE .. SET (update and multi-table update)
      TRUNCATE TABLE ..
      LOAD DATA ..
      
      The solution is to make the above statements wait for a impending
      global read lock before opening the tables. If there is no
      impending global read lock, the statement raises a temporary
      protection against global read locks and progresses smoothly
      towards completion.
      
      Important notice: the patch does not try to address all possible
      cases, only those which are common and can be fixed unintrusively
      enough for 5.0.
     @ mysql-test/r/lock_multi.result
        Add test case result for Bug#43230
     @ mysql-test/t/lock_multi.test
        Add test case for Bug#43230
     @ sql/sql_lex.cc
        Initialize flag.
     @ sql/sql_lex.h
        Add a flag to the lexer.
     @ sql/sql_parse.cc
        Wait for the global read lock is a write lock is going to be
        taken. The wait is done before opening tables.
     @ sql/sql_yacc.yy
        Protect against the GRL if its a SELECT .. FOR UPDATE or LOCK TABLES
        .. WRITE statement.
[3 Apr 2009 21:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71362

3179 Davi Arnaut	2009-04-03 [merge]
      Merge Bug#43230 into mysql-6.0-bugteam
[3 Apr 2009 21:15] Davi Arnaut
Queued into 5.0-bugteam and up
[3 Apr 2009 21:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71365

2855 Davi Arnaut	2009-04-03 [merge]
      Merge Bug#43230 into mysql-5.1-bugteam
[5 Apr 2009 14:44] Davi Arnaut
One way to unhang the threads is to issue a "FLUSH TABLES" statement. This will cause
the "FLUSH TABLES WITH READ LOCK" to continue and a subsequent "UNLOCK TABLES" (which
releases the global read lock) will allow the "SELECT .. FOR UPDATE" statement to
progress.
[5 May 2009 18:52] Bugs System
Pushed into 5.0.82 (revid:davi.arnaut@sun.com-20090505184158-dvmedh8n472y8np5) (version source revid:davi.arnaut@sun.com-20090505184158-dvmedh8n472y8np5) (merge vers: 5.0.82) (pib:6)
[5 May 2009 19:40] Bugs System
Pushed into 5.1.35 (revid:davi.arnaut@sun.com-20090505190206-9xmh7dlc6kom8exp) (version source revid:davi.arnaut@sun.com-20090505190206-9xmh7dlc6kom8exp) (merge vers: 5.1.35) (pib:6)
[6 May 2009 14:09] Bugs System
Pushed into 6.0.12-alpha (revid:svoj@sun.com-20090506125450-yokcmvqf2g7jhujq) (version source revid:davi.arnaut@sun.com-20090403210707-fuwh3s3lb3utknrm) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 2:05] Paul DuBois
Noted in 5.0.82, 5.1.35, 6.0.12 changelogs.

Certain statements might open a table and then wait for an impending
global read lock without noticing whether they hold a table being
waiting for by the global read lock, causing a hang. Affected
statements are SELECT ... FOR UPDATE, LOCK TABLES ... WRITE, TRUNCATE
TABLE, and LOAD DATA INFILE.
[15 Jun 2009 8:27] Bugs System
Pushed into 5.1.35-ndb-6.3.26 (revid:jonas@mysql.com-20090615074202-0r5r2jmi83tww6sf) (version source revid:jonas@mysql.com-20090615070837-9pccutgc7repvb4d) (merge vers: 5.1.35-ndb-6.3.26) (pib:6)
[15 Jun 2009 9:06] Bugs System
Pushed into 5.1.35-ndb-7.0.7 (revid:jonas@mysql.com-20090615074335-9hcltksp5cu5fucn) (version source revid:jonas@mysql.com-20090615072714-rmfkvrbbipd9r32c) (merge vers: 5.1.35-ndb-7.0.7) (pib:6)
[15 Jun 2009 9:47] Bugs System
Pushed into 5.1.35-ndb-6.2.19 (revid:jonas@mysql.com-20090615061520-sq7ds4yw299ggugm) (version source revid:jonas@mysql.com-20090615054654-ebgpz7elwu1xj36j) (merge vers: 5.1.35-ndb-6.2.19) (pib:6)
[10 Jul 2009 23:19] Bugs System
Pushed into 5.1.37 (revid:build@mysql.com-20090710231213-9guqdu0avc0uwdkp) (version source revid:build@mysql.com-20090710231213-9guqdu0avc0uwdkp) (merge vers: 5.1.37) (pib:11)
[23 Jul 2009 10:24] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090723102221-ps4uaphwbxzj8p0q) (version source revid:joerg@mysql.com-20090721145751-rqqnhv0kage18wfi) (merge vers: 5.4.4-alpha) (pib:11)
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:33] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)