Bug #31479 Bad lock interaction if CREATE TABLE LIKE is killed
Submitted: 9 Oct 2007 13:10 Modified: 18 Dec 2007 21:17
Reporter: Baron Schwartz (Basic Quality Contributor)
Status: Closed
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:5.0.38, 4.1, 5.0 BK OS:Linux
Assigned to: Davi Arnaut CPU Architecture:Any
Tags: CREATE TABLE LIKE, KILL, qc
Triage: D3 (Medium)

[9 Oct 2007 13:10] Baron Schwartz
Description:
MySQL server locking is not as described in the documentation and can easily cause deadlock on non-transactional tables depending on whether a transaction is active.

The locking section of the manual is contradictory so it is impossible to tell what the real behavior of locking is.

How to repeat:
CREATE TABLE `t1` (
  `a` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO t1(a) VALUES(1);

---- Connection ID 6:
SET AUTOCOMMIT=1;
COMMIT; -- Just to be sure
lock tables t1 read;

---- Connection ID 11:
create table t2 like t1;

---- Connection ID 13:
 show processlist\G
*************************** 1. row ***************************
     Id: 6
   User: root
   Host: localhost
     db: test
Command: Sleep
   Time: 45
  State: 
   Info: NULL
*************************** 2. row ***************************
     Id: 11
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 28
  State: Waiting for table
   Info: create table t2 like t1
*************************** 4. row ***************************
     Id: 13
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
4 rows in set (0.00 sec)

-- Connection ID 12:
select * from t1; -- Waits forever

-- Connection ID 13:
show processlist\G
*************************** 1. row ***************************
     Id: 6
   User: root
   Host: localhost
     db: test
Command: Sleep
   Time: 45
  State: 
   Info: NULL
*************************** 2. row ***************************
     Id: 11
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 28
  State: Waiting for table
   Info: create table t2 like t1
*************************** 3. row ***************************
     Id: 12
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 9
  State: Waiting for table
   Info: select * from t1
*************************** 4. row ***************************
     Id: 13
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
4 rows in set (0.00 sec)

At this point the SELECT on t1 is also blocked, but it should not be; it should be able to continue.  It is blocked because the CREATE TABLE LIKE is blocked.  Let's see what happens if I kill that connection:

-- Connection ID 13:
mysql> kill 11;
Query OK, 0 rows affected (0.00 sec)
mysql> show processlist\G
*************************** 1. row ***************************
     Id: 6
   User: root
   Host: localhost
     db: test
Command: Sleep
   Time: 366
  State: 
   Info: NULL
*************************** 2. row ***************************
     Id: 12
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 788
  State: Waiting for table
   Info: select * from t1
*************************** 3. row ***************************
     Id: 13
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
3 rows in set (0.00 sec)

The SELECT is still blocked and will never complete until thread 6 issues UNLOCK TABLES.  This is very bad.

-- Connection ID 6:
mysql> UNLOCK TABLES;
Query OK, 0 rows affected (0.00 sec)

-- Connection ID 12:
+------+
| a    |
+------+
|    1 | 
+------+

-- Connection ID 11:
ERROR 1053 (08S01): Server shutdown in progress

---------------------------
THIS DOES NOT HAPPEN if connection 6 begins a transaction before locking the table.  In this case, 11 will hang but 12 will not have to wait, which I think is correct behavior.

Suggested fix:
Connection 12 should not have to wait just because connection 11 is in "Waiting for table" status.
[10 Oct 2007 21:57] Sveta Smirnova
Thank you for the report.

Verified as described. Versions since 5.1 are not affected.
[21 Nov 2007 13:45] Konstantin Osipov
Davi, please investigate and reassign to the docs team if the code behaves correctly.
[22 Nov 2007 10:54] Davi Arnaut
Starting from a brief analysis of the test case, it seems this bug
hits three problems:

1) Alter table takes a name-lock

On 5.0 the alter table function needs to acquire an exclusive name-lock
on the source table to ensure that no concurrent DDL operations will
modify it, which unfortunately also prevents other threads from accessing
it. mysql_create_like_table() contains a detailed explanation of
why it is done this way. Fixed in 5.1 by Bug#23667.

How to repeat:

connection c1;
lock tables t1 read;
connection c2;
--send create table t2 like t1;
connection c3;
--send select * from t1;

2) Kill of waiting name-lock (create table like) crashes server

mysql_create_like_table() has a bug: if the connection is
killed while waiting for the name-lock on the source table, it
jumps to the wrong error path and tries to unlock the source table and
LOCK_open, neither of which was locked.

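mysql_create_like_table():
  /* Branch taken when the wait is killed; no lock is held at this point. */
  if (lock_and_wait_for_table_name(thd, src_table))
    goto err;

  pthread_mutex_lock(&LOCK_open);

err:
  /* Bug: also reached from the killed wait above, where neither lock is held. */
  unlock_table_name(thd, src_table);
  pthread_mutex_unlock(&LOCK_open);
  DBUG_RETURN(res);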

How to repeat:

connection c1;
lock tables t1 read;
connection c2;
--send create table t2 like t1;
connection c3;
--send select * from t1;
connection default;
kill 3;

3) Kill of a pending name-lock doesn't broadcast refresh

When a pending name-lock on the source table is cancelled,
other threads waiting for (read) locks on the table are not notified
that they should try again to grab the lock. They keep waiting
until the connection that issued the first LOCK TABLES unlocks the
table, which broadcasts a refresh.

How to repeat:

connection c1;
lock tables t1 read;
connection c2;
--send create table t2 like t1;
connection c3;
--send select * from t1;
connection default;
kill 3;
connection c3;
--reap
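
For illustration only, here is a minimal sketch of the notification pattern that problem 3 says is missing: when a pending exclusive request is cancelled, the canceller broadcasts so that waiting readers re-check the condition. This is not the server's code. All names (pending_exclusive, refresh, reader, cancel_pending_exclusive, run_demo) are hypothetical, and a plain pthread condition variable stands in for the server's refresh mechanism.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_READERS 8

/* Hypothetical stand-ins; these are not MySQL's real structures. */
static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t refresh = PTHREAD_COND_INITIALIZER;
static bool pending_exclusive = false;  /* an exclusive request is queued */
int readers_done = 0;                   /* readers that got past the wait */

/* A reader waits for as long as an exclusive request is pending. */
static void *reader(void *arg)
{
  (void) arg;
  pthread_mutex_lock(&mu);
  while (pending_exclusive)
    pthread_cond_wait(&refresh, &mu);
  readers_done++;
  pthread_mutex_unlock(&mu);
  return NULL;
}

/* Cancels the pending request and wakes every waiter so it re-checks. */
void cancel_pending_exclusive(void)
{
  pthread_mutex_lock(&mu);
  pending_exclusive = false;
  pthread_cond_broadcast(&refresh);  /* without this, waiters stay blocked */
  pthread_mutex_unlock(&mu);
}

/* Starts n_readers readers behind a pending request, then cancels it. */
int run_demo(int n_readers)
{
  pthread_t tids[MAX_READERS];
  int i;

  assert(n_readers >= 0 && n_readers <= MAX_READERS);
  pending_exclusive = true;  /* safe: no reader threads exist yet */
  readers_done = 0;

  for (i = 0; i < n_readers; i++) {
    int rc = pthread_create(&tids[i], NULL, reader, NULL);
    assert(rc == 0);
    (void) rc;
  }
  cancel_pending_exclusive();
  for (i = 0; i < n_readers; i++)
    pthread_join(tids[i], NULL);
  return readers_done;
}
```

Calling run_demo(3) starts three readers behind a pending request and returns once all of them have proceeded. If the broadcast is removed, any reader already blocked in pthread_cond_wait() stays there, which mirrors the hang reported for connection 12.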
[23 Nov 2007 16:59] Davi Arnaut
After some internal discussions:

1) Requires risky changes that are not worth making on 5.0; it is fixed in 5.1+.
2) To be fixed.
3) Requires risky changes, to be addressed once the new MDL is implemented.
[28 Nov 2007 12:11] Davi Arnaut
The problem being fixed in the patch for this bug is a bad lock interaction if CREATE TABLE LIKE is killed while waiting for a name-lock on the source table. The transactional context doesn’t make any difference.

Moreover, expanding my comments:

1) CREATE TABLE LIKE takes a name-lock on the source table

"On 5.0 the CREATE TABLE LIKE function needs to acquire an exclusive name-lock .."

It's CREATE TABLE LIKE, not alter table.

Please note that this bug is fixed in 5.1

3) This problem is NOT going to be addressed in this bug. It's going to be fixed within the scope of Bug#30701
[28 Nov 2007 12:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/38692

ChangeSet@1.2588, 2007-11-28 10:18:01-02:00, davi@mysql.com +3 -0
  Bug#31479 Bad lock interaction if CREATE TABLE LIKE is killed
  
  Kill of a CREATE TABLE ... LIKE source_table statement waiting for a
  name-lock on the source table causes a bad lock interaction.
  
  The mysql_create_like_table() has a bug that if the connection is
  killed while waiting for the name-lock on the source table, it will
  jump to the wrong error path and try to unlock the source table and
  LOCK_open, but both weren't locked.
  
  The solution is to simply return when the name lock request is killed;
  it is safe to do so because no lock was acquired and no cleanup is needed.
  
  Original bug report also contains descriptions of other problems
  related to this scenario, but they are either already fixed in 5.1 or
  will be addressed separately (see bug report for details).
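
The early-return idea in this changeset can be sketched as a small standalone C fragment. This is an illustration under stated assumptions, not the actual patch: wait_for_name_lock, release_name_lock, create_like_sketch, lock_open, name_lock_held and bad_unlocks are hypothetical stand-ins, and the real server functions take different arguments.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for server state; not MySQL's real structures. */
static pthread_mutex_t lock_open = PTHREAD_MUTEX_INITIALIZER;
bool name_lock_held = false;
int bad_unlocks = 0;  /* counts releases of a name lock that was not held */

/* Simulates waiting for a name lock; returns nonzero if the wait was killed. */
static int wait_for_name_lock(bool killed)
{
  if (killed)
    return 1;
  name_lock_held = true;
  return 0;
}

/* Releases the name lock and records the error if it was never held. */
static void release_name_lock(void)
{
  if (!name_lock_held)
    bad_unlocks++;
  name_lock_held = false;
}

/* Returns 0 on success, 1 if the wait for the name lock was killed. */
int create_like_sketch(bool killed)
{
  if (wait_for_name_lock(killed))
    return 1;  /* nothing was acquired, so there is nothing to clean up */

  assert(name_lock_held);
  pthread_mutex_lock(&lock_open);
  /* ... the work done while holding both locks would go here ... */
  pthread_mutex_unlock(&lock_open);
  release_name_lock();
  return 0;
}
```

In this sketch the killed path never reaches the cleanup code, so bad_unlocks stays at zero whether or not the wait is killed. Jumping to a shared cleanup label from the killed path would instead release a name lock and a mutex that were never taken.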
[4 Dec 2007 11:36] Davi Arnaut
Bug#32945 has been marked as a duplicate of this bug.
[6 Dec 2007 9:55] Bugs System
Pushed into 5.0.54
[6 Dec 2007 9:59] Bugs System
Pushed into 5.1.23-rc
[6 Dec 2007 10:01] Bugs System
Pushed into 6.0.5-alpha
[13 Dec 2007 5:43] Dmitry Lenev
Bug #33186 was marked as duplicate of this bug.
[18 Dec 2007 21:17] Paul Dubois
Noted in 5.0.54 changelog.

Killing a CREATE TABLE ... LIKE statement that was waiting for a name
lock caused a server crash. When the statement was killed, the server
attempted to release locks that were not held.