Bug #32585 concurrent DDL on partitioned tables causes threads to hang
Submitted: 21 Nov 2007 17:27
Modified: 24 Nov 2007 0:13
Reporter: Shane Bester (Platinum Quality Contributor)
Status: Can't repeat
Category: MySQL Server: Partitions
Severity: S1 (Critical)
Version: 5.1.23
OS: Mac OS X
Assigned to:
CPU Architecture: Any

[21 Nov 2007 17:27] Shane Bester
Description:
It seems that concurrent DDL is not handling locking correctly. The testcase quickly causes all threads to hang, looking like this (output shortened to fit):

mysql> show table status;
Empty set (0.00 sec)

mysql> show processlist;                                     
+----+------+-------------------+----------------------------
| Id | Time | State             | Info                       
+----+------+-------------------+----------------------------
|  3 |    4 |                   | NULL                       
|  4 |  174 | Opening tables    | insert into `t1` values (1)
|  5 |  175 | Waiting for table | insert into `t1` values (1)
|  6 |  175 | Waiting for table | insert into `t1` values (1)
|  7 |  175 | Waiting for table | insert into `t1` values (1)
|  8 |  174 | Waiting for table | insert into `t1` values (1)
|  9 |    0 | NULL              | show processlist           
+----+------+-------------------+----------------------------
7 rows in set (0.02 sec)                                     

sbester@www:~/server/5.1/tmp/mysql-5.1.23-rc-linux-i686/data/test> ll
total 8
-rw-rw----  1 sbester users    7 2007-11-21 19:00 t1#P#p0.MYD
-rw-rw----  1 sbester users 1024 2007-11-21 19:00 t1#P#p0.MYI
sbester@www:~/server/5.1/tmp/mysql-5.1.23-rc-linux-i686/data/test> 

CPU will be nearly maxed out during the hang, so something is spinning.
The threads are waiting for a table which doesn't exist!

How to repeat:
Use the attached C testcase to repeat the problem.  The testcase simply performs these commands in 5 threads:

drop table if exists `t1`;
create table `t1` (`id` int not null) partition by linear hash(`id`) partitions 2;
insert into `t1` values (1);
alter table `t1` coalesce partition 1;
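
The attached bug32585.c is not reproduced in this report, so the following is only a minimal sketch of the loop it describes: five threads, each cycling through the four statements above. The names `run_workers`, `worker`, `statement_fn` and `count_statement` are hypothetical. The hook shown here only counts calls; in a real run it would presumably send each statement over a per-thread connection, for example with `mysql_query()` from the MySQL C API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 5   /* thread count given in the report */

/* The four statements from the report, run in this order by every thread. */
static const char *const statements[] = {
    "drop table if exists `t1`",
    "create table `t1` (`id` int not null) "
    "partition by linear hash(`id`) partitions 2",
    "insert into `t1` values (1)",
    "alter table `t1` coalesce partition 1",
};
#define NUM_STATEMENTS (sizeof statements / sizeof statements[0])

/* Hypothetical hook: a real testcase would send `sql` to the server here
 * and ignore errors, since the threads race on the same table. */
typedef void (*statement_fn)(const char *sql, void *ctx);

struct worker_args {
    statement_fn run_statement;
    void *ctx;
    int iterations;
};

/* Thread body: repeat the whole statement sequence `iterations` times. */
static void *worker(void *p)
{
    const struct worker_args *args = p;
    for (int i = 0; i < args->iterations; i++)
        for (size_t s = 0; s < NUM_STATEMENTS; s++)
            args->run_statement(statements[s], args->ctx);
    return NULL;
}

/* Start NUM_THREADS workers and wait for all of them.
 * Returns 0 on success, -1 if a thread could not be created. */
static int run_workers(statement_fn fn, void *ctx, int iterations)
{
    pthread_t threads[NUM_THREADS];
    struct worker_args args = { fn, ctx, iterations };
    int started = 0, rc = 0;

    assert(fn != NULL && iterations >= 0);
    for (; started < NUM_THREADS; started++) {
        if (pthread_create(&threads[started], NULL, worker, &args) != 0) {
            rc = -1;
            break;
        }
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return rc;
}

/* Stand-in hook for a dry run: counts statements under a mutex. */
struct counter {
    pthread_mutex_t lock;
    long calls;
};

static void count_statement(const char *sql, void *ctx)
{
    struct counter *c = ctx;
    (void)sql;
    pthread_mutex_lock(&c->lock);
    c->calls++;
    pthread_mutex_unlock(&c->lock);
}
```

Called as `run_workers(count_statement, &c, 10)` with `struct counter c = { PTHREAD_MUTEX_INITIALIZER, 0 }`, the dry run issues 5 x 4 x 10 = 200 statements. Only a hook that actually talks to the server would exercise the locking described here.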

Suggested fix:
fix the locking
[21 Nov 2007 17:45] Shane Bester
testcase. run for some seconds and monitor the processlist for hung queries.

Attachment: bug32585.c (text/plain), 5.19 KiB.
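
One way to make "monitor the processlist for hung queries" concrete is to flag rows that have spent a long time in a table-wait state. The sketch below assumes the rows have already been fetched into a hypothetical `struct proc_row`. The two state strings are the ones seen in the hang above; the threshold is chosen by the caller, and neither is anything the server itself defines as "hung".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one SHOW PROCESSLIST row, limited to the
 * columns needed here. */
struct proc_row {
    unsigned long id;
    long time_s;         /* Time column, in seconds */
    const char *state;   /* State column; may be NULL or empty */
};

/* Treat a row as hung if it has spent at least threshold_s seconds in one
 * of the states observed in the report's hang. */
static int looks_hung(const struct proc_row *row, long threshold_s)
{
    if (row->state == NULL || row->time_s < threshold_s)
        return 0;
    return strcmp(row->state, "Waiting for table") == 0 ||
           strcmp(row->state, "Opening tables") == 0;
}

/* Count the rows that look hung under the given threshold. */
static size_t count_hung(const struct proc_row *rows, size_t n,
                         long threshold_s)
{
    size_t hung = 0;

    assert(rows != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        hung += (size_t)looks_hung(&rows[i], threshold_s);
    return hung;
}
```

Fed the seven rows from the processlist output in the description with a threshold of 60 seconds, `count_hung` flags the five insert threads (Ids 4 to 8) and skips the idle connection and the `show processlist` session itself.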

[21 Nov 2007 17:46] Shane Bester
some thread stack traces when they are hung.

Attachment: bug32585_thread_stacks.txt (text/plain), 11.58 KiB.

[21 Nov 2007 17:46] Shane Bester
the testcase actually caused a crash too. here's the stack trace for that.

Attachment: bug32585_crash_stack_trace.txt (text/plain), 1.60 KiB.

[21 Nov 2007 17:50] Shane Bester
Sources built from public mysql-5.1 BK tree:
ChangeSet@1.2620, 2007-11-15 12:31:40+01:00, tnurnberg@white.intern.koehntopp.de
[22 Nov 2007 10:12] Sveta Smirnova
I can repeat only the crash, and even without using partitions:

drop table if exists `t1`;

Version 5.0 works fine.
[22 Nov 2007 19:09] Sveta Smirnova
Crash verified on Mac OS X 10.4 Intel with ChangeSet@1.2594.1.1, 2007-10-25 13:17:44+02:00, joerg@trift2. +1 -0

Partitions are not needed to repeat the crash. A DROP TABLE IF EXISTS ... statement alone is enough.

On other platforms the bug is not repeatable.
[23 Nov 2007 16:02] Mikael Ronström
Sveta, can you please expand on what you mean by the statement that you don't need
partitioning to cause the crash?
What's the minimal test case that causes the crash, and does it involve any partitioning?
[24 Nov 2007 0:13] Sveta Smirnova
Mikael,

below is the diff:

$diff -u bug32585.c bug32585_modified.c 
--- bug32585.c  2007-11-24 01:37:14.000000000 +0300
+++ bug32585_modified.c 2007-11-24 01:37:57.000000000 +0300
@@ -152,27 +152,6 @@
                                c=shortquery;
                                c+=sprintf(c,"%s","drop table if exists `t1`");
                                db_query(dbc,shortquery,0);
-
-
-
-                               c=shortquery;
-                               c+=sprintf(c,"%s","create table `t1` (`id` int not null) partition by linear hash(`id`) partitions 2");
-                               db_query(dbc,shortquery,0);
-
-
-
-                               c=shortquery;
-                               c+=sprintf(c,"%s","insert into `t1` values (1)");
-                               db_query(dbc,shortquery,0);
-
-
-
-                               c=shortquery;
-                               c+=sprintf(c,"%s","alter table `t1` coalesce partition 1");
-                               db_query(dbc,shortquery,0);
-        
-
-
        }
 threadexit:
        mysql_close(dbc);

But I tried to repeat it again today, and the bug is not repeatable with ChangeSet@1.2634.1.2, 2007-11-21 19:42:50+01:00, df@pippilotta.erinye.com +1 -0. So I closed the report as "Can't repeat".