Bug #33480 (Falcon) deadlock on inserts into a partitioned auto_increment table
Submitted: 22 Dec 2007 14:44 Modified: 3 May 2008 1:06
Reporter: Philip Stoev
Status: Closed
Category:Server: Falcon Severity:S3 (Non-critical)
Version:6.0.4 OS:Any
Assigned to: Kevin Lewis Target Version:6.0.4
Triage: D2 (Serious) / R2 (Low) / E2 (Low)

[22 Dec 2007 14:44] Philip Stoev
Description:
If, on a Falcon table with an auto_increment field, partitioned by KEY() on that field,
two threads compete in inserting records, deadlock will occur.

It will be impossible to shutdown the server once this happens, and if kill -9 is used,
the table will remain corrupted and dropping it will cause a crash.

mysql> show processlist\G
*************************** 1. row ***************************
     Id: 11
   User: root
   Host: localhost:39157
     db: test
Command: Query
   Time: 301
  State: Opening tables
   Info: INSERT INTO t1 (i1) VALUES (FLOOR(rand()*255))
*************************** 2. row ***************************
     Id: 12
   User: root
   Host: localhost:39158
     db: test
Command: Query
   Time: 301
  State: update
   Info: INSERT INTO t1 (i1) VALUES (FLOOR(rand()*255))

How to repeat:
Step 1. Create the following table:

Step 2. Execute the following query repeatedly in two separate threads:

A Gypsy script is available to reproduce the problem, the command line for running it
is:

./gypsy --host=127.0.0.1 --user=root --database=test --duration=180 --threads=2
--queryfile=insert-simple.gypsy --log=4 --sqlout=1

Instructions for running Gypsy are available at 

https://intranet.mysql.com/secure/support/KB:Database_Utils:nonprepared#Gypsy

however a simple Perl script will also suffice.

it takes just a few seconds for the server to stall.
[22 Dec 2007 14:45] Philip Stoev
Gypsy script for reproducing bug #33480

Attachment: bug33480.gypsy (text/plain), 232 bytes.

[22 Dec 2007 14:53] Philip Stoev
Innodb with Partitioning is also affected, MyISAM is not. All table types without
partitioning are not affected.

Please set the correct category for this bug -- Falcon appears to be affected more since
deadlock will be permanent unless a falcon timeout value is configured.
[22 Dec 2007 16:17] Philip Stoev
Falcon with --falcon-lock-timeout=500 allows queries to proceed with "mysql error Lock
wait timeout exceeded; try restarting transaction, (1205)", however after a while the
deadlock becomes permanent and no further queries can proceed. Kill inside the mysql
client does not help and mysqladmin shutdown hangs.

By contrast, Falcon with --innodb_lock_wait_timeout=1 continues to function and shutdown
is not a problem.
[23 Dec 2007 13:42] Philip Stoev
I meant that Innodb with a lock timeout does not exhibit that behavoir.
[23 Dec 2007 16:43] Shane Bester
philip, use gyspy's --generate-testcase=1 option to have a testcase.c file written in the
current directory.  this testcase.c will be more or less equivalent testcase, so entire
gypsy is not needed for the bug report.
[23 Dec 2007 19:03] Philip Stoev
Test case for bug 33480 -- compile instructions are in file's header

Attachment: bug33480.c (text/plain), 4.83 KiB.

[23 Dec 2007 19:04] Philip Stoev
Shane, this is awesome!

I have attached a .c file generated from the Gypsy test case. GCC comand line used to
compile is in the header of the file (make sure you adjust the include and lib paths).

The binary testcase will not produce output, so please use SHOW PROCESSLIST to observe.
[26 Dec 2007 17:46] Kevin Lewis
Triage values D2/w3/I3 - The potential deadlock comes when the application uses a
combination of null autoinc assignments and specified autoinc values.  See Bug#33079 for
a similar issue with the same test.  The first workaround is to use falcon_lock_timeout
not equal zero.  The second workaround is to not use partitions on an autoinc field,
which only randomizes the data into multiple partitions. It is unnecessary.  The impact
is not widespread.
[26 Dec 2007 19:49] Philip Stoev
Kevin, please note that all assigments in the test case are to specific values, no
NULL-assigments to the auto_inc column are used.
[26 Dec 2007 20:00] Kevin Lewis
The main insert statement; INSERT INTO t1 (i1) VALUES (FLOOR(rand()*255))
will result in a value of zero one out of 255 times, on average.  This causes the autoinc
assignment to happen as if the value inserted was null.
[28 Dec 2007 0:44] Christopher Powers
The deadlock is caused by the mutex in ha_partition::write_row() (see Bug#27405
"Partitioning InnoDB auto_increment")

Transaction 73 (thread 18) is waiting on the mutex in ha_partition::write_row(), 
which is held by transaction 74.

Transaction 74 (thread 17) is waiting for transaction 73, hence the deadlock.

Thread [18] (Suspended)
 15 __lll_mutex_lock_wait()  0x00002adc99b548c8 
 14 _L_mutex_lock_946()  0x00002adc99b50a06 
 13 pthread_mutex_lock()  0x00002adc99b5080b 
 12 ha_partition::write_row() ../libmysqld/ha_partition.cc:2727 0x00000000007a1fc5 
 11 handler::ha_write_row() ../libmysqld/handler.cc:4600 0x00000000007966b0 
 10 write_record() ../libmysqld/sql_insert.cc:1548 0x0000000000722dbb 
 9 mysql_insert() ../libmysqld/sql_insert.cc:803 0x0000000000726849 
 8 mysql_execute_command() ../libmysqld/sql_parse.cc:2658 0x000000000069c61c 
 7 mysql_parse() ../libmysqld/sql_parse.cc:5410 0x00000000006a1e56 
 6 dispatch_command() ../libmysqld/sql_parse.cc:948 0x00000000006a29b6 
 5 do_command() ../libmysqld/sql_parse.cc:697 0x00000000006a3940 
 4 handle_one_connection() ../libmysqld/sql_connect.cc:1146 0x000000000069435a 
 3 start_thread()  0x00002adc99b4e020 
 2 clone()  0x00002adc9b005f8d 
 1 <symbol is not available> 0x0000000000000000 

Thread [17] (Suspended: Breakpoint hit.) 
 25 SyncObject::wait() ../storage/falcon/SyncObject.cpp:383 0x000000000085d4c6 
 24 SyncObject::lock() ../storage/falcon/SyncObject.cpp:259 0x000000000085d8c1 
 23 Sync::lock() ../storage/falcon/Sync.cpp:66 0x000000000085c17f 
 22 Transaction::waitForTransaction() ../storage/falcon/Transaction.cpp:974
0x0000000000870500 
 21 Transaction::getRelativeState() ../storage/falcon/Transaction.cpp:841
0x0000000000870665 
 20 Transaction::getRelativeState() ../storage/falcon/Transaction.cpp:786
0x000000000087072e 
 19 Table::checkUniqueRecordVersion() ../storage/falcon/Table.cpp:2463 0x0000000000865331

 18 Table::checkUniqueIndex() ../storage/falcon/Table.cpp:2337 0x00000000008656a1 
 17 Table::checkUniqueIndexes() ../storage/falcon/Table.cpp:2311 0x0000000000865813 
 16 Table::insert() ../storage/falcon/Table.cpp:2874 0x0000000000867d72 
 15 StorageDatabase::insert() ../storage/falcon/StorageDatabase.cpp:226
0x0000000000850633 
 14 StorageTable::insert() ../storage/falcon/StorageTable.cpp:122 0x000000000085691b 
 13 StorageInterface::write_row() ../storage/falcon/ha_falcon.cpp:977 0x000000000084a099

 12 ha_partition::write_row() ../libmysqld/ha_partition.cc:2763 0x00000000007a20b1 
 11 handler::ha_write_row() ../libmysqld/handler.cc:4600 0x00000000007966b0 
 10 write_record() ../libmysqld/sql_insert.cc:1548 0x0000000000722dbb 
 9 mysql_insert() ../libmysqld/sql_insert.cc:803 0x0000000000726849 
 8 mysql_execute_command() ../libmysqld/sql_parse.cc:2658 0x000000000069c61c 
 7 mysql_parse() ../libmysqld/sql_parse.cc:5410 0x00000000006a1e56 
 6 dispatch_command() ../libmysqld/sql_parse.cc:948 0x00000000006a29b6 
 5 do_command() ../libmysqld/sql_parse.cc:697 0x00000000006a3940 
 4 handle_one_connection() ../libmysqld/sql_connect.cc:1146 0x000000000069435a 
 3 start_thread()  0x00002adc99b4e020 
 2 clone()  0x00002adc9b005f8d 
 1 <symbol is not available> 0x0000000000000000
[30 Dec 2007 5:23] Christopher Powers
Same problem with different scenario in Bug#33158, "Server hangs, no new connections are
accepted".
[30 Dec 2007 10:02] Philip Stoev
Actually the duplicate bug is bug #33518.
[3 Jan 2008 16:12] Kevin Lewis
Mattias,  Mikael R suggested that we need to use the mutex in ha_partition::write_row()
(see Bug#27405) only if the storage engine is InnoDB.  Falcon needs to handle concurrency
itself.
[9 Jan 2008 5:26] Kevin Lewis
Assigning this bug back to the Falcon engine.  

Mattias provided a very simple change to ha_partition.cc, ~line 2710, in function
ha_partition::write_row();
-   if (table_share->tmp_table == NO_TMP_TABLE)
+   if (table_share->tmp_table == NO_TMP_TABLE && m_innodb)

This prevents the serialization in the partition engine for autoincrement fields.  But
then he found that there are still deadlocks/hangs.  

I ran the gypsy test with 5 threads and found that there is indeed an unhandled deadlock
in the Falcon engine.  This happens in Table::checkUniqueRecordVersion() where a real
transactional deadlock is detected, but an exception is not thrown.  Instead, the call to
Table::checkUniqueIndexes() is restarted as if it was just a wait, not a deadlock.
[9 Jan 2008 6:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40733

ChangeSet@1.2773, 2008-01-08 23:22:10-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
  
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUnigueIndexes().  This can also cause the serial log
  to grow indefinitely.
[9 Jan 2008 6:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40735

ChangeSet@1.2780, 2008-01-08 23:40:07-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
  
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUnigueIndexes().  This can also cause the serial log
  to grow indefinitely.
[11 Jan 2008 18:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40919

ChangeSet@1.2780, 2008-01-11 11:03:27-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
    
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUnigueIndexes().  This can also cause the serial log
  to grow indefinitely.
[11 Jan 2008 18:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40924

ChangeSet@1.2781, 2008-01-11 11:22:52-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
    
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUnigueIndexes().  This can also cause the serial log
  to grow indefinitely.
[11 Feb 2008 21:44] Kevin Lewis
Patch is in mysql-6.0-falcon-team and mysql-6.0-release version 6.0.4
[13 Mar 2008 0:02] Bugs System
Pushed into 6.0.4-alpha
[3 May 2008 1:06] Paul DuBois
Noted in 6.0.4 changelog.

Multiple simultaneous inserts into a partitioned Falcon table could
deadlock.