Bug #33480 (Falcon) deadlock on inserts into a partitioned auto_increment table
Submitted: 22 Dec 2007 13:44 Modified: 2 May 2008 23:06
Reporter: Philip Stoev
Status: Closed
Category: MySQL Server: Falcon storage engine
Severity: S3 (Non-critical)
Version: 6.0.4
OS: Any
Assigned to: Kevin Lewis
CPU Architecture: Any

[22 Dec 2007 13:44] Philip Stoev
Description:
If two threads compete to insert records into a Falcon table that has an auto_increment field and is partitioned by KEY() on that field, a deadlock will occur.

Once this happens it is impossible to shut down the server, and if kill -9 is used, the table is left corrupted and dropping it causes a crash.

mysql> show processlist\G
*************************** 1. row ***************************
     Id: 11
   User: root
   Host: localhost:39157
     db: test
Command: Query
   Time: 301
  State: Opening tables
   Info: INSERT INTO t1 (i1) VALUES (FLOOR(rand()*255))
*************************** 2. row ***************************
     Id: 12
   User: root
   Host: localhost:39158
     db: test
Command: Query
   Time: 301
  State: update
   Info: INSERT INTO t1 (i1) VALUES (FLOOR(rand()*255))

How to repeat:
Step 1. Create the following table:

Step 2. Execute the following query repeatedly in two separate threads:
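The table definition and query were not preserved in the report. A plausible reconstruction, based on the description (auto_increment column, partitioned by KEY() on that column) and the statement visible in the processlist, would be the following; the column name i1 and table name t1 come from the processlist, while the exact types and partition count are assumptions:

```sql
-- Reconstruction, not the reporter's original DDL: a Falcon table whose
-- AUTO_INCREMENT column is also the partitioning key.
CREATE TABLE t1 (
  i1 INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (i1)
) ENGINE=Falcon
PARTITION BY KEY (i1) PARTITIONS 2;

-- Step 2: run this concurrently from two connections.
INSERT INTO t1 (i1) VALUES (FLOOR(RAND()*255));
```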

A Gypsy script is available to reproduce the problem, the command line for running it is:

./gypsy --host=127.0.0.1 --user=root --database=test --duration=180 --threads=2 --queryfile=insert-simple.gypsy --log=4 --sqlout=1

Instructions for running Gypsy are available at 

https://intranet.mysql.com/secure/support/KB:Database_Utils:nonprepared#Gypsy

however, a simple Perl script will also suffice.

It takes just a few seconds for the server to stall.
[22 Dec 2007 13:45] Philip Stoev
Gypsy script for reproducing bug #33480

Attachment: bug33480.gypsy (text/plain), 232 bytes.

[22 Dec 2007 13:53] Philip Stoev
InnoDB with partitioning is also affected; MyISAM is not. Without partitioning, no storage engine is affected.

Please set the correct category for this bug -- Falcon appears to be affected more severely, since the deadlock is permanent unless a Falcon lock timeout value is configured.
[22 Dec 2007 15:17] Philip Stoev
Falcon with --falcon-lock-timeout=500 allows queries to proceed with "mysql error Lock wait timeout exceeded; try restarting transaction, (1205)", however after a while the deadlock becomes permanent and no further queries can proceed. Kill inside the mysql client does not help and mysqladmin shutdown hangs.

By contrast, Falcon with --innodb_lock_wait_timeout=1 continues to function and shutdown is not a problem.
[23 Dec 2007 12:42] Philip Stoev
I meant that InnoDB with a lock timeout does not exhibit that behavior.
[23 Dec 2007 15:43] MySQL Verification Team
Philip, use Gypsy's --generate-testcase=1 option to have a testcase.c file written in the current directory. This testcase.c will be a more or less equivalent test case, so the entire Gypsy tool is not needed for the bug report.
[23 Dec 2007 18:03] Philip Stoev
Test case for bug 33480 -- compile instructions are in file's header

Attachment: bug33480.c (text/plain), 4.83 KiB.

[23 Dec 2007 18:04] Philip Stoev
Shane, this is awesome!

I have attached a .c file generated from the Gypsy test case. The GCC command line used to compile it is in the header of the file (make sure you adjust the include and lib paths).

The binary testcase will not produce output, so please use SHOW PROCESSLIST to observe.
[26 Dec 2007 16:46] Kevin Lewis
Triage values D2/w3/I3 - The potential deadlock arises when the application uses a combination of null autoinc assignments and specified autoinc values.  See Bug#33079 for a similar issue with the same test.  The first workaround is to set falcon_lock_timeout to a nonzero value.  The second workaround is to not partition on an autoinc field; doing so only randomizes the data across multiple partitions and is unnecessary.  The impact is not widespread.
[26 Dec 2007 18:49] Philip Stoev
Kevin, please note that all assignments in the test case are to specific values; no NULL assignments to the auto_inc column are used.
[26 Dec 2007 19:00] Kevin Lewis
The main insert statement, INSERT INTO t1 (i1) VALUES (FLOOR(rand()*255)),
will result in a value of zero one out of 255 times, on average. A zero causes the autoinc assignment to happen as if the inserted value had been NULL.
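The arithmetic above can be checked with a quick simulation (a sketch, not part of the bug report): FLOOR(rand()*255) yields an integer in 0..254, so zero turns up about once in 255 inserts, and under the default sql_mode MySQL treats an explicit 0 for an AUTO_INCREMENT column like NULL, triggering auto-assignment.

```python
import random

# Simulate FLOOR(rand()*255): integers 0..254, each about 1/255 of the time.
random.seed(1)
N = 1_000_000
zeros = sum(1 for _ in range(N) if int(random.random() * 255) == 0)
print(zeros / N)  # close to 1/255, roughly 0.0039
```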
[27 Dec 2007 23:44] Christopher Powers
The deadlock is caused by the mutex in ha_partition::write_row() (see Bug#27405 "Partitioning InnoDB auto_increment")

Transaction 73 (thread 18) is waiting on the mutex in ha_partition::write_row(), 
which is held by transaction 74.

Transaction 74 (thread 17) is waiting for transaction 73, hence the deadlock.
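
The circular wait described above can be sketched as a wait-for graph (node names are illustrative; "txn74" holds the ha_partition::write_row() mutex that "txn73" needs, while waiting on "txn73" itself). Following the edges and revisiting a node means deadlock:

```python
# Minimal wait-for-graph sketch of the deadlock described above.
waits_for = {
    "txn73": "txn74",  # thread 18: blocked on the partition mutex held by txn74
    "txn74": "txn73",  # thread 17: waiting for txn73 to commit or roll back
}

def has_cycle(graph, start):
    """Follow wait-for edges from `start`; revisiting a node means deadlock."""
    seen, node = set(), start
    while node in graph:
        if node in seen:
            return True
        seen.add(node)
        node = graph[node]
    return False

print(has_cycle(waits_for, "txn73"))  # True: neither transaction can proceed
```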

Thread [18] (Suspended)
 15 __lll_mutex_lock_wait()  0x00002adc99b548c8 
 14 _L_mutex_lock_946()  0x00002adc99b50a06 
 13 pthread_mutex_lock()  0x00002adc99b5080b 
 12 ha_partition::write_row() ../libmysqld/ha_partition.cc:2727 0x00000000007a1fc5 
 11 handler::ha_write_row() ../libmysqld/handler.cc:4600 0x00000000007966b0 
 10 write_record() ../libmysqld/sql_insert.cc:1548 0x0000000000722dbb 
 9 mysql_insert() ../libmysqld/sql_insert.cc:803 0x0000000000726849 
 8 mysql_execute_command() ../libmysqld/sql_parse.cc:2658 0x000000000069c61c 
 7 mysql_parse() ../libmysqld/sql_parse.cc:5410 0x00000000006a1e56 
 6 dispatch_command() ../libmysqld/sql_parse.cc:948 0x00000000006a29b6 
 5 do_command() ../libmysqld/sql_parse.cc:697 0x00000000006a3940 
 4 handle_one_connection() ../libmysqld/sql_connect.cc:1146 0x000000000069435a 
 3 start_thread()  0x00002adc99b4e020 
 2 clone()  0x00002adc9b005f8d 
 1 <symbol is not available> 0x0000000000000000 

Thread [17] (Suspended: Breakpoint hit.) 
 25 SyncObject::wait() ../storage/falcon/SyncObject.cpp:383 0x000000000085d4c6 
 24 SyncObject::lock() ../storage/falcon/SyncObject.cpp:259 0x000000000085d8c1 
 23 Sync::lock() ../storage/falcon/Sync.cpp:66 0x000000000085c17f 
 22 Transaction::waitForTransaction() ../storage/falcon/Transaction.cpp:974 0x0000000000870500 
 21 Transaction::getRelativeState() ../storage/falcon/Transaction.cpp:841 0x0000000000870665 
 20 Transaction::getRelativeState() ../storage/falcon/Transaction.cpp:786 0x000000000087072e 
 19 Table::checkUniqueRecordVersion() ../storage/falcon/Table.cpp:2463 0x0000000000865331 
 18 Table::checkUniqueIndex() ../storage/falcon/Table.cpp:2337 0x00000000008656a1 
 17 Table::checkUniqueIndexes() ../storage/falcon/Table.cpp:2311 0x0000000000865813 
 16 Table::insert() ../storage/falcon/Table.cpp:2874 0x0000000000867d72 
 15 StorageDatabase::insert() ../storage/falcon/StorageDatabase.cpp:226 0x0000000000850633 
 14 StorageTable::insert() ../storage/falcon/StorageTable.cpp:122 0x000000000085691b 
 13 StorageInterface::write_row() ../storage/falcon/ha_falcon.cpp:977 0x000000000084a099 
 12 ha_partition::write_row() ../libmysqld/ha_partition.cc:2763 0x00000000007a20b1 
 11 handler::ha_write_row() ../libmysqld/handler.cc:4600 0x00000000007966b0 
 10 write_record() ../libmysqld/sql_insert.cc:1548 0x0000000000722dbb 
 9 mysql_insert() ../libmysqld/sql_insert.cc:803 0x0000000000726849 
 8 mysql_execute_command() ../libmysqld/sql_parse.cc:2658 0x000000000069c61c 
 7 mysql_parse() ../libmysqld/sql_parse.cc:5410 0x00000000006a1e56 
 6 dispatch_command() ../libmysqld/sql_parse.cc:948 0x00000000006a29b6 
 5 do_command() ../libmysqld/sql_parse.cc:697 0x00000000006a3940 
 4 handle_one_connection() ../libmysqld/sql_connect.cc:1146 0x000000000069435a 
 3 start_thread()  0x00002adc99b4e020 
 2 clone()  0x00002adc9b005f8d 
 1 <symbol is not available> 0x0000000000000000
[30 Dec 2007 4:23] Christopher Powers
Same problem with different scenario in Bug#33158, "Server hangs, no new connections are accepted".
[30 Dec 2007 9:02] Philip Stoev
Actually the duplicate bug is bug #33518.
[3 Jan 2008 15:12] Kevin Lewis
Mattias,  Mikael R suggested that we need to use the mutex in ha_partition::write_row() (see Bug#27405) only if the storage engine is InnoDB.  Falcon needs to handle concurrency itself.
[9 Jan 2008 4:26] Kevin Lewis
Assigning this bug back to the Falcon engine.  

Mattias provided a very simple change to ha_partition.cc, ~line 2710, in function ha_partition::write_row():
-   if (table_share->tmp_table == NO_TMP_TABLE)
+   if (table_share->tmp_table == NO_TMP_TABLE && m_innodb)

This prevents the serialization in the partition engine for autoincrement fields.  But then he found that there are still deadlocks/hangs.  

I ran the gypsy test with 5 threads and found that there is indeed an unhandled deadlock in the Falcon engine.  This happens in Table::checkUniqueRecordVersion() where a real transactional deadlock is detected, but an exception is not thrown.  Instead, the call to Table::checkUniqueIndexes() is restarted as if it was just a wait, not a deadlock.
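The retry-versus-raise distinction described above can be sketched abstractly (hypothetical names and states; this is not the Falcon source): the bug was that a detected deadlock was classified as an ordinary lock wait, so the unique check restarted forever instead of aborting the statement.

```python
# Hypothetical sketch of the fix: a detected deadlock must raise an
# exception rather than be retried like a plain lock wait.
class DeadlockException(Exception):
    pass

def wait_for_transaction(state):
    """Return True once the blocking transaction finished; raise on deadlock."""
    if state == "deadlock":
        # Fixed behaviour: surface the deadlock so the statement can roll back.
        raise DeadlockException("deadlock detected during duplicate check")
    return state == "committed"

def check_unique_indexes(blocker_states):
    # Before the fix, a deadlock here was treated as an ordinary wait and the
    # unique check restarted, leaving one thread looping at 100% CPU.
    for state in blocker_states:
        if wait_for_transaction(state):
            return True
    return False
```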
[9 Jan 2008 5:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40733

ChangeSet@1.2773, 2008-01-08 23:22:10-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
  
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUniqueIndexes().  This can also cause the serial log
  to grow indefinitely.
[9 Jan 2008 5:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40735

ChangeSet@1.2780, 2008-01-08 23:40:07-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
  
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUniqueIndexes().  This can also cause the serial log
  to grow indefinitely.
[11 Jan 2008 17:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40919

ChangeSet@1.2780, 2008-01-11 11:03:27-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
    
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUniqueIndexes().  This can also cause the serial log
  to grow indefinitely.
[11 Jan 2008 17:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40924

ChangeSet@1.2781, 2008-01-11 11:22:52-06:00, klewis@klewis-mysql. +2 -0
  Bug#33480 - Two kinds of hangs / deadlocks are fixed here.
  
  1) In ha_partition.cc, prevent serialization into Falcon of 
  threads writing to autoincrement fields.  Only InnoDB needs 
  this.  It can cause deadlocks and timeouts in Falcon which 
  appear as 0% CPU utilization when all threads are hung.  
  When this happens for just a few threads, the serial log 
  can grow indefinitely.
    
  2) If a deadlock happens between threads which are checking 
  for duplicates, the code in Table::checkUniqueRecordVersion()
  was not handling the detected deadlock by throwing an exception.  
  Now it does.  This can cause 100% CPU utilization hangs between 
  threads.  One is waiting and the other is looping in 
  Table::checkUniqueIndexes().  This can also cause the serial log
  to grow indefinitely.
[11 Feb 2008 20:44] Kevin Lewis
Patch is in mysql-6.0-falcon-team and mysql-6.0-release version 6.0.4
[12 Mar 2008 23:02] Bugs System
Pushed into 6.0.4-alpha
[2 May 2008 23:06] Paul DuBois
Noted in 6.0.4 changelog.

Multiple simultaneous inserts into a partitioned Falcon table could
deadlock.