| Bug #38043 | Deadlock between server and falcon when executing concurrent ALTER + DML | ||
|---|---|---|---|
| Submitted: | 11 Jul 2008 10:50 | Modified: | 30 Sep 2008 20:36 | 
| Reporter: | Philip Stoev | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Falcon storage engine | Severity: | S1 (Critical) | 
| Version: | 6.0-falcon | OS: | Any | 
| Assigned to: | Christopher Powers | CPU Architecture: | Any | 
   [11 Jul 2008 10:51]
   Philip Stoev        
  Output from kill -HUP: Current locks: lock: 0xa7da2298: lock: 0xa7d720e0: lock: 0xa7d53368: lock: 0xa7d8ebc8: write write : 0x9819830 (9:5); lock: 0xb72da7f0: lock: 0xb7197a50: lock: 0xb7191d10: lock: 0x97d31cc: lock: 0x97c8d6c: lock: 0x97be134: lock: 0x97b4f94: lock: 0x9788bac: lock: 0x979cb14: lock: 0x9797124: lock: 0x978fac4:
   [11 Jul 2008 10:54]
   Philip Stoev        
  Thread stacks for bug 38043
Attachment: bug38043.threads (text/plain), 30.34 KiB.
   [11 Jul 2008 11:26]
   Philip Stoev        
  This was just repeated with just two threads, processlist is as follows:
     Id: 3
   User: root
   Host: localhost:34224
     db: test
Command: Query
   Time: 150
  State: Opening tables
   Info: SELECT AVG( Y . int_key ) FROM C AS X LEFT JOIN D AS Y ON ( X . int_nokey = Y . int_nokey ) WHERE X
*************************** 3. row ***************************
     Id: 4
   User: root
   Host: localhost:34225
     db: test
Command: Query
   Time: 150
  State: closing tables
   Info: UPDATE E AS X SET pk = 75 WHERE X . pk < 81 LIMIT 6
Notice that none of the query is an ALTER. I suspect that a previous ALTER resulted in a table rename which caused table locking to go bad.
The falcon thread is waiting in StorageInterface::setIndexes, as per the attached thread dumps. The mysql thread is in close_open_tables().
 
   [11 Jul 2008 11:31]
   Philip Stoev        
  Grammar file for bug 38043
Attachment: bug38043.yy (application/octet-stream, text), 1.14 KiB.
   [11 Jul 2008 11:35]
   Philip Stoev        
  To reproduce this issue, please clone the mysql-test-extra-6.0 tree and then execute: $ cd mysql-test-extra-6.0/mysql-test/gentest $ perl runall.pl \ --basedir=/path/to/mysql-6.0-falcon \ --engine=falcon \ --grammar=path/to/bug38043.yy \ --mysqld=--log-output=file \ --threads=2 the mysql-6.0 tree does not appear to be affected.
   [11 Jul 2008 17:25]
   Kevin Lewis        
  Chris, This also seems to be related to WL#4048 Online Add/Drop Index
   [14 Jul 2008 5:29]
   Christopher Powers        
  Reproduced with SystemQA testcase. There are two problems: 1. Deadlocks are caused by a pre-existing bug in StorageInterface::setIndexes(), which fails to release an exclusive lock on StorageTableShare::syncObject if StorageTable::setIndex returned an error. 2. Memory corruption is caused by StorageTableShare::getIndex(). Historically, getIndex() updated StorageTableShare::indexes[] with the information of an index being accessed for the first time. The index array was assumed to be of a fixed size, and was allocated during StorageTableShare::open(). For online Add/Drop Index (WL#4048), getIndex() was modified to resize the indexes array if it determined that a new index had been added. The resize operation is not sufficiently protected during concurrent access to the table share, and should instead be performed immediately after createIndex().
   [14 Jul 2008 5:31]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/49643 2741 Christopher Powers 2008-07-14 Bug#38043: Deadlock between server and falcon Temporarily disable online alter until StorageTableShare::getIndex() is fixed.
   [14 Jul 2008 5:35]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/49645 2742 Christopher Powers 2008-07-14 Bug#38043: Deadlock between server and falcon Restore warning in results file until online alter fixed
   [25 Jul 2008 4:42]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/50471 2745 Christopher Powers 2008-07-24 Bug#38039 Assertion lockState == 0 failed in StorageTableShare::deleteTable SyncObject Bug#38041 Bizarre errors when ALTER ADD/DROP KEY on Falcon tables Bug#38043 Deadlock between server and falcon when executing concurrent ALTER + DML Improvements to online add/drop index: - StorageTableShare::indexes[] is now a resizeable DenseArray - Added sync lock protection to get/set index - Call add/drop index from StorageTable rather than StorageTableShare - Rename internal Falcon indexes to preserve index name/id mapping after drop - Immediately update StorageTableShare::indexes[] after add/drop rather than wait for subsequent open()
   [25 Jul 2008 4:48]
   Christopher Powers        
  Short version of the testcase passes. Long version (--threads=10 --queries=1000000) results in error described in Bug#38044.
   [18 Aug 2008 5:48]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51820 2787 Christopher Powers 2008-08-18 Bug#38044 Falcon crash in StorageTable::compareKey at StorageTable.cpp:412 Bug#38043 Deadlock between server and falcon when executing concurrent ALTER + DML Bug#38039 Assertion lockState == 0 failed in StorageTableShare::deleteTable SyncObject Bug#38041 Bizarre errors when ALTER ADD/DROP KEY on Falcon tables Improved index synchronization in the StorageInterface: - Falcon internal indexes no longer dependent upon server index id - Client connections lock StorageTableShare::syncIndexes to protect indexes across calls into StorageInterface - Changes to StorageTableShare::indexes are now protected by exclusive lock
   [18 Aug 2008 16:56]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51859 2787 Christopher Powers 2008-08-18 Bug#38044 Falcon crash in StorageTable::compareKey at StorageTable.cpp:412 Bug#38043 Deadlock between server and falcon when executing concurrent ALTER + DML Bug#38039 Assertion lockState == 0 failed in StorageTableShare::deleteTable SyncObject Bug#38041 Bizarre errors when ALTER ADD/DROP KEY on Falcon tables Improved index synchronization in the StorageInterface: - Falcon internal indexes no longer dependent upon server index id - Client connections lock StorageTableShare::syncIndexes to protect indexes across calls into StorageInterface - Changes to StorageTableShare::indexes are now protected by exclusive lock
   [18 Aug 2008 17:06]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51863 2787 Christopher Powers 2008-08-18 Bug#38044 Falcon crash in StorageTable::compareKey at StorageTable.cpp:412 Bug#38043 Deadlock between server and falcon when executing concurrent ALTER + DML Bug#38039 Assertion lockState == 0 failed in StorageTableShare::deleteTable SyncObject Bug#38041 Bizarre errors when ALTER ADD/DROP KEY on Falcon tables Improved index synchronization in the StorageInterface: - Falcon internal indexes no longer dependent upon server index id - Client connections lock StorageTableShare::syncIndexes to protect indexes across calls into StorageInterface - Changes to StorageTableShare::indexes are now protected by exclusive lock
   [20 Aug 2008 8:08]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/52015 2773 Narayanan V 2008-08-20 [merge] WL#4380 Merging mysql-5.1-sea -> mysql-6.0-sea
   [28 Aug 2008 4:37]
   Bugs System        
  Pushed into 6.0.7-alpha (revid:cpowers@mysql.com-20080818054529-s9swke0xd15rx1fu) (version source revid:vvaintroub@mysql.com-20080827094944-sh582y3m7duhzrpj) (pib:3)
   [14 Sep 2008 4:07]
   Bugs System        
  Pushed into 6.0.6-alpha (revid:cpowers@mysql.com-20080818054529-s9swke0xd15rx1fu) (version source revid:hakan@mysql.com-20080716175219-8unfm2ikarselqpw) (pib:3)
   [30 Sep 2008 20:36]
   Jon Stephens        
  Documented in the 6.0.6 changelog as follows:
        Executing ALTER TABLE and DML statements concurrently on Falcon tables
        could cause the server to hang.
 
Description: When executing a workload involing ALTER ADd/DROP key and DML, the server hanged with the following output of Falcon stalled(): Stalled threads Thread 0xa8602290 (-1475023984) sleep=1, grant=0, locks=3, who 0, parent=(nil) Pending StorageTableShare::setSequenceValue state 0 (1) syncObject 0xa8855978 Thread 0xa8604140 (-1476396144) sleep=0, grant=1, locks=1, who 1, parent=(nil) Stalled synchronization objects: SyncObject a8855978: state -1, readers 0, monitor 0, waiters 1 Exclusive thread a8604140 (-1476396144), type 1; Waiting thread a8602290 (-1475023984), type 1; StorageTableShare::setSequenceValue Processlist looks like this: *************************** 2. row *************************** Id: 4 User: root Host: localhost:53253 db: test Command: Query Time: 418 State: Opening tables Info: UPDATE C AS X SET int_nokey = 8 WHERE X . int_key < 211 LIMIT 1 *************************** 3. row *************************** Id: 5 User: root Host: localhost:53254 db: test Command: Query Time: 417 State: Opening tables Info: INSERT INTO D ( int_nokey , pk ) VALUES ( 87 , 75 ) *************************** 4. row *************************** Id: 6 User: root Host: localhost:53255 db: test Command: Query Time: 419 State: closing tables Info: SELECT AVG( Y . int_key ) FROM E AS X LEFT JOIN E AS Y ON ( X . int_key = Y . int_nokey ) WHERE X . *************************** 5. row *************************** Id: 7 User: root Host: localhost:53256 db: test Command: Query Time: 417 State: Opening tables Info: ALTER TABLE C DROP KEY c *************************** 6. row *************************** Id: 8 User: root Host: localhost:53257 db: test Command: Query Time: 418 State: Waiting for table Info: UPDATE D AS X SET int_key = 114 WHERE X . pk < 132 LIMIT 0 *************************** 7. row *************************** Id: 9 User: root Host: localhost:53258 db: test Command: Query Time: 417 State: update Info: INSERT INTO D ( pk , int_key ) VALUES ( 150 , 0 ) *************************** 8. row *************************** Id: 10 User: root Host: localhost:53259 db: test Command: Query Time: 417 State: Opening tables Info: INSERT INTO D ( pk , int_key ) VALUES ( 105 , 32 ) *************************** 9. row *************************** Id: 11 User: root Host: localhost:53260 db: test Command: Query Time: 418 State: Waiting for table Info: ALTER TABLE D ADD KEY p ( int_nokey ) *************************** 10. row *************************** Id: 12 User: root Host: localhost:53261 db: test Command: Query Time: 417 State: closing tables Info: UPDATE E AS X SET int_key = 5 WHERE X . pk < 173 LIMIT 1 *************************** 11. row *************************** Id: 13 User: root Host: localhost:53262 db: test Command: Query Time: 417 State: closing tables Info: SELECT 237 FROM B AS X LEFT JOIN B AS Y ON ( X . pk = Y . int_nokey ) WHERE X . int_key < 249 GROUP A threads dump will be attached shortly. How to repeat: A test case will be provided if this situation is observed again.