Bug #42874 Falcon 100% CPU loop during recovery in IndexPage::findInsertionPoint
Submitted: 16 Feb 2009 10:33 Modified: 5 Mar 2009 13:19
Reporter: Philip Stoev Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0-falcon-team OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: F_RECOVERY

[16 Feb 2009 10:33] Philip Stoev
Description:
When recovering (kill -9) from the following RQG test (very simple zero-sum transactions in 4 threads, no ALTER/OPTIMIZE/REPAIR/etc.):

$ perl runall.pl  --mem --rows=1000 --threads=4 \
  --mysqld=--falcon-page-cache-size=1K \
  --basedir=/build/bzr/6.0-falcon-team \
  --mask=2515 \
  --queries=10000000 \
  --duration=900 \
  --engine=Falcon \
  --grammar=conf/transactions-flat.yy \
  --gendata=conf/transactions.zz \
  --reporters=Deadlock,ErrorLog,Backtrace,Recovery \
  --mysqld=--loose-falcon-lock-wait-timeout=1 \
  --mysqld=--loose-innodb-lock-wait-timeout=1 \
  --mysqld=--log-output=none \
  --mysqld=--skip-safemalloc  

, Falcon entered a 100% CPU loop in the following code:

889                     while (number == END_BUCKET && nextKey.compare(&key) > 0)
891                             if (!page->nextPage)
905                             nextKey.compare(&key);
906                             ASSERT (bdb->pageNumber != page->nextPage);
907                             bdb = dbb->handoffPage (bdb, page->nextPage, PAGE_btree, Exclusive);
909                             page = (IndexPage*) bdb->buffer;
910                             node.parseNode(page->findInsertionPoint(&key, recordNumber, &priorKey));
911                             nextKey.setKey(0, node.offset, priorKey.key);
912                             node.expandKey(&nextKey);
913                             number = node.getNumber();
914                             ++rollovers;

backtrace is:

#5  0x0000000000a91d71 in IndexPage::findInsertionPoint (this=0x7f377c8ed000, level=0, indexKey=0x7fff90d6bef0, recordNumber=19931,
    expandedKey=0x7fff90d66a30, from=0x7f377c8ed03c, bucketEnd=0x7f377c8ed8a6) at IndexPage.cpp:1300
#6  0x0000000000a91e37 in IndexPage::findInsertionPoint (this=0x7f377c8ed000, indexKey=0x7fff90d6bef0, recordNumber=19931, expandedKey=0x7fff90d66a30)
    at IndexPage.cpp:1225
#7  0x00000000009f8744 in IndexRootPage::indexMerge (dbb=0x7f37879d92c0, indexId=1, logRecord=0x7fff90d6f158, transId=0) at IndexRootPage.cpp:910
#8  0x0000000000a69548 in SRLUpdateIndex::execute (this=0x7fff90d6f158) at SRLUpdateIndex.cpp:217
#9  0x0000000000a69579 in SRLUpdateIndex::redo (this=0x7fff90d6f158) at SRLUpdateIndex.cpp:187
#10 0x0000000000a4b65a in SerialLog::recover (this=0x7f37879bfb20) at SerialLog.cpp:412
#11 0x00000000009d2ccb in Database::openDatabase (this=0x7f3787858bc8, filename=0x7fff90d6f7a0 "/dev/shm/var/master-data_recovery/falcon_master.fts")
    at Database.cpp:753
#12 0x00000000009c2342 in Connection::getDatabase (this=0x7f37876577d0, dbName=0x7f378765739c "FALCON_MASTER",
    dbFileName=0x7fff90d6f7a0 "/dev/shm/var/master-data_recovery/falcon_master.fts", threads=0x7f3787657408) at Connection.cpp:1650
#13 0x00000000009c3682 in Connection::openDatabase (this=0x7f37876577d0, dbName=0x7f378765739c "FALCON_MASTER", filename=0x7f37876573d4 "falcon_master.fts",
    account=0xe31740 "mysql", password=0xe31740 "mysql", address=0x0, parent=0x7f3787657408) at Connection.cpp:933
#14 0x0000000000979071 in StorageDatabase::getOpenConnection (this=0x7f3787657208) at StorageDatabase.cpp:136
#15 0x000000000097c10f in StorageHandler::initialize (this=0x7f3787858048) at StorageHandler.cpp:987
#16 0x0000000000972b79 in StorageInterface::falcon_init (p=0x28024f8) at ha_falcon.cpp:257
#17 0x0000000000818f0e in ha_initialize_handlerton (plugin=0x27d5000) at handler.cc:450
#18 0x00000000008cf538 in plugin_initialize (plugin=0x27d5000) at sql_plugin.cc:1008
#19 0x00000000008d2c2a in plugin_init (argc=0x1318760, argv=0x275df70, flags=0) at sql_plugin.cc:1217
#20 0x00000000006c4b92 in init_server_components () at mysqld.cc:4132
#21 0x00000000006c5682 in main (argc=11, argv=0x7fff90d70178) at mysqld.cc:4643

Note that the initial test run was with --falcon-page-cache-size=1K however the recovery was with the default Falcon settings.

How to repeat:
The tablespace will be uploaded shortly.
[16 Feb 2009 10:38] Philip Stoev
The tablespace is at:

http://fedora10.dyndns.org/bug42874-2.zip
[5 Mar 2009 13:19] Vladislav Vaintroub
Primary bug for index corruption is Bug #42344