Bug #44212 Falcon infinite loop on SELECT after recovery
Submitted: 11 Apr 2009 9:22
Modified: 21 Feb 2014 18:04
Reporter: Philip Stoev
Status: Unsupported
Category: MySQL Server: Falcon storage engine
Severity: S1 (Critical)
Version: 6.0-falcon-team
OS: Any
Assigned to: Assigned Account
CPU Architecture: Any

[11 Apr 2009 9:22] Philip Stoev
Description:
After an apparently successful recovery, Falcon was unable to complete the following query:

SELECT * FROM `test`.`table10_falcon_int_autoinc` FORCE INDEX (int_key) WHERE `int_key` >= -9223372036854775808 OR `int_key` IS NULL

within 12 hours, on a table containing fewer than 10K rows. The query thread appeared to be stuck at the following stack trace:

#0  IndexRootPage::scanIndex (dbb=0x2aaaab49e448, indexId=1, rootPage=7, lowKey=0x15e5960, highKey=0x2aaaab4ad680, searchFlags=1, transId=0,
    bitmap=0x2aaaab503e08) at IndexRootPage.cpp:427
#1  0x00000000009fd651 in Index::scanIndex (this=0x2aaaab522d40, lowKey=0x0, highKey=0x2aaaab4ad680, searchFlags=1, transaction=0x2aaaab492778,
    bitmap=0x2aaaab503e08) at Index.cpp:511
#2  0x000000000098b1f1 in StorageDatabase::indexScan (this=0x2aaaab4ed210, index=0x2aaaab522d40, lower=0x0, upper=0x2aaaab4ad678, searchFlags=1,
    storageConnection=0x2aaaab5220c8, bitmap=0x0) at StorageDatabase.cpp:834
#3  0x0000000000993248 in StorageTable::indexScan (this=0x2aaaab4aabb8, indexOrder=0) at StorageTable.cpp:277
#4  0x00000000009796fd in StorageInterface::scanRange (this=0x20677998, start_key=0x20677a78, end_key=0x20677a98, eqRange=true) at ha_falcon.cpp:1810
#5  0x0000000000979939 in StorageInterface::fillMrrBitmap (this=0x20677998) at ha_falcon.cpp:1973
#6  0x000000000097a9fd in StorageInterface::multi_range_read_init (this=0x20677998, seq=0x4f2c5d20, seq_init_param=0x204b4840, n_ranges=2, mode=4,
    buf=0x4f2c5d60) at ha_falcon.cpp:1951
#7  0x000000000080073b in QUICK_RANGE_SELECT::reset (this=0x204b4840) at opt_range.cc:8456
#8  0x000000000074b263 in join_init_read_record (tab=0x20653338) at sql_select.cc:17045
#9  0x000000000074e89a in sub_select (join=0x2064b5e8, join_tab=0x20653338, end_of_records=false) at sql_select.cc:16243
#10 0x000000000075c53d in do_select (join=0x2064b5e8, fields=0x2051db60, table=0x0, procedure=0x0) at sql_select.cc:15807
#11 0x0000000000779205 in JOIN::exec (this=0x2064b5e8) at sql_select.cc:2881
#12 0x0000000000773a73 in mysql_select (thd=0x2051bb70, rref_pointer_array=0x2051dc40, tables=0x20649980, wild_num=1, fields=@0x2051db60, conds=0x2064a4b8,
    og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0x2064a630, unit=0x2051d5f0, select_lex=0x2051da58)
    at sql_select.cc:3062
#13 0x0000000000779524 in handle_select (thd=0x2051bb70, lex=0x2051d550, result=0x2064a630, setup_tables_done_option=0) at sql_select.cc:314
#14 0x00000000006d35ad in execute_sqlcom_select (thd=0x2051bb70, all_tables=0x20649980) at sql_parse.cc:4768
#15 0x00000000006d461b in mysql_execute_command (thd=0x2051bb70) at sql_parse.cc:2069
#16 0x00000000006dc63a in mysql_parse (thd=0x2051bb70,
    inBuf=0x206495c8 "SELECT * FROM `test`.`table10_falcon_int_autoinc` FORCE INDEX (int_key) WHERE `int_key` >= -9223372036854775808 OR `int_key` IS NULL",
    length=132, found_semicolon=0x4f2c7f20) at sql_parse.cc:5783
#17 0x00000000006dd7cb in dispatch_command (command=COM_QUERY, thd=0x2051bb70, packet=0x20639871 "", packet_length=132) at sql_parse.cc:1009
#18 0x00000000006dec5e in do_command (thd=0x2051bb70) at sql_parse.cc:691
#19 0x00000000006cc062 in handle_one_connection (arg=0x2051bb70) at sql_connect.cc:1146
#20 0x0000003587a062f7 in start_thread () from /lib64/libpthread.so.0
#21 0x0000003586ed1b6d in clone () from /lib64/libc.so.6

Single-stepping in gdb showed the same sequence of source lines in IndexRootPage.cpp repeating:

466                     if (!page->nextPage)
(gdb)
425                                     {
(gdb)
427                                             {
(gdb)
429                                             UCHAR *q = node.key;
(gdb)
461                                     break;
(gdb)
463                             bitmap->set (number);
(gdb)

It appears that none of the conditions required to exit the loop was ever fulfilled.
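
For illustration only, here is one way a scan over a chain of index pages can fail to terminate, together with a defensive guard. This is a hypothetical sketch, not Falcon's code: `Page`, `ScanResult`, `scanChain` and `demoCorruptChain` are made-up names, and the assumption that the chain of `nextPage` links loops back on itself is just one possible explanation that this report does not confirm.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an index leaf page; not Falcon's real page layout.
struct Page {
    std::vector<int32_t> recordNumbers; // record numbers stored on this page
    int32_t nextPage;                   // page number of the right sibling, 0 = end of chain
};

enum class ScanResult { Complete, CycleDetected, MissingPage };

// Walks the sibling chain starting at firstPage and collects record numbers
// into `bitmap`. Unlike a plain "follow nextPage until it is 0" loop, this
// remembers every page it has visited and stops if a page is seen twice.
ScanResult scanChain(const std::unordered_map<int32_t, Page>& pages,
                     int32_t firstPage,
                     std::unordered_set<int32_t>& bitmap)
{
    std::unordered_set<int32_t> visited;
    int32_t pageNumber = firstPage;

    while (pageNumber != 0) {
        // insert() reports false in .second if the page was already visited.
        if (!visited.insert(pageNumber).second)
            return ScanResult::CycleDetected;

        auto it = pages.find(pageNumber);
        if (it == pages.end())
            return ScanResult::MissingPage;

        for (int32_t number : it->second.recordNumbers)
            bitmap.insert(number);

        pageNumber = it->second.nextPage;
    }
    return ScanResult::Complete;
}

// Usage sketch: page 7 -> page 8 -> page 7 forms a cycle, so an unguarded
// walk would never reach nextPage == 0.
inline void demoCorruptChain()
{
    std::unordered_map<int32_t, Page> pages{
        {7, Page{{1, 2}, 8}},
        {8, Page{{3}, 7}},
    };
    std::unordered_set<int32_t> bitmap;
    assert(scanChain(pages, 7, bitmap) == ScanResult::CycleDetected);
    assert(bitmap.size() == 3);
}
```

A guard of this kind only turns a hang into a reported error; it would not explain how the on-disk structure got into that state after recovery.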

How to repeat:
The tablespace will be uploaded shortly.
[11 Apr 2009 10:11] Philip Stoev
Replaying the recovery and rerunning the same queries did not reproduce the problem.

In addition, the core file from the original hung process was not saved due to a permissions problem on the test machine.