Bug #37344 Crash in IndexWalker::rebalanceDelete
Submitted: 11 Jun 2008 13:59
Modified: 30 Sep 2008 20:08
Reporter: Philip Stoev
Status: Closed
Category: MySQL Server: Falcon storage engine
Severity: S1 (Critical)
Version: 6.0-falcon
OS: Any
Assigned to: Ann Harrison
CPU Architecture: Any
Triage: D1 (Critical)

[11 Jun 2008 13:59] Philip Stoev
Description:
When executing a DML workload, Falcon crashed as follows:

#0  0x00110416 in __kernel_vsyscall ()
#1  0x00581c78 in pthread_kill () from /lib/libpthread.so.0
#2  0x0843eda3 in write_core (sig=11) at stacktrace.c:302
#3  0x0829b228 in handle_segfault (sig=11) at mysqld.cc:2626
#4  <signal handler called>
#5  0x08524157 in IndexWalker::rebalanceDelete (this=0xb73999b0) at IndexWalker.cpp:272
#6  0x085245f3 in IndexWalker::rebalanceUpward (this=0xb73999b0, delta=1) at IndexWalker.cpp:481
#7  0x08524692 in IndexWalker::remove (this=0xb63d1400) at IndexWalker.cpp:430
#8  0x0852489a in IndexWalker::getNext (this=0xb7378550, lockForUpdate=false) at IndexWalker.cpp:89
#9  0x084aef38 in StorageDatabase::nextIndexed (this=0xb7201130, storageTable=0xb75dead0, indexWalker=0xb7378550, lockForUpdate=false)
    at StorageDatabase.cpp:481
#10 0x084b589c in StorageTable::nextIndexed (this=0xb75dead0, recordNumber=1268, lockForUpdate=false) at StorageTable.cpp:180
#11 0x084a8485 in StorageInterface::index_next (this=0xa5c54c8, buf=0xa5c5690 "Э\230\004") at ha_falcon.cpp:1585
#12 0x083c6059 in handler::read_range_next (this=0xa5c54c8) at handler.cc:4947
#13 0x083c418c in handler::multi_range_read_next (this=0xa5c54c8, range_info=0xa9c9bbf0) at handler.cc:4241
#14 0x083a3c7a in QUICK_RANGE_SELECT::get_next (this=0xa481830) at opt_range.cc:8508
#15 0x083be315 in rr_quick (info=0xa5849a4) at records.cc:298
#16 0x08315db8 in sub_select (join=0xa432dc8, join_tab=0xa584960, end_of_records=false) at sql_select.cc:13351
#17 0x08322d70 in do_select (join=0xa432dc8, fields=0xa434124, table=0x0, procedure=0x0) at sql_select.cc:13092
#18 0x08335655 in JOIN::exec (this=0xa432dc8) at sql_select.cc:2740
#19 0x083308f2 in mysql_select (thd=0xa4a9470, rref_pointer_array=0xa9d00bf4, tables=0xa9d01170, wild_num=0, fields=@0xa9d00b84, conds=0xa9d01480, og_num=1,
    order=0x0, group=0xa9d01610, having=0x0, proc_param=0x0, select_options=2416724480, result=0xa5d92b0, unit=0xa9d00c88, select_lex=0xa9d00af0)
    at sql_select.cc:2929
#20 0x0843e6f8 in mysql_derived_filling (thd=0xa4a9470, lex=0xa4aa540, orig_table_list=0xa9d01910) at sql_derived.cc:264
#21 0x0843e4a3 in mysql_handle_derived (lex=0xa4aa540, processor=0x843e528 <mysql_derived_filling(THD*, st_lex*, TABLE_LIST*)>) at sql_derived.cc:56
#22 0x082f890f in open_and_lock_tables_derived (thd=0xa4a9470, tables=0xa9d01910, derived=true) at sql_base.cc:4929
#23 0x082b7921 in open_and_lock_tables (thd=0xa4a9470, tables=0xa9d01910) at mysql_priv.h:1606
#24 0x082aa403 in execute_sqlcom_select (thd=0xa4a9470, all_tables=0xa9d01910) at sql_parse.cc:4789
#25 0x082ac0c9 in mysql_execute_command (thd=0xa4a9470) at sql_parse.cc:2018
#26 0x082b4ec6 in mysql_parse (thd=0xa4a9470,
    inBuf=0xa9d00500 "SELECT AVG( int_key ) FROM ( SELECT AVG( int_nokey ) FROM o AS X WHERE X . int_key < 44 GROUP BY int_key LIMIT 20 ) AS X WHERE X . int_key < 69 GROUP BY int_key", length=160, found_semicolon=0xa9c9d260) at sql_parse.cc:5782
#27 0x082b590f in dispatch_command (command=COM_QUERY, thd=0xa4a9470,
    packet=0xa4aae69 "SELECT AVG( int_key ) FROM ( SELECT AVG( int_nokey ) FROM o AS X WHERE X . int_key < 44 GROUP BY int_key LIMIT 20 ) AS X WHERE X . int_key < 69 GROUP BY int_key", packet_length=160) at sql_parse.cc:1059
#28 0x082b6b75 in do_command (thd=0xa4a9470) at sql_parse.cc:732
#29 0x082a4385 in handle_one_connection (arg=0xa4a9470) at sql_connect.cc:1134
#30 0x0057d32f in start_thread () from /lib/libpthread.so.0
#31 0x0049a27e in clone () from /lib/libc.so.6

The code is:

267
268     bool IndexWalker::rebalanceDelete()
269     {
270             if (balance > 1)
271                     {
272                     if (higher->balance < 0)
273                             {
274                             higher->rotateRight();
275                             rotateLeft();

The crash is because:

(gdb) print higher
$1 = (IndexWalker *) 0x0

How to repeat:
A simplified test case will hopefully follow shortly.
[12 Jun 2008 10:29] Philip Stoev
Unsimplified test case for bug 37344

Attachment: bug37344.test (application/octet-stream, text), 128.51 KiB.

[12 Jun 2008 10:36] Philip Stoev
Please find attached a non-simplified test case for this bug. Let me know if a simpler test case is required to fix this bug and I will work on it.

Basically this crash happens on a statement of the form (either standalone or in a subquery):

SELECT AVG( int_nokey ) FROM E AS X WHERE X . int_nokey < 69 GROUP BY int_key LIMIT 1;

It is not sufficient to simply insert the data (the top of the test case) and then issue the SELECT (the last statement of the test case) -- at least some of the UPDATE operations (the middle of the test case) are also required.
[20 Jun 2008 21:15] Ann Harrison
In IndexWalker, the class member "balance" is not initialized.
[29 Jun 2008 11:44] Philip Stoev
I added balance = 0; to IndexWalker::IndexWalker; however, the crash still happened on a longer run. The existing test case no longer fails, so I will try to provide a new one.
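
For readers following along, the kind of change described above can be sketched as follows. This is a minimal illustration, not the Falcon source: WalkerNode is a hypothetical stand-in for IndexWalker, and only the members named in this report (balance, higher, lower) are modelled. In C++, a plain int or pointer member that the constructor does not set holds an indeterminate value, so any later test such as balance > 1 can go either way.

```cpp
// Hypothetical stand-in for IndexWalker; not the real Falcon class.
// Only the members mentioned in this bug report are modelled.
struct WalkerNode {
    WalkerNode* higher;   // child on the "higher" side of the tree, or null
    WalkerNode* lower;    // child on the "lower" side of the tree, or null
    int balance;          // balance factor read by the rebalancing code

    // Give every member a defined value in the constructor. Without this,
    // the members start out with whatever happened to be in memory.
    WalkerNode() : higher(nullptr), lower(nullptr), balance(0) {}
};
```

As the comment above notes, initializing balance alone made the original test case pass but did not stop the crash on longer runs.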
[29 Jun 2008 13:09] Philip Stoev
The problem is that "higher" is not initialized:

(gdb) print higher
$2 = (IndexWalker *) 0x0
(gdb) print higher->balance
Cannot access memory at address 0x0
[29 Jun 2008 13:19] Philip Stoev
Grammar file for bug 37344

Attachment: bug37344.yy (text/plain), 1.54 KiB.

[29 Jun 2008 13:24] Philip Stoev
To reproduce the issue, please check out the mysql-test-extra-6.0 tree and then:

$ cd mysql-test-extra-6.0/mysql-test/gentest
$ ./runall.pl --basedir=/path/to/mysql-6.0-falcon --engine=Falcon --grammar=/location/of/bug37344.yy

This script will start a server and run randomly generated queries based on the grammar file. The crash will happen within 10 minutes of startup. Please ignore any errors reported by the test script itself -- it is not always able to generate semantically valid queries.

The grammar file may contain queries that are not related to the bug in question. Please let me know if a simplified test case, or a test case in a different format, is required.

It appears that the index must reach a certain state in order to crash, and it is very likely that any insert/update/delete mixture will eventually produce a broken index.
[29 Jun 2008 22:39] Ann Harrison
If higher were not initialized, it would be 0XCCCCCC; zero is
a legitimate value for higher.  The code in question is rebalancing
an AVL tree - at the lowest levels, both higher and lower are zero.
That's not to say that there isn't a bug, just that having higher
be zero is not necessarily the cause.
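
To illustrate the point above: a null higher is normal for a leaf, but a balance above 1 together with a null higher means the stored balance no longer matches the tree. The sketch below is a minimal, hypothetical model, not the Falcon code and not the fix that was eventually pushed (the thread does not describe it). Node, height, balanceIsConsistent and needsDoubleRotation are illustrative names, and the sign convention (balance = height of higher subtree minus height of lower subtree) is an assumption inferred from the snippet in the description.

```cpp
#include <algorithm>

// Hypothetical AVL node; only the fields named in this report are modelled.
struct Node {
    Node* higher = nullptr;
    Node* lower = nullptr;
    int balance = 0;  // assumed: height(higher) - height(lower)
};

// Height of a subtree; an empty subtree has height 0.
int height(const Node* n) {
    return n ? 1 + std::max(height(n->higher), height(n->lower)) : 0;
}

// True when every stored balance factor matches the real subtree heights.
bool balanceIsConsistent(const Node* n) {
    if (!n) return true;
    return n->balance == height(n->higher) - height(n->lower)
        && balanceIsConsistent(n->higher)
        && balanceIsConsistent(n->lower);
}

// Mirrors the test at IndexWalker.cpp:270-272, but checks the pointer first.
// A balance > 1 with no higher child can only come from a stale balance,
// so that case is reported as "no rotation" instead of dereferencing null.
bool needsDoubleRotation(const Node& n) {
    if (n.balance > 1) {
        if (n.higher == nullptr) return false;  // stale balance, nothing to rotate
        return n.higher->balance < 0;
    }
    return false;
}
```

A check like balanceIsConsistent is only useful as a debugging aid; it walks the whole subtree and would be too slow to run on every delete.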
[10 Jul 2008 18:49] Philip Stoev
New grammar for bug 37344

Attachment: bug37344.yy (application/octet-stream, text), 988 bytes.

[25 Jul 2008 18:11] Ann Harrison
A change pushed to the team tree seems to solve the problem.
[27 Jul 2008 2:10] Kevin Lewis
Patch Approved, Code looks good.
[29 Jul 2008 11:37] Kevin Lewis
Pushed to mysql-6.0-falcon-team
[22 Aug 2008 21:15] Kevin Lewis
This is fixed in version 6.0.6
[30 Sep 2008 20:08] Jon Stephens
Documented as follows in the 6.0.6 changelog:

        A large number of updates on a Falcon table followed by a query of the
        form SELECT AVG(int_non_key_column) FROM table AS x WHERE
        int_non_key_column < constant GROUP BY int_key_column LIMIT limit
        could crash the server.