Bug #39081 Falcon: segfault in StorageInterface::getDemographics()
Submitted: 27 Aug 2008 20:45
Modified: 15 May 2009 17:00
Reporter: Christopher Powers
Status: Closed
Category: MySQL Server: Falcon storage engine
Severity: S1 (Critical)
Version: 6.0-falcon, 6.0.7
OS: Any
Assigned to: Christopher Powers
CPU Architecture: Any
Triage: Triaged: D1 (Critical)

[27 Aug 2008 20:45] Christopher Powers
Description:
The System QA online alter test caused a crash in StorageInterface::getDemographics(), as shown in this stack trace:

Program terminated with signal 11, Segmentation fault.
#0  0x006f7402 in __kernel_vsyscall ()
#1  0x0089f067 in pthread_kill () from /lib/libpthread.so.0
#2  0x087e2326 in my_write_core (sig=11) at stacktrace.c:307
#3  0x082ac890 in handle_segfault (sig=11) at mysqld.cc:2658
#4  <signal handler called>
#5  0x0852195e in StorageInterface::getDemographics (this=0xa9a8e00) at ha_falcon.cpp:698
#6  0x08521a66 in StorageInterface::info (this=0xa9a8e00, what=18) at ha_falcon.cpp:660
#7  0x0833e620 in make_join_statistics (join=0xa7016b98, tables=0xa9626e8,
    conds=0xa962c28, keyuse_array=0xa701807c) at sql_select.cc:3851
#8  0x08342790 in JOIN::optimize (this=0xa7016b98) at sql_select.cc:1564
#9  0x0834637b in mysql_select (thd=0xa8b5b60, rref_pointer_array=0xa8b717c, 
    tables=0xa9626e8, wild_num=0, fields=@0xa8b710c, conds=0xa962c28, 
    og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, 
    select_options=2148289024, result=0xa962d78, unit=0xa8b6de0, 
    select_lex=0xa8b7078) at sql_select.cc:3002
#10 0x0834b9ac in handle_select (thd=0xa8b5b60, lex=0xa8b6d84, 
    result=0xa962d78, setup_tables_done_option=0) at sql_select.cc:300
#11 0x082bc34d in execute_sqlcom_select (thd=0xa8b5b60, all_tables=0xa9626e8) at sql_parse.cc:4875
#12 0x082be11b in mysql_execute_command (thd=0xa8b5b60) at sql_parse.cc:2107
#13 0x082c6ead in mysql_parse (thd=0xa8b5b60, 
    inBuf=0xa962338 "SELECT datetime_key FROM B WHERE int_key < 2", length=44, 
    found_semicolon=0xa7e4f280) at sql_parse.cc:5845
#14 0x082c78b6 in dispatch_command (command=COM_QUERY, thd=0xa8b5b60, 
    packet=0xa90c4c9 "SELECT datetime_key FROM B WHERE int_key < 2", 
    packet_length=44) at sql_parse.cc:1120
#15 0x082c8b77 in do_command (thd=0xa8b5b60) at sql_parse.cc:807
#16 0x082b5269 in handle_one_connection (arg=0xa8b5b60) at sql_connect.cc:1153
#17 0x0089a45b in start_thread () from /lib/libpthread.so.0
#18 0x007f1c4e in clone () from /lib/libc.so.6

Complete log file (SWAN): http://clustra.norway.sun.com/~bteam/pb2/web.py?action=archive_download&archive_id=11700&p...

How to repeat:
System QA stress test, falcon_online_alter

Suggested fix:
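The inner loop below currently iterates up to key->key_parts. It should use indexDesc->numberSegments as its upper bound instead, because the Falcon internal index information (StorageIndexDesc) may be temporarily out of sync with table->s->key_info, so key->key_parts can exceed the number of segments Falcon actually holds for that index.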

for (uint n = 0; n < table->s->keys; ++n)
   {
   KEY *key = table->s->key_info + n;
   StorageIndexDesc *indexDesc = storageShare->getIndex(n);

   if (indexDesc)
      {
      ha_rows rows = 1 << indexDesc->numberSegments;
      
      for (uint segment = 0; segment < key->key_parts; ++segment, rows >>= 1)
         {
         ha_rows recordsPerSegment = (ha_rows)indexDesc->segmentRecordCounts[segment];
         key->rec_per_key[segment] = (ulong) MAX(recordsPerSegment, rows);
         }
      }
   }
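
A minimal, self-contained sketch of the suggested change follows. It is not the committed patch. KEY and StorageIndexDesc here are simplified, hypothetical stand-ins that model only the fields the loop touches, fillRecPerKey is a name invented for illustration, and the std::min guard against key_parts is an extra assumption that goes beyond what this report proposes.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the server and Falcon types (assumptions,
// not the real definitions): only the fields used by the loop are modeled.
typedef std::uint64_t ha_rows;

struct KEY {
    unsigned key_parts;                      // segment count as the server sees it
    std::vector<unsigned long> rec_per_key;  // one entry per key part
};

struct StorageIndexDesc {
    unsigned numberSegments;                 // segment count as Falcon sees it
    std::vector<ha_rows> segmentRecordCounts; // assumed to hold numberSegments entries
};

// Illustrative helper (name invented here): fill key.rec_per_key from the
// Falcon index description, bounding the loop by the Falcon-side count.
void fillRecPerKey(KEY &key, const StorageIndexDesc *indexDesc)
{
    if (!indexDesc)
        return;

    // Bound by indexDesc->numberSegments as suggested in the report.
    // The std::min with key_parts is an added guard (an assumption) so the
    // server-side array is never overrun if Falcon holds more segments.
    unsigned segments = std::min(indexDesc->numberSegments, key.key_parts);
    ha_rows rows = (ha_rows) 1 << indexDesc->numberSegments;

    for (unsigned segment = 0; segment < segments; ++segment, rows >>= 1) {
        ha_rows recordsPerSegment = indexDesc->segmentRecordCounts[segment];
        key.rec_per_key[segment] = (unsigned long) std::max(recordsPerSegment, rows);
    }
}
```

With a server-side key of three parts and a Falcon descriptor that currently holds only two segments, the loop stops after the second segment and leaves the third rec_per_key entry untouched, instead of reading past the end of segmentRecordCounts.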
[27 Aug 2008 20:46] Christopher Powers
Patch committed to mysql-6.0-falcon-team: http://lists.mysql.com/commits/52779
[27 Aug 2008 21:03] Christopher Powers
Verified in pushbuild output
[16 Sep 2008 11:07] Alexey Stroganov
The server from the 6.0.7 release crashes with the following backtrace in sysbench read-only tests:

#0  0x00002b8766e8e4c5 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000644dbe in handle_segfault (sig=11) at mysqld.cc:2660
#2  <signal handler called>
#3  0x0000000000827a12 in StorageInterface::info (this=0x1a95ce8, what=18) at ha_falcon.cpp:698
#4  0x00000000006b6d34 in make_join_statistics (join=0x1ab8768, tables=0x1abf350, conds=0x1ab85c8, keyuse_array=0x1aba100)
    at sql_select.cc:3852
#5  0x00000000006b96bb in JOIN::optimize (this=0x1ab8768) at sql_select.cc:1564
#6  0x00000000006bcca8 in mysql_select (thd=0x1aad8a0, rref_pointer_array=0x1abe850, tables=<value optimized out>,
    wild_num=<value optimized out>, fields=<value optimized out>, conds=0x1ab85c8, og_num=0, order=0x0, group=0x0,
    having=0x0, proc_param=0x0, select_options=0, result=0x1ac1190, unit=0x1abe208, select_lex=0x1abe668)
    at sql_select.cc:3003
#7  0x00000000006c276c in handle_select (thd=0x1aad8a0, lex=0x1abe168, result=0x1ac1190, setup_tables_done_option=0)
    at sql_select.cc:300
#8  0x0000000000651267 in execute_sqlcom_select (thd=0x1aad8a0, all_tables=0x1abf350) at sql_parse.cc:4962
#9  0x0000000000657c09 in mysql_execute_command (thd=0x1aad8a0) at sql_parse.cc:2167
#10 0x00000000006cde8d in Prepared_statement::execute (this=0x1abd610, expanded_query=<value optimized out>,
    open_cursor=false) at sql_prepare.cc:3571
#11 0x00000000006d0e3c in Prepared_statement::execute_loop (this=0x1abd610, expanded_query=0x4020e470, open_cursor=false,
    packet=<value optimized out>, packet_end=<value optimized out>) at sql_prepare.cc:3244
#12 0x00000000006d14e5 in mysql_stmt_execute (thd=0x1aad8a0, packet_arg=0x1ab0461 "\003", packet_length=2)
    at sql_prepare.cc:2465
#13 0x000000000065a34e in dispatch_command (command=COM_STMT_EXECUTE, thd=0x1aad8a0, packet=0x1ab0461 "\003",
    packet_length=29) at sql_parse.cc:1089
#14 0x000000000064dee2 in handle_one_connection (arg=<value optimized out>) at sql_connect.cc:1153
#15 0x00002b8766e8a193 in start_thread () from /lib64/libpthread.so.0
#16 0x00002b876780d45d in clone () from /lib64/libc.so.6
[16 Sep 2008 21:32] Alexey Stroganov
Observing a very similar backtrace in the dbt2 test for the 6.0.7 release:

(gdb) bt
#0  0x00002b82a59494c5 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000644dbe in handle_segfault (sig=11) at mysqld.cc:2660
#2  <signal handler called>
#3  0x0000000000827a12 in StorageInterface::info (this=0x1a42888, what=18) at ha_falcon.cpp:698
#4  0x00000000006b6d34 in make_join_statistics (join=0x1b6cff8, tables=0x1a96230, conds=0x1a96dc0, keyuse_array=0x1b6e990)
    at sql_select.cc:3852
#5  0x00000000006b96bb in JOIN::optimize (this=0x1b6cff8) at sql_select.cc:1564
#6  0x00000000006bcca8 in mysql_select (thd=0x1a8b580, rref_pointer_array=0x1a8d668, tables=<value optimized out>,
    wild_num=<value optimized out>, fields=<value optimized out>, conds=0x1a96dc0, og_num=1, order=0x1a972f0, group=0x0,
    having=0x0, proc_param=0x0, select_options=0, result=0x1a973b8, unit=0x1a8d020, select_lex=0x1a8d480)
    at sql_select.cc:3003
#7  0x00000000006c276c in handle_select (thd=0x1a8b580, lex=0x1a8cf80, result=0x1a973b8, setup_tables_done_option=0)
    at sql_select.cc:300
#8  0x0000000000651267 in execute_sqlcom_select (thd=0x1a8b580, all_tables=0x1a96230) at sql_parse.cc:4962
#9  0x0000000000657c09 in mysql_execute_command (thd=0x1a8b580) at sql_parse.cc:2167
#10 0x0000000000659e5f in mysql_parse (thd=0x1a8b580,
    inBuf=0x1a95f98 "SELECT c_id\nFROM customer\nWHERE c_w_id = 1\n  AND c_d_id = 5\n  AND c_last = 'ESEBARESE'\nORDER BY c_first ASC", length=107, found_semicolon=0x460ce130) at sql_parse.cc:5932
#11 0x000000000065a9de in dispatch_command (command=COM_QUERY, thd=0x1a8b580,
    packet=0x1a8df61 "SELECT c_id\nFROM customer\nWHERE c_w_id = 1\n  AND c_d_id = 5\n  AND c_last = 'ESEBARESE'\nORDER BY c_first ASC", packet_length=<value optimized out>) at sql_parse.cc:1134
#12 0x000000000064dee2 in handle_one_connection (arg=<value optimized out>) at sql_connect.cc:1153
#13 0x00002b82a5945193 in start_thread () from /lib64/libpthread.so.0
#14 0x00002b82a62c845d in clone () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()
[4 Feb 2009 13:13] Kevin Lewis
This was moved from Documenting to Need Feedback, but I cannot find a question. Moving back to Documenting.
[22 Feb 2009 19:06] Christopher Powers
Falcon maintains a mapping of Falcon indexes to server indexes. Concurrent online ALTER operations may cause this mapping to briefly go out of sync, possibly resulting in the crash represented in the first stack trace. This has since been resolved.

The last two stack traces are unrelated to the problem addressed in this bug.
[15 May 2009 13:20] MC Brown
It's not clear whether this bug has been fixed, and if it has been, which version it ended up in. We need the three-part version number.
[15 May 2009 17:00] MC Brown
A note has been added to the 6.0.8 changelog: 

When performing online ALTER operations that change the indexes on Falcon tables, the indexes could get out of synchronization, leading to a crash.