MySQL Bugs: #33724: Falcon crash on recovery with huge log and tablespace

Bug #33724	Falcon crash on recovery with huge log and tablespace
Submitted:	7 Jan 2008 16:10	Modified:	31 Jan 2008 16:32
Reporter:	Philip Stoev	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: Falcon storage engine	Severity:	S1 (Critical)
Version:	6.0.4-p2	OS:	Any
Assigned to:	Jim Starkey	CPU Architecture:	Any

Description:
When running the iuds6 test, Falcon crashed with the following stack trace:

#0 0xffffe410 in __kernel_vsyscall () 
#1 0x0060ef57 in pthread_kill () from /lib/libpthread.so.0 
#2 0x08355acf in write_core (sig=10738) at stacktrace.c:240 
#3 0x0824a265 in handle_segfault () 
#4 
#5 0xffffe410 in __kernel_vsyscall () 
#6 0x003e4c00 in raise () from /lib/libc.so.6 
#7 0x003e6451 in abort () from /lib/libc.so.6 
#8 0xf7fc5277 in __cxa_call_unexpected () from /usr/lib/libstdc++.so.5 
#9 0xf7fc52c4 in std::terminate () from /usr/lib/libstdc++.so.5 
#10 0xf7fc549c in __cxa_rethrow () from /usr/lib/libstdc++.so.5 
#11 0x083cfab4 in Thread::thread () 
#12 0x083cf7e0 in Thread::thread () 
#13 0x0060a2db in start_thread () from /lib/libpthread.so.0 
#14 0x0048914e in clone () from /lib/libc.so.6 

This crash left a 2gig core file and the following sizes of tablespace files, even though the test is not supposed to create millions of records:

-rw-rw---- 1 qauser qauser   39840768 Jan  7 14:12 falcon_master.fl1
-rw-rw---- 1 qauser qauser 2169584128 Jan  7 16:17 falcon_master.fl2
-rw-rw---- 1 qauser qauser    1372160 Jan  7 17:00 falcon_master.fts
-rw-rw---- 1 qauser qauser    1093632 Jan  7 16:59 falcon_temporary.fts
-rw-rw---- 1 qauser qauser  568881152 Jan  7 17:00 falcon_user.fts

Running mysqld on that tablespace with --falcon-debug-mask=65535 resulted in following lines in STDERR:

followed by a crash with the following backtrace:

0x83fce71 Error::error(char const*, ...) + 73
0x8472ea5 Cache::fetchPage(Dbb*, int, PageType, LockType) + 441
0x83f5d22 Dbb::handoffPage(Bdb*, int, PageType, LockType) + 34
0x844a9f2 Section::getSectionPage(Dbb*, int, int, LockType, unsigned int) + 554
0x844ce28 Section::getSectionPage(int, LockType, unsigned int) + 184
0x844b6c1 Section::fetchLocatorPage(int, int, LockType, unsigned int) + 65
0x844b199 Section::reInsertStub(int, unsigned int) + 37
0x83f5f1d Dbb::reInsertStub(int, int, unsigned int) + 141
0x8446abf SRLUpdateRecords::redo() + 403
0x844fed2 SerialLog::recover() + 1230
0x83ec68b Database::openDatabase(char const*) + 367
0x83e8fc1 Connection::getDatabase(char const*, char const*, Threads*) + 193
0x83e68d6 Connection::openDatabase(char const*, char const*, char const*, char const*, char const*, Threads*) + 74
0x83bb3cb StorageDatabase::getOpenConnection() + 63
0x83bf170 StorageHandler::initialize() + 108
0x83bebda StorageHandler::getStorageConnection(StorageTableShare*, THD*, int, OpenOption) + 426
0x83b158a StorageInterface::open(char const*, int, unsigned int) + 98
0x830abfa handler::ha_open(st_table*, char const*, int, int) + 34
0x8312955 ha_partition::open(char const*, int, unsigned int) + 405
0x830abfa handler::ha_open(st_table*, char const*, int, int) + 34
0x828b39b open_table_from_share(THD*, st_table_share*, char const*, unsigned int, unsigned int, unsigned int, st_table*, open_table_mode) + 783
0x8288600 open_unireg_entry(THD*, st_table*, TABLE_LIST*, char const*, char*, unsigned int, st_mem_root*, unsigned int) + 124
0x8283162 open_table(THD*, TABLE_LIST*, st_mem_root*, bool*, unsigned int) + 1954
0x82843cf open_tables(THD*, TABLE_LIST**, unsigned int*, unsigned int) + 1103
0x8284719 open_normal_and_derived_tables(THD*, TABLE_LIST*, unsigned int) + 25
0x8330a5f get_all_tables(THD*, TABLE_LIST*, Item*) + 1367
0x83386fa get_schema_tables_result(JOIN*, enum_schema_table_state) + 370
0x8296b00 JOIN::exec() + 6492
0x8296d93 _Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select + 339
0x82920ab handle_select(THD*, st_lex*, select_result*, unsigned long) + 307
0x825df7d execute_sqlcom_select(THD*, TABLE_LIST*) + 149
0x825543b mysql_execute_command(THD*) + 747
0x825b5ad mysql_parse(THD*, char const*, unsigned int, char const**) + 269
0x82540c5 dispatch_command(enum_server_command, THD*, char*, unsigned int) + 909
0x8253d08 do_command(THD*) + 176
0x8252557 handle_one_connection + 287

How to repeat:
Try running iuds6 scenario of the systems test using an all-falcon iuds6.tst configuration file. More instructions are available at:

https://inside.mysql.com/wiki/SystemQATestPlan

STDERR is:

Recovering database /data1/6.0.4/systest_vardir-p3/master-data/falcon_master.fts ...
first recovery block is 282998
last recovery block is 1858544
recovery read block is 283917

followed by a numerous

RecordLocatorPage: transient corruption corrected

and finally:

Bugcheck: page 16814 wrong page type, expected 2 got 7

080107 17:00:26 - mysqld got signal 4;

Second recovery attempt crashed with no stack trace.

Third recovery attempt crashed with:

0x84832e9 SRLRecordLocator::pass2() + 101
0x844fe4a SerialLog::recover() + 1094
0x83ec68b Database::openDatabase(char const*) + 367
0x83e8fc1 Connection::getDatabase(char const*, char const*, Threads*) + 193
0x83e68d6 Connection::openDatabase(char const*, char const*, char const*, char const*, char const*, Threads*) + 74
0x83bb3cb StorageDatabase::getOpenConnection() + 63
0x83bf170 StorageHandler::initialize() + 108
0x83bebda StorageHandler::getStorageConnection(StorageTableShare*, THD*, int, OpenOption) + 426
0x83b158a StorageInterface::open(char const*, int, unsigned int) + 98
0x830abfa handler::ha_open(st_table*, char const*, int, int) + 34
0x8312955 ha_partition::open(char const*, int, unsigned int) + 405
0x830abfa handler::ha_open(st_table*, char const*, int, int) + 34
0x828b39b open_table_from_share(THD*, st_table_share*, char const*, unsigned int, unsigned int, unsigned int, st_table*, open_table_mode) + 783
0x8288600 open_unireg_entry(THD*, st_table*, TABLE_LIST*, char const*, char*, unsigned int, st_mem_root*, unsigned int) + 124
0x8283162 open_table(THD*, TABLE_LIST*, st_mem_root*, bool*, unsigned int) + 1954
0x82843cf open_tables(THD*, TABLE_LIST**, unsigned int*, unsigned int) + 1103
0x8284719 open_normal_and_derived_tables(THD*, TABLE_LIST*, unsigned int) + 25
0x8330a5f get_all_tables(THD*, TABLE_LIST*, Item*) + 1367
0x83386fa get_schema_tables_result(JOIN*, enum_schema_table_state) + 370
0x8296b00 JOIN::exec() + 6492
0x8296d93 _Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select + 339
0x82920ab handle_select(THD*, st_lex*, select_result*, unsigned long) + 307
0x825df7d execute_sqlcom_select(THD*, TABLE_LIST*) + 149
0x825543b mysql_execute_command(THD*) + 747
0x825b5ad mysql_parse(THD*, char const*, unsigned int, char const**) + 269
0x82540c5 dispatch_command(enum_server_command, THD*, char*, unsigned int) + 909
0x8253d08 do_command(THD*) + 176
0x8252557 handle_one_connection + 287

Setting this to verified, since the corrupted tablespace is available for this bug. Can you please triage accordingly?

This has the signature of the double recovery bug (better named "premature
shutdown" bug), but it's impossible to know for sure.

If the bug reappears, please re-open it.