Bug #34735 Running iuds6.tst is hitting assertion
Submitted: 21 Feb 2008 16:22 Modified: 1 May 2008 20:35
Reporter: Hakan Küçükyılmaz Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version:6.0.4-p4 OS:Linux
Assigned to: Assigned Account CPU Architecture:Any
Tags: iuds6.tst, system-qa

[21 Feb 2008 16:22] Hakan Küçükyılmaz
Description:
I got a core file running debug build of mysqld release build on lu0009.mysql.com. We are hitting an assertion.

How to repeat:
Thread status:

- 64 threads where in pthread_cond_wait@@GLIBC_2.3.2 ()
-  9 threads where in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
-  5 threads where in __gxx_personality_v0 () from /lib64/libc.so.6
-  3 threads where in pthread_cond_timedwait@@GLIBC_2.3.2 ()
-  1 thread was in do_sigwait () from /lib64/libpthread.so.0
-  1 thread was in fdatasync () from /lib64/libc.so.6
-  1 thread was in pthread_kill () from /lib64/libpthread.so.0

Backtrace of thread in pthread_kill()
(gdb) bt
#0  0x00002b81e64daea3 in pthread_kill () from /lib64/libpthread.so.0
#1  0x000000000065e64a in handle_segfault (sig=4) at mysqld.cc:2313
#2  <signal handler called>
#3  0x00002b81e64ddabd in raise () from /lib64/libpthread.so.0
#4  0x00000000008887ac in Error::debugBreak () at Error.cpp:92
#5  0x00000000008888a2 in Error::error (string=<value optimized out>) at Error.cpp:69
#6  0x0000000000888902 in Error::assertionFailed (fileName=0x4 <Address 0x4 out of
bounds>,
    line=8163) at Error.cpp:76
#7  0x0000000000855b4c in Table::insert (this=0x2aaaaad61b58, transaction=0x2aaaaef74da0,
    stream=0x0) at Table.cpp:2927
#8  0x00000000008486ce in StorageDatabase::insert (this=<value optimized out>,
    connection=<value optimized out>, table=0x2aaaaad61b58, stream=0x2aaaaec46198)
    at StorageDatabase.cpp:226
#9  0x000000000084c8b3 in StorageTable::insert (this=0x2aaaaec40c10) at
StorageTable.cpp:122
#10 0x00000000008458c4 in StorageInterface::write_row (this=0x1d8dcc0, buff=0x26e2a28
"��\023")
    at ha_falcon.cpp:1036
#11 0x000000000075b35b in ha_partition::write_row (this=0x26e2700, buf=<value optimized
out>)
    at ha_partition.cc:2763
#12 0x000000000075104b in handler::ha_write_row (this=0x26e2700, buf=0x26e2a28
"��\023")
    at handler.cc:4602
#13 0x00000000006e4efe in write_record (thd=0x2aaab03f2e90, table=0x27d1870,
info=0x49933ae0)
    at sql_insert.cc:1548
#14 0x00000000006ea7aa in mysql_insert (thd=0x2aaab03f2e90, table_list=<value optimized
out>,
    fields=@0x2055298, values_list=@0x20552e0, update_fields=@0x20552c8,
update_values=@0x20552b0,
    duplic=DUP_ERROR, ignore=false) at sql_insert.cc:803
#15 0x00000000006722c9 in mysql_execute_command (thd=0x2aaab03f2e90) at sql_parse.cc:2658
#16 0x00000000007ca7ce in sp_instr_stmt::exec_core (this=0x20563d0, thd=0x1fe3,
nextp=0x4)
    at sp_head.cc:2807
#17 0x00000000007ca9fa in sp_lex_keeper::reset_lex_and_exec_core (this=0x2056410,
    thd=0x2aaab03f2e90, nextp=0x49934d64, open_tables=false, instr=0x20563d0) at
sp_head.cc:2649
#18 0x00000000007d10cd in sp_instr_stmt::execute (this=0x20563d0, thd=0x2aaab03f2e90,
    nextp=0x49934d64) at sp_head.cc:2758
#19 0x00000000007cf13d in sp_head::execute (this=0x21d3bc0, thd=0x2aaab03f2e90) at
sp_head.cc:1195
#20 0x00000000007cfac2 in sp_head::execute_procedure (this=0x21d3bc0, thd=0x2aaab03f2e90,
    args=0x49934fc0) at sp_head.cc:1904
#21 0x00000000007e41cb in Event_job_data::execute (this=0x49935070, thd=0x2aaab03f2e90,
drop=false)
    at event_data_objects.cc:1987
#22 0x00000000007df6e6 in Event_worker_thread::run (this=0x499351b7, thd=0x2aaab03f2e90,
    event=0x2aaab2b46020) at event_scheduler.cc:312
#23 0x00000000007df802 in event_worker_thread (arg=<value optimized out>) at
event_scheduler.cc:263
#24 0x00002b81e64d6143 in start_thread () from /lib64/libpthread.so.0
#25 0x00002b81e6b0774d in clone () from /lib64/libc.so.6
#26 0x0000000000000000 in ?? ()

The assert in frame #7
#7  0x0000000000855b4c in Table::insert (this=0x2aaaaad61b58, transaction=0x2aaaaef74da0,
    stream=0x0) at Table.cpp:2927
2927    Table.cpp: No such file or directory.
        in Table.cpp

Table.cpp:2927 reads:

		transaction->addRecord(record);
		bool ret = insert(record, NULL, recordNumber);
		ASSERT(ret);
[22 Feb 2008 14:07] Kevin Lewis
>> Kevin Lewis wrote;
>> This assert was found by Hakan today, twice I in a row.
>>
>> 	uint Table::insert(Transaction *transaction, Stream *stream)
>> calls
>> 	bool Table::insert(Record * record, Record *prior, int recordNumber)
>>
>> like this, around line 2927;
>> 		bool ret = insert(record, NULL, recordNumber);
>> 		ASSERT(ret);
>>
>> This call to insert(Record *, Record *, int) can fail and return false.
>> So why does the other insert assume it will be true?  
>> Under what conditions could it return false?

> Jim Starkey wrote;
> The only thing that could cause it to fail was if the slot in the record
> tree was non-null.  All I can think of is that somebody backed out a
> record from record locator page and didn't take it out of the record
> tree.  On the other hand, we do this a zillion times a day.
[3 Mar 2008 5:22] Kevin Lewis
This is one of the asserts that I can hit when running falcon_bug_34351_C 10 times in a row, as reported in Bug#34990.
[8 Mar 2008 9:29] Hakan Küçükyılmaz
I moved the working directory of system-qa to the (faster)  RAID 10 disks and started iuds6.tst with latest mysql-6.0-falcon-team tree. Now I hit another assertion:

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1242380608 (LWP 7698)]
0x00002b7d3ba38abd in raise () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00002b7d3ba38abd in raise () from /lib64/libpthread.so.0
#1  0x000000000080fa4c in Error::error ()
#2  0x00000000007d7251 in Table::fetchNext ()
#3  0x00000000007c6482 in StorageDatabase::nextRow ()
#4  0x00000000007c1cd9 in StorageInterface::rnd_next ()
#5  0x0000000000728cd3 in ha_partition::rnd_next ()
#6  0x00000000007191b1 in rr_sequential ()
#7  0x00000000006c8e40 in mysql_delete ()
#8  0x0000000000644586 in mysql_execute_command ()
#9  0x00000000007839cc in sp_instr_stmt::exec_core ()
#10 0x00000000007863e7 in sp_lex_keeper::reset_lex_and_exec_core ()
#11 0x0000000000788782 in sp_instr_stmt::execute ()
#12 0x0000000000786c23 in sp_head::execute ()
#13 0x00000000007893a3 in sp_head::execute_procedure ()
#14 0x0000000000798d68 in Event_job_data::execute ()
#15 0x00000000007972a7 in Event_worker_thread::run ()
#16 0x0000000000797328 in event_worker_thread ()
#17 0x00002b7d3ba31143 in start_thread () from /lib64/libpthread.so.0
#18 0x00002b7d3c16674d in clone () from /lib64/libc.so.6
#19 0x0000000000000000 in ?? ()

I am going to run iuds6.tst with debug build to get a detailed backtrace
[12 Mar 2008 7:00] Hakan Küçükyılmaz
Running iuds6.tst with debug build shows the same assertion as in the first comment:

#0  0x00002aae99275abd in raise () from /lib64/libpthread.so.0
#1  0x000000000082e2ec in Error::debugBreak () at Error.cpp:92
#2  0x000000000082e3e2 in Error::error (string=<value optimized out>) at Error.cpp:69
#3  0x000000000082e442 in Error::assertionFailed (fileName=0x4 <Address 0x4 out of bounds>,
    line=25089) at Error.cpp:76
#4  0x00000000007fc93f in Table::insert (this=0x2aaaaad4fee8, transaction=0x2aaaab464070,
    stream=0x0) at Table.cpp:2960
#5  0x00000000007ec1ee in StorageDatabase::insert (this=<value optimized out>,
    connection=<value optimized out>, table=0x2aaaaad4fee8, stream=0x2aaaaf954700)
    at StorageDatabase.cpp:226
#6  0x00000000007f0413 in StorageTable::insert (this=0x2aaaaf94f178) at StorageTable.cpp:122
#7  0x00000000007e93f4 in StorageInterface::write_row (this=0x2aaaaf7488e0,
    buff=0x2aaaaf7a02b8 "�\023") at ha_falcon.cpp:1052
#8  0x0000000000723cd5 in handler::ha_write_row (this=0x2aaaaf7488e0, buf=0x2aaaaf7a02b8 "�\023")
    at handler.cc:4861
#9  0x000000000072de28 in ha_partition::write_row (this=0x2aaaaf79ff90, buf=<value optimized out>)
    at ha_partition.cc:2773
#10 0x0000000000723cd5 in handler::ha_write_row (this=0x2aaaaf79ff90, buf=0x2aaaaf7a02b8 "�\023")
    at handler.cc:4861
#11 0x00000000006b2d6b in write_record (thd=0x2aaab00b94d0, table=0x2aaab04b03a0,
    info=0x2aaaae955028) at sql_insert.cc:1554
#12 0x00000000006b3131 in select_insert::send_data (this=0x2aaaae954ff0, values=@0x2aaab00bb210)
    at sql_insert.cc:3052
#13 0x000000000068938e in end_send (join=0x2310450, join_tab=<value optimized out>,
    end_of_records=false) at sql_select.cc:14410

(gdb) f 4
#4  0x00000000007fc93f in Table::insert (this=0x2aaaaad4fee8, transaction=0x2aaaab464070,
    stream=0x0) at Table.cpp:2960
2960                    ASSERT(ret);
(gdb) p ret
$1 = false
[1 May 2008 20:13] Kevin Lewis
This is the same call stack as Bug#34602, which is now fixed.
[1 May 2008 20:19] Kevin Lewis
Status is Verified again.  This bug was created from the call stack originally put into #34602 on Feb 21.
[1 May 2008 20:35] Kevin Lewis
Back to Patch Queued again.  The assert here, found on Feb 21, was reproduced and fixed in Bug#34990.  The assert is;

uint Table::insert(Transaction *transaction, Stream *stream)
{ . . .
	bool ret = insert(record, NULL, recordNumber);
. . .
	ASSERT(ret);
. . .
}
The explanation from Bug#34990 is;  [5 Mar 21:17] Kevin Lewis
I reproduced the assert in Table::insert many times and found that the call to Section::insertStub() was returning the same record Number to two different threads that did an insert of a new record.  The second one looses.  

Section::insertStub() uses Table::syncInsert to protect a couple bitmaps it uses to decide which record slot is available.  But it would release this SyncObject during the search of the section page.  I made it hold the synObject for the whole function and the assert did not happen anymore.