Bug #34627 Falcon potential deadlock in Bdb::addRef
Submitted: 17 Feb 2008 16:21 Modified: 21 May 2008 14:10
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version:6.0.4 OS:Any
Assigned to: Philip Stoev CPU Architecture:Any
Tags: compiler_suspected, simpler_testcase_needed
Triage: D1 (Critical)

[17 Feb 2008 16:21] Philip Stoev
Description:
Hello,

When running the iuds2 test against the latest 6.0.4 binary, I observed several dozen clients hanging in that SHOW PROCESSLIST situation:

     Id: 25351
   User: root
   Host: localhost
     db: systest1
Command: Query
   Time: 2833 # This value is in seconds
  State: update
   Info: INSERT INTO tb0_logs (dt1, entry_dsc, f4)
VALUES (NOW(), CONCAT('Insert row ', NEW.f1,' ',
NEW.f2, '     # This is a query contained in a trigger

I broke in with gdb and obtained the following stack trace from one of those threads:

#0  0x0000003ba880a4a6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000008abc94 in Synchronize::sleep (this=0x12d60740) at Synchronize.cpp:119
#2  0x0000000000822af2 in SyncObject::wait (this=0x2aaaaae24b48, type=None, thread=0x12d60740, sync=0xffffffffffffffff, timeout=0) at SyncObject.cpp:406
#3  0x0000000000822748 in SyncObject::lock (this=0x2aaaaae24b48, sync=0x0, type=Exclusive, timeout=0) at SyncObject.cpp:259
#4  0x00000000008b2cfd in Bdb::addRef (this=0x2aaaaae24ad8, lType=Exclusive) at BDB.cpp:113
#5  0x00000000008b4708 in Cache::fetchPage (this=0x2aaaaadc4fa8, dbb=0x2aaaaafe7e10, pageNumber=33961, pageType=PAGE_any, lockType=Exclusive)
    at Cache.cpp:230
#6  0x000000000084e773 in Dbb::fetchPage (this=0xfffffffffffffffc, pageNumber=0, pageType=78791, lockType=4294967295) at Dbb.cpp:219
#7  0x0000000000893964 in Section::getSectionPage (this=0x2aaaab0320e0, sequence=14, lockType=Exclusive, transId=3691700) at Section.cpp:1412
#8  0x000000000089177b in Section::insertStub (this=0x2aaaab0320e0, transId=3691700) at Section.cpp:351
#9  0x000000000084e953 in Dbb::insertStub (this=0x12d6075c, section=0x12d6075c, transaction=0x133c7) at Dbb.cpp:304
#10 0x0000000000829e20 in Table::insert (this=0x2aaaaaff0a50, transaction=0x2aaaae81a528, stream=0x2aaaacc0a390) at Table.cpp:2906
#11 0x0000000000819e5c in StorageDatabase::insert (this=0x12d6075c, connection=Internal: global symbol `Connection' found in template_instnt.cpp psymtab but not in symtab.
Connection may be an inlined function, or may be a template function
(if a template, try specifying an instantiation: Connection<type>).
) at StorageDatabase.cpp:226
#12 0x000000000081e706 in StorageTable::insert (this=0x2aaaacc04e08) at StorageTable.cpp:122
#13 0x000000000080fe18 in StorageInterface::write_row (this=0x2aaabcf7fcd0, buff=0x2aaabcf7ff70 "ЫжЕ╘") at ha_falcon.cpp:1036
#14 0x000000000073ae16 in handler::ha_write_row (this=0x2aaabcf7fcd0, buf=0x2aaabcf7ff70 "ЫжЕ╘") at handler.cc:4602
#15 0x00000000006ce2dc in write_record (thd=0x2aaab0b14450, table=0x2aaabc087d10, info=0x4704f750) at sql_insert.cc:1548
#16 0x00000000006cce96 in mysql_insert (thd=0x2aaab0b14450, table_list=0x2aaabc140130, fields=@0x2aaabc13fb00, values_list=@0x2aaabc13fb48,
    update_fields=@0x2aaabc13fb30, update_values=@0x2aaabc13fb18, duplic=DUP_ERROR, ignore=false) at sql_insert.cc:803
#17 0x000000000065e52d in mysql_execute_command (thd=0x2aaab0b14450) at sql_parse.cc:2656
#18 0x00000000007aa748 in sp_instr_stmt::exec_core (this=0x2aaabc5d0580, thd=0x12d6075c, nextp=0x4705038c) at sp_head.cc:2807
#19 0x00000000007aa39a in sp_lex_keeper::reset_lex_and_exec_core (this=0x2aaabc5d05c0, thd=0x2aaab0b14450, nextp=0x4705038c, open_tables=false,
    instr=0x2aaabc5d0580) at sp_head.cc:2649
#20 0x00000000007aa60d in sp_instr_stmt::execute (this=0x2aaabc5d0580, thd=0x2aaab0b14450, nextp=0x4705038c) at sp_head.cc:2758
#21 0x00000000007a7013 in sp_head::execute (this=0x2aaabc13e8c0, thd=0x2aaab0b14450) at sp_head.cc:1195
#22 0x00000000007a7842 in sp_head::execute_trigger (this=0x2aaabc13e8c0, thd=0x2aaab0b14450, db_name=0x1, table_name=0x9399840, grant_info=0x0)
    at sp_head.cc:1489
#23 0x00000000007b7efd in Table_triggers_list::process_triggers (this=0x2aaabc1a1880, thd=0x2aaab0b14450, event=TRG_EVENT_INSERT,
    time_type=TRG_ACTION_AFTER, old_row_is_record1=255) at sql_trigger.cc:1963
#24 0x00000000006ce381 in write_record (thd=0x2aaab0b14450, table=0x2aaabc024200, info=0x47050a70) at sql_insert.cc:1560
#25 0x00000000006cce96 in mysql_insert (thd=0x2aaab0b14450, table_list=0x2aaab017a3b8, fields=@0x2aaab0759170, values_list=@0x2aaab07591b8,
    update_fields=@0x2aaab07591a0, update_values=@0x2aaab0759188, duplic=DUP_ERROR, ignore=false) at sql_insert.cc:803
#26 0x000000000065e52d in mysql_execute_command (thd=0x2aaab0b14450) at sql_parse.cc:2656
#27 0x00000000007aa748 in sp_instr_stmt::exec_core (this=0x2aaab075aa78, thd=0x12d6075c, nextp=0x470516ac) at sp_head.cc:2807
#28 0x00000000007aa39a in sp_lex_keeper::reset_lex_and_exec_core (this=0x2aaab075aab8, thd=0x2aaab0b14450, nextp=0x470516ac, open_tables=false,
    instr=0x2aaab075aa78) at sp_head.cc:2649
#29 0x00000000007aa60d in sp_instr_stmt::execute (this=0x2aaab075aa78, thd=0x2aaab0b14450, nextp=0x470516ac) at sp_head.cc:2758
#30 0x00000000007a7013 in sp_head::execute (this=0x2aaab0b16920, thd=0x2aaab0b14450) at sp_head.cc:1195
#31 0x00000000007a868f in sp_head::execute_procedure (this=0x2aaab0b16920, thd=0x2aaab0b14450, args=0x2aaab0b16340) at sp_head.cc:1904
#32 0x0000000000660702 in mysql_execute_command (thd=0x2aaab0b14450) at sql_parse.cc:3858
#33 0x000000000066317e in mysql_parse (thd=0x2aaab0b14450, inBuf=0xa64fd10 "CALL ins_tb0_eng1 (@tmp_num, @tmp_word)", length=39, found_semicolon=0x47052590)
    at sql_parse.cc:5410
#34 0x000000000065b895 in dispatch_command (command=COM_QUERY, thd=0x2aaab0b14450, packet=0xa64fd37 "", packet_length=39) at sql_parse.cc:948
#35 0x000000000065b092 in do_command (thd=0x2aaab0b14450) at sql_parse.cc:697
#36 0x0000000000659e4f in handle_one_connection (arg=0x12d6075c) at sql_connect.cc:1146
#37 0x0000003ba88062f7 in start_thread () from /lib64/libpthread.so.0
#38 0x0000003ba80ce85d in clone () from /lib64/libc.so.6

Some observations:

* CPU usage is around 150%
* Strace shows no I/O being done on the tablespace or the serial log
* This line looks weird: 
SyncObject::lock (this=0x2aaaaae24b48, sync=0x0, type=Exclusive, timeout=0) at SyncObject.cpp:259 -- is sync allowed to be NULL?
* The stalled threads debug output lists the stalled threads, however does not list any stalled synchronization objects.

How to repeat:
Core dump will be provided tomorrow - I will let the server alone for the night to see if the queries will complete.
[17 Feb 2008 16:37] Philip Stoev
CPU usage is now zero % (probably all background threads completed their work at this point). Another tread was locked like this:

#0  0x0000003ba880c999 in __lll_mutex_unlock_wake () from /lib64/libpthread.so.0
#1  0x0000003ba8809a69 in _L_mutex_unlock_59 () from /lib64/libpthread.so.0
#2  0x0000003ba880972b in __pthread_mutex_unlock_usercnt () from /lib64/libpthread.so.0
#3  0x00000000008abded in Synchronize::sleep (this=0x1380ded8, milliseconds=10000) at Synchronize.cpp:192
#4  0x0000000000822aa0 in SyncObject::wait (this=0x2aaaab032170, type=Exclusive, thread=0x1380ded8, sync=0xffffffffffffffff, timeout=0) at SyncObject.cpp:388
#5  0x0000000000822748 in SyncObject::lock (this=0x2aaaab032170, sync=0x4886efc0, type=Exclusive, timeout=0) at SyncObject.cpp:259
#6  0x00000000008221e7 in Sync::lock (this=0x4886efc0, type=Exclusive) at Sync.cpp:58
#7  0x0000000000891ae6 in Section::insertStub (this=0x2aaaab0320e0, transId=3911470) at Section.cpp:468
#8  0x000000000084e953 in Dbb::insertStub (this=0x1380df20, section=0x1380df20, transaction=0x3ba880fb48) at Dbb.cpp:304
#9  0x0000000000829e20 in Table::insert (this=0x2aaaaaff0a50, transaction=0x2aaaaee3d510, stream=0x2aaac2482ee8) at Table.cpp:2906
#10 0x0000000000819e5c in StorageDatabase::insert (this=0x1380df20, connection=Internal: global symbol `Connection' found in yassl_int.cpp psymtab but not in symtab.
Connection may be an inlined function, or may be a template function
(if a template, try specifying an instantiation: Connection<type>).
) at StorageDatabase.cpp:226
#11 0x000000000081e706 in StorageTable::insert (this=0x2aaac247d960) at StorageTable.cpp:122
#12 0x000000000080fe18 in StorageInterface::write_row (this=0x92d1c60, buff=0x92d1f00 "Ы\006Ф╘") at ha_falcon.cpp:1036
#13 0x000000000073ae16 in handler::ha_write_row (this=0x92d1c60, buf=0x92d1f00 "Ы\006Ф╘") at handler.cc:4602
#14 0x00000000006ce2dc in write_record (thd=0x2aaad2cc1f90, table=0x96c6290, info=0x4886f3e0) at sql_insert.cc:1548
#15 0x00000000006cce96 in mysql_insert (thd=0x2aaad2cc1f90, table_list=0xa6fad20, fields=@0xa6fa6f0, values_list=@0xa6fa738, update_fields=@0xa6fa720,
    update_values=@0xa6fa708, duplic=DUP_ERROR, ignore=false) at sql_insert.cc:803
#16 0x000000000065e52d in mysql_execute_command (thd=0x2aaad2cc1f90) at sql_parse.cc:2656
#17 0x00000000007aa748 in sp_instr_stmt::exec_core (this=0xa5f4d08, thd=0x1380df20, nextp=0x4887001c) at sp_head.cc:2807
#18 0x00000000007aa39a in sp_lex_keeper::reset_lex_and_exec_core (this=0xa5f4d48, thd=0x2aaad2cc1f90, nextp=0x4887001c, open_tables=false, instr=0xa5f4d08)
    at sp_head.cc:2649
#19 0x00000000007aa60d in sp_instr_stmt::execute (this=0xa5f4d08, thd=0x2aaad2cc1f90, nextp=0x4887001c) at sp_head.cc:2758
#20 0x00000000007a7013 in sp_head::execute (this=0xa6f94b0, thd=0x2aaad2cc1f90) at sp_head.cc:1195
#21 0x00000000007a7842 in sp_head::execute_trigger (this=0xa6f94b0, thd=0x2aaad2cc1f90, db_name=0x1, table_name=0x9399840, grant_info=0x0) at sp_head.cc:1489
#22 0x00000000007b7efd in Table_triggers_list::process_triggers (this=0x94b15e0, thd=0x2aaad2cc1f90, event=TRG_EVENT_DELETE, time_type=TRG_ACTION_AFTER,
    old_row_is_record1=255) at sql_trigger.cc:1963
#23 0x00000000006e0b1a in mysql_delete (thd=0x2aaad2cc1f90, table_list=0xab885e8, conds=0xab88b10, order=0x2aaad2cc3b30, limit=18446744073709551615,
    options=0, reset_auto_increment=false) at sql_delete.cc:284
#24 0x000000000065e859 in mysql_execute_command (thd=0x2aaad2cc1f90) at sql_parse.cc:2785
#25 0x000000000066317e in mysql_parse (thd=0x2aaad2cc1f90, inBuf=0xab884e0 "delete from tb0_eng1 where f1=@tmp_num", length=38, found_semicolon=0x48871590)
    at sql_parse.cc:5410
#26 0x000000000065b895 in dispatch_command (command=COM_QUERY, thd=0x2aaad2cc1f90, packet=0xab88506 "", packet_length=38) at sql_parse.cc:948
#27 0x000000000065b092 in do_command (thd=0x2aaad2cc1f90) at sql_parse.cc:697
#28 0x0000000000659e4f in handle_one_connection (arg=0x1380df20) at sql_connect.cc:1146
#29 0x0000003ba88062f7 in start_thread () from /lib64/libpthread.so.0
#30 0x0000003ba80ce85d in clone () from /lib64/libc.so.6
[20 Feb 2008 20:30] Kevin Lewis
Philip, Please re-verify withthe current build. not using GCC3.2.3
[12 May 2008 19:40] Kevin Lewis
Philip, is this still a problem?  Or should it be closed?
[21 May 2008 14:10] Philip Stoev
This bug is no longer repeatable with the latest compiler we are using.