Bug #40391 Maria deadlock in trnman_get_min_trid , trnman_end_trn
Submitted: 29 Oct 2008 10:35 Modified: 8 Jan 2009 10:39
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Maria storage engine Severity:S1 (Critical)
Version:6.0-bzr OS:Any
Assigned to: Michael Widenius CPU Architecture:Any

[29 Oct 2008 10:35] Philip Stoev
Description:
When running a concurrent DML workload, maria deadlocked with most threads blocked open_table() or another table open/close function and:

A. One thread blocked in trnman_get_min_trid():

Thread 97 (Thread 1278904640 (LWP 23712)):
#0  0x0000003ba880c888 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003ba8808921 in _L_mutex_lock_1133 () from /lib64/libpthread.so.0
#2  0x0000003ba880887a in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000c343d3 in safe_mutex_lock (mp=0x1552bc0, try_lock=0 '\0', file=0xe51a00 "trnman.c", line=805) at thr_mutex.c:153
#4  0x0000000000b47f34 in trnman_get_min_trid () at trnman.c:805
#5  0x0000000000b8a09c in _ma_remove_not_visible_states (org_history=0x2aaab09aada8, all=1 '\001', trnman_is_locked=0 '\0') at ma_state.c:169
#6  0x0000000000bba196 in maria_close (info=0x2aaab11475c8) at ma_close.c:140
#7  0x0000000000b6d58f in ha_maria::close (this=0x2aaab0db1c80) at ha_maria.cc:946
#8  0x000000000072f335 in closefrm (table=0x2aaab0ec6648, free_share=true) at table.cc:2110
#9  0x000000000071c5ef in intern_close_table (table=0x2aaab0ec6648) at sql_base.cc:783
#10 0x000000000071c636 in free_cache_entry (table=0x2aaab0ec6648) at sql_base.cc:805
#11 0x000000000071f7dd in open_table (thd=0x2aaab515c938, table_list=0x2aaabd18ac40, mem_root=0x4c3a4c80, action=0x4c3a4cfc, flags=0) at sql_base.cc:2702
#12 0x000000000072064f in open_tables (thd=0x2aaab515c938, start=0x4c3a4d60, counter=0x4c3a4d94, flags=0) at sql_base.cc:3704
#13 0x0000000000720df4 in open_and_lock_tables_derived (thd=0x2aaab515c938, tables=0x2aaabc072700, derived=true, flags=0) at sql_base.cc:4158
#14 0x00000000006d7a51 in open_and_lock_tables (thd=0x2aaab515c938, tables=0x2aaabc072700) at ../mysql_priv.h:1620
#15 0x0000000000785a79 in mysql_delete (thd=0x2aaab515c938, table_list=0x2aaabc072700, conds=0x2aaabc072d68, order=0x2aaabc071e08,
    limit=18446744073709551615, options=0, reset_auto_increment=false) at sql_delete.cc:54
#16 0x00000000006cf727 in mysql_execute_command (thd=0x2aaab515c938) at sql_parse.cc:3137
#17 0x000000000089952b in sp_instr_stmt::exec_core (this=0x2aaabc6bbfc0, thd=0x2aaab515c938, nextp=0x4c3a6568) at sp_head.cc:2910
#18 0x0000000000899e7a in sp_lex_keeper::reset_lex_and_exec_core (this=0x2aaabc6bc000, thd=0x2aaab515c938, nextp=0x4c3a6568, open_tables=false,
    instr=0x2aaabc6bbfc0) at sp_head.cc:2734
#19 0x000000000089a48b in sp_instr_stmt::execute (this=0x2aaabc6bbfc0, thd=0x2aaab515c938, nextp=0x4c3a6568) at sp_head.cc:2847
#20 0x000000000089c5db in sp_head::execute (this=0x2aaabd4d4f40, thd=0x2aaab515c938) at sp_head.cc:1241
#21 0x000000000089d374 in sp_head::execute_procedure (this=0x2aaabd4d4f40, thd=0x2aaab515c938, args=0x2aaab515ee10) at sp_head.cc:1972
#22 0x00000000006d2bc9 in mysql_execute_command (thd=0x2aaab515c938) at sql_parse.cc:4131
#23 0x00000000006d446b in mysql_parse (thd=0x2aaab515c938, inBuf=0x2aaac0606770 "CALL test.insdel2()", length=19, found_semicolon=0x4c3a7f20)
    at sql_parse.cc:5634
#24 0x00000000006d564d in dispatch_command (command=COM_QUERY, thd=0x2aaab515c938, packet=0x2aaab4e40c09 "CALL test.insdel2()", packet_length=19)
    at sql_parse.cc:1009
#25 0x00000000006d6b94 in do_command (thd=0x2aaab515c938) at sql_parse.cc:689
#26 0x00000000006c5293 in handle_one_connection (arg=0x2aaab515c938) at sql_connect.cc:1156
#27 0x0000003ba88062f7 in start_thread () from /lib64/libpthread.so.0
#28 0x0000003ba80ce85d in clone () from /lib64/libc.so.6

B. Two threads blocked in trnman_end_trn()

Thread 92 (Thread 1293547840 (LWP 24629)):
#0  0x0000003ba880c888 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003ba8808921 in _L_mutex_lock_1133 () from /lib64/libpthread.so.0
#2  0x0000003ba880887a in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000c343d3 in safe_mutex_lock (mp=0x1552bc0, try_lock=0 '\0', file=0xe51a00 "trnman.c", line=387) at thr_mutex.c:153
#4  0x0000000000b46ff3 in trnman_end_trn (trn=0x2aaab0235d78, commit=0 '\0') at trnman.c:387
#5  0x0000000000bf43f4 in ma_commit (trn=0x2aaab0235d78) at ma_commit.c:37
#6  0x0000000000b6b213 in ha_maria::external_lock (this=0x2aaab0a0dff0, thd=0x2aaac4b55bb8, lock_type=2) at ha_maria.cc:2273
#7  0x0000000000801553 in handler::ha_external_lock (this=0x2aaab0a0dff0, thd=0x2aaac4b55bb8, lock_type=2) at handler.cc:5298
#8  0x00000000006b1580 in unlock_external (thd=0x2aaac4b55bb8, table=0x2aaab09932f8, count=1) at lock.cc:821
#9  0x00000000006b189e in mysql_unlock_tables (thd=0x2aaac4b55bb8, sql_lock=0x2aaab09932b8) at lock.cc:436
#10 0x000000000071d59b in close_thread_tables (thd=0x2aaac4b55bb8, skip_mdl=false) at sql_base.cc:1425
#11 0x0000000000899ede in sp_lex_keeper::reset_lex_and_exec_core (this=0x5e25100, thd=0x2aaac4b55bb8, nextp=0x4d19d568, open_tables=false, instr=0x5e250c0)
    at sp_head.cc:2742
#12 0x000000000089a48b in sp_instr_stmt::execute (this=0x5e250c0, thd=0x2aaac4b55bb8, nextp=0x4d19d568) at sp_head.cc:2847
#13 0x000000000089c5db in sp_head::execute (this=0x55825b0, thd=0x2aaac4b55bb8) at sp_head.cc:1241
#14 0x000000000089d374 in sp_head::execute_procedure (this=0x55825b0, thd=0x2aaac4b55bb8, args=0x2aaac4b58090) at sp_head.cc:1972
#15 0x00000000006d2bc9 in mysql_execute_command (thd=0x2aaac4b55bb8) at sql_parse.cc:4131
#16 0x00000000006d446b in mysql_parse (thd=0x2aaac4b55bb8, inBuf=0x5862ca0 "CALL test.insdel2()", length=19, found_semicolon=0x4d19ef20) at sql_parse.cc:5634
#17 0x00000000006d564d in dispatch_command (command=COM_QUERY, thd=0x2aaac4b55bb8, packet=0x2aaac48e26f9 "CALL test.insdel2()", packet_length=19)
    at sql_parse.cc:1009
#18 0x00000000006d6b94 in do_command (thd=0x2aaac4b55bb8) at sql_parse.cc:689
#19 0x00000000006c5293 in handle_one_connection (arg=0x2aaac4b55bb8) at sql_connect.cc:1156
#20 0x0000003ba88062f7 in start_thread () from /lib64/libpthread.so.0
#21 0x0000003ba80ce85d in clone () from /lib64/libc.so.6

How to repeat:
If this is repeatable, a test case will be provided.
[29 Oct 2008 10:35] Philip Stoev
Thread stacks for bug 40391

Attachment: bug40391.threads.txt (text/plain), 252.58 KiB.

[29 Oct 2008 11:02] Philip Stoev
Another occurence of the same bug.

Attachment: bug40391-2.threads.txt (text/plain), 262.46 KiB.

[5 Nov 2008 21:17] Michael Widenius
Fixed in MySQL-maria tree by removing the mutex in trnman_get_min_trid() that caused the hangup.

This is pushed into MySQL-maria tree and will be soon merged to MySQL-6.0
[15 Dec 2008 10:08] Bugs System
Pushed into 6.0.9-alpha  (revid:monty@mysql.com-20081105211709-brhm8zw7jbv0i04g) (version source revid:guilhem@mysql.com-20081213204800-0nubni3t4ihn4hv9) (pib:5)
[8 Jan 2009 10:39] MC Brown
Test case failure. No documentation needed.