MySQL Bugs: #43304: falcon_limit fails due to deadlock with blocked Maria threads

Bug #43304	falcon_limit fails due to deadlock with blocked Maria threads
Submitted:	2 Mar 2009 10:38	Modified:	2 Mar 2009 14:51
Reporter:	Olav Sandstå	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	MySQL Server: Maria storage engine	Severity:	S3 (Non-critical)
Version:	6.0.10-alpha	OS:	Solaris
Assigned to:		CPU Architecture:	Any

Description:
The RQG test falcon_limit fails often when running it on Solaris 10 on SPARC. The error message is:

3 stalled queries detected, declaring deadlock at DSN dbi:mysql:host=127.0.0.1:port=19306:user=root:database=test.

This suspect that there are stalled threads either due to deadlock or spinning threads.

In many of the deadlock situation the only threads that I can see is actually locking something when the test ends are Maria threads. Examples of some excerpts from call stacks from one of these tests:

#0  0x00a2fa2c in _db_keyword_ (cs=0x179b1b50, keyword=0xd9cf08 "mutex", 
    strict=65538) at dbug.c:1829
#1  0x00a2e840 in _db_doprnt_ (format=0xd9cf18 "%s (0x%lx) locking")
    at dbug.c:1342
#2  0x00a23614 in safe_mutex_lock (mp=0x18bc060, my_flags=0, 
    file=0xd855e0 "ma_pagecache.c", line=2949) at thr_mutex.c:171
#3  0x008ff88c in pagecache_unlock_by_link (pagecache=0x18bbff8, 
    block=0x15692830, lock=PAGECACHE_LOCK_READ_UNLOCK, pin=PAGECACHE_UNPIN, 
    first_REDO_LSN_for_page=0, lsn=0, was_changed=1 '\001', any=0 '\0')
    at ma_pagecache.c:2949
#4  0x00934efc in _ma_unpin_all_pages (info=0x17e1f958, undo_lsn=0)
    at ma_key_recover.c:70
#5  0x009574a0 in write_block_record (info=0x17e1f958, old_record=0x0, 
    record=0x17c969a8 "\001�[\001", row=0x17e1f9a8, bitmap_blocks=0x17e1f9a8, 
    head_block_is_read=1 '\001', row_pos=0xfd62fff0, undo_lsn=1, 
    old_record_checksum=0) at ma_blockrec.c:3280
#6  0x00957a5c in allocate_and_write_block_record (info=0x17e1f958, 
    record=0x17c969a8 "\001�[\001", row=0x17e1f9a8, undo_lsn=1)
    at ma_blockrec.c:3408
#7  0x00957ca8 in _ma_write_init_block_record (info=0x17e1f958, 
    record=0x17c969a8 "\001�[\001") at ma_blockrec.c:3451
#8  0x0096c27c in maria_write (info=0x17e1f958, record=0x17c969a8 "\001�[\001")
    at ma_write.c:152
#9  0x009095c8 in ha_maria::write_row (this=0x17c96ac8, 
    buf=0x17c969a8 "\001�[\001") at ha_maria.cc:925
#10 0x00460ca8 in handler::ha_write_row (this=0x17c96ac8, 
    buf=0x17c969a8 "\001�[\001") at handler.cc:5500

---------------------------------------------------------------

#0  my_thread_var_dbug () at my_thr_init.c:433
#1  0x00a2b18c in code_state () at dbug.c:393
#2  0x00a2e1cc in _db_enter_ (_func_=0xd8b5b0 "_ma_mark_file_changed", 
    _file_=0xd8b480 "ma_locking.c", _line_=367, _stack_frame_=0xfd66f654)
    at dbug.c:1149
#3  0x0093ca08 in _ma_mark_file_changed (info=0x17dcd960) at ma_locking.c:367
#4  0x0096c140 in maria_write (info=0x17dcd960, record=0x177ce420 "�\001d")
    at ma_write.c:129
#5  0x009095c8 in ha_maria::write_row (this=0x177ce9d8, 
    buf=0x177ce420 "�\001d") at ha_maria.cc:925
#6  0x00460ca8 in handler::ha_write_row (this=0x177ce9d8, 
    buf=0x177ce420 "�\001d") at handler.cc:5500
#7  0x00361dd8 in create_internal_tmp_table_from_heap2 (thd=0x17d2b988, 
    table=0x17bbb978, start_recinfo=0x17bbc320, recinfo=0x17dcafe4, error=135, 
    ignore_last_dupp_key_error=true, hton=0x15560ec0, 
    proc_info=0xd029a0 "converting HEAP to Maria") at sql_select.cc:15559

-------------------------------------------------------------------------

#0  0xfef40094 in __lwp_park () from /lib/libc.so.1
#1  0xfef385ec in mutex_lock_queue () from /lib/libc.so.1
#2  0x00a237d0 in safe_mutex_lock (mp=0x18bc060, my_flags=0, 
    file=0xd855e0 "ma_pagecache.c", line=2949) at thr_mutex.c:226
#3  0x008ff88c in pagecache_unlock_by_link (pagecache=0x18bbff8, 
    block=0x156a55a0, lock=PAGECACHE_LOCK_WRITE_TO_READ, 
    pin=PAGECACHE_PIN_LEFT_PINNED, first_REDO_LSN_for_page=0, lsn=0, 
    was_changed=1 '\001', any=0 '\0') at ma_pagecache.c:2949
#4  0x009568ac in write_block_record (info=0x17fd7960, old_record=0x0, 
    record=0x17c0a1e0 "\001�y\001", row=0x17fd79b0, bitmap_blocks=0x17fd79b0, 
    head_block_is_read=1 '\001', row_pos=0xfd72fff0, undo_lsn=1, 
    old_record_checksum=0) at ma_blockrec.c:3038
#5  0x00957a5c in allocate_and_write_block_record (info=0x17fd7960, 
    record=0x17c0a1e0 "\001�y\001", row=0x17fd79b0, undo_lsn=1)
    at ma_blockrec.c:3408
#6  0x00957ca8 in _ma_write_init_block_record (info=0x17fd7960, 
    record=0x17c0a1e0 "\001�y\001") at ma_blockrec.c:3451
#7  0x0096c27c in maria_write (info=0x17fd7960, record=0x17c0a1e0 "\001�y\001")
    at ma_write.c:152
#8  0x009095c8 in ha_maria::write_row (this=0x17c0a300, 
    buf=0x17c0a1e0 "\001�y\001") at ha_maria.cc:925

How to repeat:
Run falcon_limit on Solaris on SPARC?

Suggested fix:
Queries should not be stalled for 4 minutes.

I looked at each thread in the posted stack trace link. There are indeed some threads which are inside Maria code, because of internally generated temporary tables (generated by GROUP BY or ORDER BY) are created of type Maria in 6.0:
for example, thread 32 builds an internal Maria table to execute this query:
SELECT * FROM E AS X LEFT JOIN E AS Y ON ( X . `varchar_key` = Y . `datetime_key` )  ORDER BY X . `date_key` LIMIT 8
But it's not clear if such thread is stalled.
Another example:
Thread 32 (process 2645055    ):
#0  0x00a2fa2c in _db_keyword_ (cs=0x179b1b50, keyword=0xd9cf08 "mutex", 
    strict=65538) at dbug.c:1829
#1  0x00a2e840 in _db_doprnt_ (format=0xd9cf18 "%s (0x%lx) locking")
    at dbug.c:1342
#2  0x00a23614 in safe_mutex_lock (mp=0x18bc060, my_flags=0, 
    file=0xd855e0 "ma_pagecache.c", line=2949) at thr_mutex.c:171
this is a DBUG_PRINT(), can never stall.
Note that many threads have their trace ending with:
"Backtrace stopped: previous frame identical to this frame (corrupt stack?)"
which is worrying.
I cannot see that the cause of the stall is in Maria. It could be elsewhere, for example look at this: there are 4 threads which are in:
copy_fields (param=0x17f94fb8) at sql_select.cc:20371
this place of code is a for() loop; this may be an infinite loop there.
I suggest two ways to debug this:
- change the Random Query Generator so that it tells the thread ids of the stalled threads
- build with --without-maria, and re-run; this will cause MyISAM to be used for internally generated tables (GROUP BY, ORDER BY) (for example: in 6.0-falcon, disable Maria in builds, see if it helps; if yes, contact the Maria team again; if not, re-enable Maria :)

Need feedback (more debug info or tests) from Falcon team

Guilhem,

Thanks for your feedback. I did not realise that --without-maria can be used to convert the temporary tables back to MyISAM. I have now requested from the build team that the Falcon RQG tests are run on a binary produced using --without-maria.

I will make the RQG dump better output for this "deadlock" situation.

After running this manually it appears that there is no bug, it is just slow due to bad Sparc T2 performance and big joins with no index. To alleviate the issue , I have done the following for PB2:

* Run the test with joining a small to a big table, rather than two big tables

* Start the test with --loose-maria-pagecache-buffer-size=64M

* Request that Maria is not compiled in at all for Falcon tests

It is likely though that Maria performance is lower than MyISAM's , however this is outside of the scope of this bug.