Bug #53541 assert que0que.ic:36 thr, possibly dereferencing an unused slot in srv_lock_check_wait
Submitted: 10 May 2010 16:07
Modified: 24 Jun 2010 21:09
Reporter: Mikhail Izioumtchenko
Status: Closed
Category: MySQL Server: InnoDB Plugin storage engine
Severity: S3 (Non-critical)
Version: 1.1.1
OS: Any
Assigned to: Sunny Bains
CPU Architecture: Any

[10 May 2010 16:07] Mikhail Izioumtchenko
Description:
In a stress test with UNIV_DEBUG enabled:

100507 15:29:24  InnoDB: Assertion failure in thread 1202616640 in file include/que0que.ic line 36
InnoDB: Failing assertion: thr
100507 15:29:24 - mysqld got signal 6 ;

 Thread 1 (process 30289):
 #0  0x0000003c1ae0b9b2 in pthread_kill () from /lib64/libpthread.so.0
 #1  0x00000000009f3cab in my_write_core (sig=6)
     at /spare2/mizioumt/ctc/mysql_src_c55/mysys/stacktrace.c:326
 #2  0x0000000000531b55 in handle_segfault (sig=6)
     at /spare2/mizioumt/ctc/mysql_src_c55/sql/mysqld.cc:2786
 #3  <signal handler called>
 #4  0x0000003c1a230265 in raise () from /lib64/libc.so.6
 #5  0x0000003c1a231d10 in abort () from /lib64/libc.so.6
 #6  0x000000000094eb25 in thr_get_trx (thr=0x0)
     at /spare2/mizioumt/ctc/mysql_src_c55/storage/innobase/include/que0que.ic:36
 #7  0x000000000087c611 in srv_lock_check_wait (slot=0x2aaaac2f20d8)
     at /spare2/mizioumt/ctc/mysql_src_c55/storage/innobase/srv/srv0srv.c:2325
 #8  0x000000000087c7fe in srv_lock_timeout_thread (arg=0x0)
     at /spare2/mizioumt/ctc/mysql_src_c55/storage/innobase/srv/srv0srv.c:2393
 #9  0x0000003c1ae064a7 in start_thread () from /lib64/libpthread.so.0
 #10 0x0000003c1a2d3c2d in clone () from /lib64/libc.so.6

(gdb) p *slot
$1 = {id = 1249900864, handle = 1249900864, type = 0, in_use = 0,
  suspended = 1, suspend_time = 1273271311, event = 0x2aab401c5950, thr = 0x0}

The code in srv_lock_check_wait() says:

                        /* It is possible that the thread has already
                        freed its slot and released its locks and another
                        thread is now using this slot. We need to
                        check whether the slot is still in use by the
                        same thread before cancelling the wait and releasing
                        the locks. */

                        mutex_enter(&kernel_mutex);

                        srv_sys_mutex_enter();

                        slot_trx = thr_get_trx(slot->thr);

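I think it doesn't cover the case where the slot is freed but not reused. As the gdb output above shows, the slot has in_use = 0 and thr = 0x0, so thr_get_trx(slot->thr) hits the assertion.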

How to repeat:
Stress test with UNIV_DEBUG; reproducibility is about 3/100.

Suggested fix:
If slot->thr is NULL, treat the slot as no longer used by the same thread.
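
A minimal sketch of the suggested check, not the actual patch. The types below are simplified, hypothetical stand-ins for the InnoDB structures (the real que_thr_t and slot structs have more fields), and slot_still_owned_by() is a helper name invented for this illustration. Only thr_get_trx() and the fields in_use, suspended and thr come from the report above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the InnoDB types; only the
   fields needed for this illustration are modelled. */
typedef struct trx { int id; } trx_t;
typedef struct que_thr { trx_t *trx; } que_thr_t;
typedef struct srv_slot {
    int        in_use;
    int        suspended;
    que_thr_t *thr;
} srv_slot_t;

/* Stands in for the debug assertion at que0que.ic:36:
   the caller must pass a non-NULL thr. */
static trx_t *thr_get_trx(que_thr_t *thr)
{
    assert(thr != NULL);
    return thr->trx;
}

/* Hypothetical helper: returns 1 if the slot still belongs to the
   waiting transaction, 0 if it was freed (thr == NULL) or has been
   taken over by another thread. */
static int slot_still_owned_by(const srv_slot_t *slot, const trx_t *trx)
{
    if (slot->thr == NULL) {
        /* Slot freed but not reused: do not call thr_get_trx(). */
        return 0;
    }
    return thr_get_trx(slot->thr) == trx;
}
```

With a slot in the state shown by gdb (in_use = 0, suspended = 1, thr = 0x0), slot_still_owned_by() returns 0 without reaching the assertion, so the caller would skip cancelling the wait.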
[11 May 2010 3:08] Jimmy Yang
Ok to push.
[15 Jun 2010 8:11] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:marko.makela@oracle.com-20100511104500-c6kzd0bg5s42p8e9) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:27] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:marko.makela@oracle.com-20100511104500-c6kzd0bg5s42p8e9) (pib:16)
[24 Jun 2010 21:09] Sunny Bains
This entire change was reverted from the 5.5 tree.