Bug #59354 Assert !other_lock at lock_rec_add_to_queue during a delete operation
Submitted: 7 Jan 2011 15:07 Modified: 29 Jan 2013 19:33
Reporter: Sandeep Doddaballapur Email Updates:
Status: Closed Impact on me:
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.5,mysql-trunk OS:Linux
Assigned to: Sunny Bains CPU Architecture:Any
Tags: rqg_pb2

[7 Jan 2011 15:07] Sandeep Doddaballapur
Server Crash found with rqg tests on pb2 running on build 5.6.2-m5 

Test name : rqg_innodb_stress
Build : mysql-5.6.2-m5-linux-i686

Following query reported to cause crash from logs :
DELETE FROM B WHERE `col_varchar_key` < '2007-07-01' LIMIT 6;

Following is the backtrace of the crash 
#0  0x00905410 in __kernel_vsyscall ()
#1  0x00819047 in pthread_kill () from /lib/libpthread.so.0
#2  0x085181fe in my_write_core (sig=6) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/mysys/stacktrace.c:423
#3  0x08166335 in handle_segfault (sig=6) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/mysqld.cc:2511
#4  <signal handler called>
#5  0x00905410 in __kernel_vsyscall ()
#6  0x006c6c10 in raise () from /lib/libc.so.6
#7  0x006c8521 in abort () from /lib/libc.so.6
#8  0x08686443 in lock_rec_add_to_queue (type_mode=1059, block=0xaef42a00, heap_no=221, index=0xaa050f0, trx=0xab68478, caller_owns_trx_mutex=0)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/storage/innobase/lock/lock0lock.c:1887
#9  0x0868fe76 in lock_rec_convert_impl_to_expl (block=0xaef42a00, rec=0xaf5a4be9 "2\200", index=0xaa050f0, offsets=0xa8966224)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/storage/innobase/lock/lock0lock.c:5436
#10 0x086906e4 in lock_sec_rec_read_check_and_lock (flags=0, block=0xaef42a00, rec=0xaf5a4be9 "2\200", index=0xaa050f0, offsets=0xa8966224, mode=LOCK_X, ga
    thr=0xa5129730) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/storage/innobase/lock/lock0lock.c:5638
#11 0x085775f0 in sel_set_rec_lock (block=0xaef42a00, rec=0xaf5a4be9 "2\200", index=0xaa050f0, offsets=0xa8966224, mode=3, type=0, thr=0xa5129730)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/storage/innobase/row/row0sel.c:997
#12 0x0857c48f in row_search_for_mysql (buf=0xaa8edc8 "<FF><FF>", mode=1, prebuilt=0xaa8e890, match_mode=0, direction=0)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/storage/innobase/row/row0sel.c:4310
#13 0x08546a8b in ha_innobase::index_read (this=0xaa8ec08, buf=0xaa8edc8 "<FF><FF>", key_ptr=0xaaae240 "\001", key_len=4, find_flag=HA_READ_AFTER_KEY)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/storage/innobase/handler/ha_innodb.cc:6113
#14 0x0835bb3c in handler::index_read_map (this=0xaa8ec08, buf=0xaa8edc8 "<FF><FF>", key=0xaaae240 "\001", keypart_map=1, find_flag=HA_READ_AFTER_KEY)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/handler.h:1806
#15 0x08350fa1 in handler::ha_index_read_map (this=0xaa8ec08, buf=0xaa8edc8 "<FF><FF>", key=0xaaae240 "\001", keypart_map=1, find_flag=HA_READ_AFTER_KEY)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/handler.cc:2251
#16 0x0835349f in handler::read_range_first (this=0xaa8ec08, start_key=0xaa8eca8, end_key=0xaa8ecb8, eq_range_arg=false, sorted=true)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/handler.cc:5350
#17 0x083504e7 in handler::multi_range_read_next (this=0xaa8ec08, range_info=0xa8966aec) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5
#18 0x08354587 in DsMrr_impl::dsmrr_next (this=0xaa8eda4, range_info=0xa8966aec) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/han
#19 0x08538f7c in ha_innobase::multi_range_read_next (this=0xaa8ec08, range_info=0xa8966aec)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/storage/innobase/handler/ha_innodb.cc:12526
#20 0x08417f07 in QUICK_RANGE_SELECT::get_next (this=0xaaa6d90) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/opt_range.cc:8908
#21 0x084333cb in rr_quick (info=0xa8966f68) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/records.cc:347
#22 0x08459f1a in mysql_delete (thd=0xaa98840, table_list=0xab26a80, conds=0xab26fa8, order_list=0xaa9a070, limit=6, options=0)
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/sql_delete.cc:299
#23 0x081f112f in mysql_execute_command (thd=0xaa98840) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/sql_parse.cc:2947
#24 0x081f62c2 in mysql_parse (thd=0xaa98840, rawbuf=0xab26990 "DELETE FROM B WHERE `col_varchar_key` < '2007-07-01' LIMIT 6", length=60, parser_state=0xa8
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/sql_parse.cc:5550
#25 0x081f7729 in dispatch_command (command=COM_QUERY, thd=0xaa98840, packet=0xab32979 "DELETE FROM B WHERE `col_varchar_key` < '2007-07-01' LIMIT 6", pack
    at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/sql_parse.cc:1078
#26 0x081f8ad9 in do_command (thd=0xaa98840) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/sql_parse.cc:815
#27 0x082c38d0 in do_handle_one_connection (thd_arg=0xaa98840) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/sql_connect.cc:748
#28 0x082c39c5 in handle_one_connection (arg=0xaa98840) at /export/home/pb2/build/sb_0-2744438-1294315679.8/mysql-5.6.2-m5/sql/sql_connect.cc:684
#29 0x0081443b in start_thread () from /lib/libpthread.so.0
#30 0x0076bfde in clone () from /lib/libc.so.6

The RQG options from the pushbuild 

runall-new.pl \ 
--mysqld=--loose-skip-safemalloc \ 
--grammar=conf/engines/maria/maria_stress.yy \ 
--reporters=Deadlock,ErrorLog,Backtrace \ 
--duration=600 \ 
--basedir=/export/home/pb2/test/sb_1-2744546-1294319376.11/mysql-5.6.2-m5-linux-i686-test \ 
--vardir=/export/home/pb2/test/sb_1-2744546-1294319376.11/mysql-5.6.2-m5-linux-i686-test/vardirs \ 
--mysqld=--log-output=file \ 
--mysqld=--loose-table-lock-wait-timeout=1 \ 
--mysqld=--loose-lock-wait-timeout=1 \ 
--mysqld=--loose-innodb-lock-wait-timeout=1 \ 
--queries=100000 \ 
--engine=innodb \ 
--mysqld=--innodb \ 
--testname=rqg_innodb_stress \ 
--xml-output=/export/home/pb2/test/sb_1-2744546-1294319376.11/tmp/rqg_innodb_stress.xml \ 
--report-xml-tt \ 
--report-xml-tt-type=scp \ 

How to repeat:
Currently working on the reproducible test case.Will attach it soon.
[8 Jan 2011 10:58] Jon Olav Hauglid
Looks like this belongs to InnoDB.
[11 Jan 2011 22:21] Omer Barnir
triage: Is this a 'sporadic' crash or one that is always observed?
[12 Jan 2011 15:00] Sandeep Doddaballapur
The rqg_innodb_stress test failed on mysql-trunk. Also Currently the tests are not failing , the last failed rqg_inndob_stress test can be observed at below pushbuild link for a push on date 2011-01-06 12:44:03 :

[17 Feb 2011 11:55] Marko Mäkelä
When lock_rec_convert_impl_to_expl() is being invoked in lock_sec_rec_read_check_and_lock(), we are holding neither trx_sys->lock nor lock_sys->mutex. This looks a little suspicious and might explain why this bug was never observed before the kernel_mutex split.

	if ((page_get_max_trx_id(block->frame) >= trx_list_get_min_trx_id()
	     || recv_recovery_is_on())
	    && !page_rec_is_supremum(rec)) {

		lock_rec_convert_impl_to_expl(block, rec, index, offsets);

The old code holds kernel_mutex here.
[17 Feb 2011 12:21] Sunny Bains
trx_list_get_min_trx_id() can't be empty, that is guaranteed because of the active transaction that is calling the function. Therefore it should never return max_trx_id in this function. Transactions are always added to the head. Min is read from the tail.

All that the kernel mutex protected was the addition of new transactions. I can't see how that could be

Anything else?
[17 Feb 2011 12:37] Marko Mäkelä
I am not sure that this is a bug caused by the kernel_mutex split or other improvements in trunk, but that is a working hypothesis. Why have we not seen this in 5.5 or earlier versions?

To increase test coverage, we could enable this particular lock_rec_other_has_expl_req() check in non-debug builds, in both 5.5 and trunk:

=== modified file 'storage/innobase/lock/lock0lock.c'
--- storage/innobase/lock/lock0lock.c	revid:marko.makela@oracle.com-20110217075139-1b5fzlivmjrxrjfx
+++ storage/innobase/lock/lock0lock.c	2011-02-17 12:26:28 +0000
@@ -1485,7 +1485,7 @@ lock_rec_has_expl(
-#ifdef UNIV_DEBUG
+#if 1/*def UNIV_DEBUG*/
 Checks if some other transaction has a lock request in the queue.
 @return	lock or NULL */
@@ -1924,7 +1924,7 @@ lock_rec_add_to_queue(
 	ut_ad(caller_owns_trx_mutex == trx_mutex_own(trx));
-#ifdef UNIV_DEBUG
+#if 1/*def UNIV_DEBUG*/
 	switch (type_mode & LOCK_MODE_MASK) {
 	case LOCK_X:
 	case LOCK_S:
[10 Mar 2011 14:23] Mikhail Izioumtchenko
the answer to whether it needs compression or file per table is I don't think
so, based on the fact that it was originally reproduced with runall-new.pl
which presumably would use the default for file per table. As for compression,
unless the grammar uses KEY_BLOCK_SIZE there's no compression.
The table status I posted shows COMPACT format.
Note how the failing statements are similar:

DELETE FROM X WHERE key_or_key_part_column < date_or_time_constant LIMIT N

maybe if you could figure out all threads working on the table with their SQL,
a small testcase could be constructed.
[30 Apr 2012 10:55] MySQL Verification Team
testcase. crashes 5.5 and 5.6 within seconds/minutes.

Attachment: bug59354.c (text/plain), 14.32 KiB.

[30 Apr 2012 11:10] MySQL Verification Team
Not a recent regression.  Affects debug builds. Testing some minor versions, results:

5.6.6         : crashes
5.5.24        : crashes
5.5.23        : crashes
5.5.20        : crashes
5.5.10        : crashes
5.5.1         : crashes
[30 Apr 2012 11:25] MySQL Verification Team
To answer some previous questions.

1.  5.1.62 + plugin doesn't crash after > 1 million queries.
2.  5.5.23 crashes with compact (innodb_file_per_table=0) and compressed tables.

the testcase uses very small tables and hit thousands of deadlocks per minute.
[10 Sep 2012 14:22] Marko Mäkelä
MySQL 5.1 does not contain this assertion in the built-in InnoDB. InnoDB Plugin contains it, but it must be built with UNIV_DEBUG defined (it is not defined by the normal build scripts, not even for debug builds).

I do not see any reason why this bug would be a recent regression.
[10 Sep 2012 14:36] MySQL Verification Team
the above .c testcase hits 5.7.0 also.

Version: '5.7.0-m10-debug'  
InnoDB: Assertion failure in thread 3828 in file lock0lock.cc line 1986
InnoDB: Failing assertion: !other_lock
[10 Sep 2012 14:37] MySQL Verification Team
run testcase with mysqld options:
--innodb-flush-log-at-trx-commit=0 --innodb-lock-wait-timeout=1 
--lock-wait-timeout=1 --innodb-doublewrite=0 --innodb-support-xa=0
[29 Jan 2013 7:32] Roel Van de Paar
mysqld error log

Attachment: master_MS_bug_59354.err (application/octet-stream, text), 42.45 KiB.

[29 Jan 2013 7:33] Roel Van de Paar
thread ally all bt on core

Attachment: gdb_MS_bug_59354.txt (text/plain), 126.31 KiB.

[29 Jan 2013 7:33] Roel Van de Paar
Issue seen in Percona server as well. Thread apply all bt + error log attached above.
[29 Jan 2013 19:33] John Russell
Error condition was located through low-level debugging, so don't have user-level details about how to reproduce.
[1 Feb 2013 6:45] MySQL Verification Team
this is easy to repeat, but it's already fixed in unreleased versions.
Pushed into mysql-5.5 5.5.30 
Pushed into mysql-5.6 5.6.10 
Pushed into mysql-trunk 5.7.1-m11