Bug #58191 | Crash in lock_remove_all_on_table at /lock/lock0lock.c:3666 | ||
---|---|---|---|
Submitted: | 15 Nov 2010 5:04 | Modified: | 30 Nov 2010 8:39 |
Reporter: | Roel Van de Paar | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.5.7rc | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[15 Nov 2010 5:04]
Roel Van de Paar
[15 Nov 2010 5:41]
Roel Van de Paar
I can reproduce the crash, but the test setup is complex; 9 threads running different RQG tests.
[15 Nov 2010 5:42]
Roel Van de Paar
gdb bt full from crash 1
Attachment: gdb_crash1.txt (text/plain), 99.17 KiB.
[15 Nov 2010 5:43]
Roel Van de Paar
gdb bt full from crash 2
Attachment: gdb_crash2.txt (text/plain), 107.80 KiB.
[15 Nov 2010 5:44]
Roel Van de Paar
gdb bt full from crash 3
Attachment: gdb_crash3.txt (text/plain), 107.19 KiB.
[15 Nov 2010 6:38]
Roel Van de Paar
See bug #52590 | bug #46650
[30 Nov 2010 7:53]
Jimmy Yang
I created a simple repro as following. And it does not seems to be uncommon, except this happens on partition table: 1) create table D ( `pk` INTEGER NOT NULL , col_int_key INTEGER ) engine = innodb; 2) insert into D values (2, 3); insert into D values (3, 3); insert into D values (4, 3); insert into D values (4, 3); <== Note this is a dup 3) CREATE TABLE IF NOT EXISTS t13 ( `pk` INTEGER NOT NULL AUTO_INCREMENT , `col_int` INTEGER , PRIMARY KEY ( `pk` ) ) engine = innodb PARTITION BY HASH ( `pk` ) PARTITIONS 5 SELECT `pk` , `col_int_key` FROM D; The create table will be aborted as we have a dup key (4) when select it into t13 with pk the primary key. The failed assertion asserts that we should release lock in reverse order as we insert them. This is not true if locks belong to different table/partitions: During initial insertion, we push the AUTOINC locks in following stack: row_lock_table_autoinc_for_mysql ha_innobase::innobase_lock_autoinc ha_innobase::innobase_set_max_autoinc ha_innobase::write_row 3596 if (type_mode == LOCK_AUTO_INC) { 3597 3598 lock = table->autoinc_lock; 3599 3600 table->autoinc_trx = trx; 3601 3602 ib_vector_push(trx->autoinc_locks, lock); <== 3603 } else { 3604 lock = mem_heap_alloc(trx->lock_heap, sizeof(lock_t)); 3605 } 3606 3607 UT_LIST_ADD_LAST(trx_locks, trx->trx_locks, lock); <== 3 locks inserted as following (each for a partition): (gdb) p lock $38 = (ib_lock_t *) 0x9031ac8 (gdb) p lock $42 = (ib_lock_t *) 0x90323f0 (gdb) p lock $44 = (ib_lock_t *) 0x9032da0 When we rollback, we free the lock in lock_remove_all_on_table_for_trx(): 4116 lock = UT_LIST_GET_LAST(trx->trx_locks); 4117 4118 while (lock != NULL) { 4119 prev_lock = UT_LIST_GET_PREV(trx_locks, lock); ... 4133 lock_table_remove_low(lock); 4134 } 4135 4136 lock = prev_lock; 4137 } 4138 } It looks like the lock is freed in reverse order, however, please note we free the lock for this particular partition (table): (gdb) p lock->un_member.tab_lock.table $82 = (dict_table_t *) 0x90329f8 (gdb) p table $83 = (dict_table_t *) 0x9033208 (gdb) p lock->un_member.tab_lock.table->name $84 = 0x9032b50 "test/t15#P#p4" (gdb) p table->name $85 = 0x9033360 "test/t15#P#p2" so even though the last lock in trx->autoinc_locks is (ib_lock_t *) 0x9032da0, it belongs to test/t15#P#p4, and we are now freeing locks for table, "test/t15#P#p2". So instead of freeing the last lock in trx->autoinc_locks, we are freeing the first lock in the queue. Thus following assertion in lock_table_remove_low() no longer holds: lock_table_remove_low() { ... 3652 /* The locks must be freed in the reverse order from 3653 the one in which they were acquired. This is to avoid 3654 traversing the AUTOINC lock vector unnecessarily. 3655 3656 We only store locks that were granted in the 3657 trx->autoinc_locks vector (see lock_table_create() 3658 and lock_grant()). Therefore it can be empty and we 3659 need to check for that. */ 3660 3661 if (!lock_get_wait(lock) 3662 && !ib_vector_is_empty(trx->autoinc_locks)) { 3663 lock_t* autoinc_lock; 3664 3665 autoinc_lock = ib_vector_pop(trx->autoinc_locks); 3666 ut_a(autoinc_lock == lock); <<=== 3667 } ...} And it triggers the assertion described in this bug: 101129 23:49:29 InnoDB: Assertion failure in thread 2999352176 in file /home/jy/work/mysql5.5_7/mysql-trunk-innodb/storage/innobase/lock/lock0lock.c line 3666 InnoDB: Failing assertion: autoinc_lock == lock InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be
[30 Nov 2010 8:38]
Jimmy Yang
Confirmed this is dup of #56228, who would handle out of order autoinc lock free. And the test run fine with correct error reported without stacktrace (the original assertion for order is of course removed in 56228) mysql> CREATE TABLE IF NOT EXISTS t16 ( `pk` INTEGER NOT NULL AUTO_INCREMENT , `col_int` INTEGER , PRIMARY KEY ( `pk` ) ) engine = innodb PARTITION BY HASH ( `pk` ) PARTITIONS 5 SELECT `pk` , `col_int_key` FROM D; ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'