| Bug #58191 | Crash in lock_remove_all_on_table at /lock/lock0lock.c:3666 | ||
|---|---|---|---|
| Submitted: | 15 Nov 2010 5:04 | Modified: | 30 Nov 2010 8:39 |
| Reporter: | Roel Van de Paar | Email Updates: | |
| Status: | Duplicate | Impact on me: | |
| Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
| Version: | 5.5.7rc | OS: | Any |
| Assigned to: | Assigned Account | CPU Architecture: | Any |
[15 Nov 2010 5:04]
Roel Van de Paar
[15 Nov 2010 5:41]
Roel Van de Paar
I can reproduce the crash, but the test setup is complex; 9 threads running different RQG tests.
[15 Nov 2010 5:42]
Roel Van de Paar
gdb bt full from crash 1
Attachment: gdb_crash1.txt (text/plain), 99.17 KiB.
[15 Nov 2010 5:43]
Roel Van de Paar
gdb bt full from crash 2
Attachment: gdb_crash2.txt (text/plain), 107.80 KiB.
[15 Nov 2010 5:44]
Roel Van de Paar
gdb bt full from crash 3
Attachment: gdb_crash3.txt (text/plain), 107.19 KiB.
[15 Nov 2010 6:38]
Roel Van de Paar
See bug #52590 | bug #46650
[30 Nov 2010 7:53]
Jimmy Yang
I created a simple repro, shown below. The scenario does not seem to be uncommon, except that it happens on a partitioned table:
1) create table D ( `pk` INTEGER NOT NULL , col_int_key INTEGER ) engine = innodb;
2) insert into D values (2, 3);
insert into D values (3, 3);
insert into D values (4, 3);
insert into D values (4, 3); <== Note this is a dup
3) CREATE TABLE IF NOT EXISTS t13 ( `pk` INTEGER NOT NULL AUTO_INCREMENT , `col_int` INTEGER , PRIMARY KEY ( `pk` ) ) engine = innodb PARTITION BY HASH ( `pk` ) PARTITIONS 5 SELECT `pk` , `col_int_key` FROM D;
The CREATE TABLE is aborted because the SELECT feeds a duplicate key (4) into t13, where pk is the primary key.
The failed assertion requires that we release locks in the reverse of the order in which we acquired them. This does not hold when the locks belong to different tables/partitions:
During the initial insertion, we push the AUTOINC locks via the following call stack:
row_lock_table_autoinc_for_mysql
ha_innobase::innobase_lock_autoinc
ha_innobase::innobase_set_max_autoinc
ha_innobase::write_row
3596 if (type_mode == LOCK_AUTO_INC) {
3597
3598 lock = table->autoinc_lock;
3599
3600 table->autoinc_trx = trx;
3601
3602 ib_vector_push(trx->autoinc_locks, lock); <==
3603 } else {
3604 lock = mem_heap_alloc(trx->lock_heap, sizeof(lock_t));
3605 }
3606
3607 UT_LIST_ADD_LAST(trx_locks, trx->trx_locks, lock); <==
Three locks are inserted as follows (one per partition):
(gdb) p lock
$38 = (ib_lock_t *) 0x9031ac8
(gdb) p lock
$42 = (ib_lock_t *) 0x90323f0
(gdb) p lock
$44 = (ib_lock_t *) 0x9032da0
When we roll back, we free the locks in lock_remove_all_on_table_for_trx():
4116 lock = UT_LIST_GET_LAST(trx->trx_locks);
4117
4118 while (lock != NULL) {
4119 prev_lock = UT_LIST_GET_PREV(trx_locks, lock);
...
4133 lock_table_remove_low(lock);
4134 }
4135
4136 lock = prev_lock;
4137 }
4138 }
It looks as if the locks are freed in reverse order. Note, however, that each call frees only the locks for one particular partition (table):
(gdb) p lock->un_member.tab_lock.table
$82 = (dict_table_t *) 0x90329f8
(gdb) p table
$83 = (dict_table_t *) 0x9033208
(gdb) p lock->un_member.tab_lock.table->name
$84 = 0x9032b50 "test/t15#P#p4"
(gdb) p table->name
$85 = 0x9033360 "test/t15#P#p2"
So even though the last lock in trx->autoinc_locks is (ib_lock_t *) 0x9032da0, it belongs to test/t15#P#p4, while we are now freeing the locks for table "test/t15#P#p2". Instead of freeing the last lock in trx->autoinc_locks, we are freeing the first lock in the queue.
Thus the following assertion in lock_table_remove_low() no longer holds:
lock_table_remove_low()
{
...
3652 /* The locks must be freed in the reverse order from
3653 the one in which they were acquired. This is to avoid
3654 traversing the AUTOINC lock vector unnecessarily.
3655
3656 We only store locks that were granted in the
3657 trx->autoinc_locks vector (see lock_table_create()
3658 and lock_grant()). Therefore it can be empty and we
3659 need to check for that. */
3660
3661 if (!lock_get_wait(lock)
3662 && !ib_vector_is_empty(trx->autoinc_locks)) {
3663 lock_t* autoinc_lock;
3664
3665 autoinc_lock = ib_vector_pop(trx->autoinc_locks);
3666 ut_a(autoinc_lock == lock); <<===
3667 }
...}
And it triggers the assertion described in this bug:
101129 23:49:29 InnoDB: Assertion failure in thread 2999352176 in file /home/jy/work/mysql5.5_7/mysql-trunk-innodb/storage/innobase/lock/lock0lock.c line 3666
InnoDB: Failing assertion: autoinc_lock == lock
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
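The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not InnoDB code: demo_lock_t and demo_vector_t are hypothetical stand-ins for lock_t and trx->autoinc_locks, and the partition name p3 is an assumption (only p2 and p4 appear in the gdb output). It shows that popping the vector while releasing the locks of the first partition returns the lock of the last partition, so a check equivalent to ut_a(autoinc_lock == lock) fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for lock_t: only remembers its table. */
typedef struct {
    const char *table_name;
} demo_lock_t;

/* Hypothetical stand-in for trx->autoinc_locks (a small LIFO vector). */
typedef struct {
    demo_lock_t *items[8];
    size_t       used;
} demo_vector_t;

static void demo_push(demo_vector_t *v, demo_lock_t *lock)
{
    assert(v->used < 8);
    v->items[v->used++] = lock;
}

static demo_lock_t *demo_pop(demo_vector_t *v)
{
    assert(v->used > 0);
    return v->items[--v->used];
}

/* Mirrors the failing check: pop the vector and compare the result
   with the lock being released. Returns 1 on match, 0 on mismatch. */
static int demo_pop_matches(demo_vector_t *v, const demo_lock_t *lock)
{
    return demo_pop(v) == lock;
}

/* Acquire AUTOINC locks for three partitions, then release the lock of
   the FIRST partition first, as a per-table cleanup would.
   Returns the result of the ordering check (0 = it would assert). */
static int demo_release_first_partition_first(void)
{
    demo_lock_t p2 = { "test/t15#P#p2" };
    demo_lock_t p3 = { "test/t15#P#p3" };  /* assumed name */
    demo_lock_t p4 = { "test/t15#P#p4" };
    demo_vector_t v = { { NULL }, 0 };

    demo_push(&v, &p2);
    demo_push(&v, &p3);
    demo_push(&v, &p4);

    return demo_pop_matches(&v, &p2);  /* pops p4, not p2 */
}

/* Same setup, but releasing in strict reverse order of acquisition.
   Returns 1 only if every check passes. */
static int demo_release_in_reverse_order(void)
{
    demo_lock_t p2 = { "test/t15#P#p2" };
    demo_lock_t p3 = { "test/t15#P#p3" };  /* assumed name */
    demo_lock_t p4 = { "test/t15#P#p4" };
    demo_vector_t v = { { NULL }, 0 };

    demo_push(&v, &p2);
    demo_push(&v, &p3);
    demo_push(&v, &p4);

    return demo_pop_matches(&v, &p4)
        && demo_pop_matches(&v, &p3)
        && demo_pop_matches(&v, &p2);
}
```

In this sketch the strict reverse order passes the check, while releasing the first partition's lock first does not, which corresponds to the situation seen in the gdb session.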
[30 Nov 2010 8:38]
Jimmy Yang
Confirmed that this is a duplicate of bug #56228, whose fix handles out-of-order freeing of AUTOINC locks. With that fix the test runs fine and reports the correct error without a stack trace (the original ordering assertion is, of course, removed in #56228):
mysql> CREATE TABLE IF NOT EXISTS t16 ( `pk` INTEGER NOT NULL AUTO_INCREMENT , `col_int` INTEGER , PRIMARY KEY ( `pk` ) ) engine = innodb PARTITION BY HASH ( `pk` ) PARTITIONS 5 SELECT `pk` , `col_int_key` FROM D;
ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'
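For illustration only, one way to tolerate out-of-order release is to search the vector for the lock instead of assuming it is on top. This is a sketch under that assumption and is not the actual patch from bug #56228; demo_lock_t, demo_vector_t and demo_remove are hypothetical names, and the partition name p3 is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for lock_t. */
typedef struct {
    const char *table_name;
} demo_lock_t;

/* Hypothetical stand-in for trx->autoinc_locks. */
typedef struct {
    demo_lock_t *items[8];
    size_t       used;
} demo_vector_t;

static void demo_push(demo_vector_t *v, demo_lock_t *lock)
{
    assert(v->used < 8);
    v->items[v->used++] = lock;
}

/* Remove `lock` wherever it sits in the vector. The scan starts at the
   end, so the common reverse-order release is found on the first step.
   Returns 1 if the lock was found and removed, 0 otherwise. */
static int demo_remove(demo_vector_t *v, const demo_lock_t *lock)
{
    size_t i = v->used;

    while (i-- > 0) {
        if (v->items[i] == lock) {
            /* Close the gap left by the removed entry. */
            memmove(&v->items[i], &v->items[i + 1],
                    (v->used - i - 1) * sizeof v->items[0]);
            v->used--;
            return 1;
        }
    }
    return 0;
}

/* Release the three partition locks in a non-LIFO order.
   Returns the number of locks left in the vector, or -1 if any
   removal failed to find its lock. */
static int demo_release_out_of_order(void)
{
    demo_lock_t p2 = { "test/t15#P#p2" };
    demo_lock_t p3 = { "test/t15#P#p3" };  /* assumed name */
    demo_lock_t p4 = { "test/t15#P#p4" };
    demo_vector_t v = { { NULL }, 0 };

    demo_push(&v, &p2);
    demo_push(&v, &p3);
    demo_push(&v, &p4);

    if (!demo_remove(&v, &p2)      /* first acquired, released first */
        || !demo_remove(&v, &p4)
        || !demo_remove(&v, &p3)) {
        return -1;
    }
    return (int) v.used;
}

/* Removing a lock that was never pushed must report "not found".
   Returns 1 if that is the case and the vector is left unchanged. */
static int demo_remove_missing_is_noop(void)
{
    demo_lock_t p2 = { "test/t15#P#p2" };
    demo_lock_t other = { "test/other" };  /* never pushed */
    demo_vector_t v = { { NULL }, 0 };

    demo_push(&v, &p2);
    return demo_remove(&v, &other) == 0 && v.used == 1;
}
```

The trade-off in this sketch is a linear scan in the worst case, in exchange for no longer depending on the release order across partitions.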
