Bug #57345 | btr_pcur_store_position abort for load with concurrent lock/unlock tables | ||
---|---|---|---|
Submitted: | 8 Oct 2010 20:42 | Modified: | 14 Dec 2010 19:25 |
Reporter: | Victor Kirkebo | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 5.1.52 | OS: | Any |
Assigned to: | Jimmy Yang | CPU Architecture: | Any |
[8 Oct 2010 20:42]
Victor Kirkebo
[8 Oct 2010 20:45]
Victor Kirkebo
script for reproducing the bug
Attachment: bug_test.tar.gz (application/x-gzip, text), 1009 bytes.
[8 Oct 2010 20:47]
Victor Kirkebo
I tested 5.1.51. I did not see the bug for that version.
[9 Oct 2010 5:02]
Sveta Smirnova
Thank you for the report. I can not repeat described behavior with current BZR sources. How long should I run the test?
[10 Oct 2010 11:08]
Victor Kirkebo
I am also able to reproduce this bug with a very simple insert/delete load in concurrency with a lock tables <tblname> write;/unlock tables; load. It also seems to be necessary to use "set autocommit=0" before the lock tables command. I will upload another test script shortly. Unlike the previous test script (for alter table load) this one needs to run with more threads (10 instead of 2) and also the run_load.txt file contains many more entries for the lock tables/unlock table test (10 instead of just 1 before). The two test types that are running concurrently are doing this: test1: ------ use tst; set autocommit=0; lock tables t1 write; unlock tables; test2: ------ use tst; insert into t1(f1) values('a'); delete from t1;
[10 Oct 2010 11:10]
Victor Kirkebo
This test is using an insert/delete load while the table is being locked/unlocked
Attachment: bug57345_test.tar.gz (application/x-gzip, text), 1016 bytes.
[10 Oct 2010 11:13]
Victor Kirkebo
I am changing the synopsis of the bug report to better match the issue at hand.
[10 Oct 2010 11:22]
Victor Kirkebo
How to repeat: 1) Download the attached 'bug57345_test.tar.gz' file and untar it in the mysql-test directory 2) If running in a source tree, edit the run_wait' script and change the value of CLIENT_PATH=$CUR_PATH/bin to CLIENT_PATH=$CUR_PATH/client 3) Run the ./run_bug_test script 4) This should reproduce the bug within a short time; often within a few seconds. However, the test will run for up to 5 minutes if nothing goes wrong - in which case you might consider adding more threads for the load. Do this by editing the line containing "--threads=10" in the ./run_bug_test script.
[11 Oct 2010 6:40]
Jimmy Yang
This does seem to related to bug #56716 fix. When the fix moves lock table code up, it is possible we call "goto lock_wait_or_error" without pcur being positioned and hit assertion failure in btr_pcur_store_position(): if (!prebuilt->sql_stat_start) { .... } else if (prebuilt->select_lock_type == LOCK_NONE) { ... } else { err = lock_table(0, index->table, prebuilt->select_lock_type == LOCK_S ? LOCK_IS : LOCK_IX, thr); if (err != DB_SUCCESS) { goto lock_wait_or_error; <==== } prebuilt->sql_stat_start = FALSE; } } lock_wait_or_error: .. btr_pcur_store_position(); <==== Assertion failure in btr_pcur_store_position(): ut_a(cursor->pos_state == BTR_PCUR_IS_POSITIONED); It is confirmed by following debugger session: Put a break point at above "goto lock_wait_or_error", and run the repro. It immediately hit this breakpoint. And as described, then triggered the assertion failure: Breakpoint 2, row_search_for_mysql (buf=0x8d5dd00 "\377", mode=1, prebuilt=0xb75b0068, match_mode=0, direction=0) at row/row0sel.c:3631 3631 goto lock_wait_or_error; Current language: auto The current source language is "auto; currently c". (gdb) where #0 row_search_for_mysql (buf=0x8d5dd00 "\377", mode=1, prebuilt=0xb75b0068, match_mode=0, direction=0) at row/row0sel.c:3631 #1 0x084539f6 in ha_innobase::index_read (this=0x8d5dbb0, buf=0x8d5dd00 "\377", key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY) at handler/ha_innodb.cc:4725 #2 0x0845425c in ha_innobase::index_first (this=0x8d5dbb0, buf=0x8d5dd00 "\377") at handler/ha_innodb.cc:4989 #3 0x084544b7 in ha_innobase::rnd_next (this=0x8d5dbb0, buf=0x8d5dd00 "\377") at handler/ha_innodb.cc:5086 #4 0x0836c55b in rr_sequential (info=0xb3177ce4) at records.cc:385 #5 0x083988d6 in copy_data_between_tables (from=0x8d360c0, to=0x8d35610, create=..., ignore=false, order_num=0, order=0x0, copied=0xb31782d0, deleted=0xb31782c8, keys_onoff=LEAVE_AS_IS, error_if_not_empty=false) at sql_table.cc:7810 #6 0x08396919 in mysql_alter_table (thd=0x8d5bc18, new_db=0x8d6d708 "tst", new_name=0x8d6d508 "t1", create_info=0xb3179370, table_list=0x8d6d530, alter_info=0xb3179660, order_num=0, order=0x0, ignore=false) at sql_table.cc:7279 #7 0x08248740 in mysql_execute_command (thd=0x8d5bc18) at sql_parse.cc:2968 #8 0x082524b1 in mysql_parse (thd=0x8d5bc18, rawbuf=0x8d6d450 "ALTER TABLE t1 MODIFY f1 CHAR(255) CHARACTER SET latin1", length=55, found_semicolon=0xb317a05c) at sql_parse.cc:6051 #9 0x082442c6 in dispatch_command (command=COM_QUERY, thd=0x8d5bc18, (gdb) p pcur->pos_state $1 = 2 (gdb) n 4400 if (UNIV_UNLIKELY(prebuilt->row_read_type (gdb) 4404 did_semi_consistent_read = FALSE; (gdb) 4408 btr_pcur_store_position(pcur, &mtr); (gdb) p pcur->pos_state $2 = 2 (gdb) step btr_pcur_store_position (cursor=0xb75adb68, mtr=0xb31776f0) at btr/btr0pcur.c:83 83 ut_a(cursor->pos_state == BTR_PCUR_IS_POSITIONED); (gdb) n 101011 9:35:17 InnoDB: Assertion failure in thread 3004672880 in file btr/btr0pcur.c line 83 InnoDB: Failing assertion: cursor->pos_state == BTR_PCUR_IS_POSITIONED InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to
[11 Oct 2010 7:30]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/120469 3624 Jimmy Yang 2010-10-11 Fix Bug #57345 btr_pcur_store_position abort for load with concurrent lock/unlock tables. Fix approved by Marko.
[11 Oct 2010 7:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/120471 3625 Jimmy Yang 2010-10-11 Fix Bug #57345 btr_pcur_store_position abort for load with concurrent lock/unlock tables Fix approved by Marko
[11 Oct 2010 8:04]
Marko Mäkelä
OK to push. Sorry, this was a bug planted by me in the Bug #56716 fix. The function row_search_for_mysql() is really too big to maintain reliably.
[11 Oct 2010 8:35]
Jimmy Yang
Fix checked in mysql-5.1-innodb built-in and zip. Ported the fix to mysql-5.5-innodb.
[11 Oct 2010 12:37]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/120499 3624 Jimmy Yang 2010-10-11 A more complete fix for bug #57345 btr_pcur_store_position abort for load with concurrent lock/unlock tables Approved by Marko
[11 Oct 2010 12:46]
Marko Mäkelä
I approve the complete fix.
[14 Oct 2010 22:04]
Omer Barnir
triage: setting to SR51MRU (just for values to be consistent)
[1 Nov 2010 19:00]
Bugs System
Pushed into mysql-5.1 5.1.53 (revid:build@mysql.com-20101101184443-o2olipi8vkaxzsqk) (version source revid:build@mysql.com-20101101184443-o2olipi8vkaxzsqk) (merge vers: 5.1.53) (pib:21)
[1 Nov 2010 21:01]
John Russell
Added to change log: The server could crash with a high volume of concurrent LOCK TABLE and UNLOCK TABLES statements.
[9 Nov 2010 19:44]
Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (version source revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (merge vers: 5.5.7-rc) (pib:21)
[13 Nov 2010 16:23]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:31]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:alexander.nozdrin@oracle.com-20101113152540-gxro4g0v29l27f5x) (pib:21)
[18 Nov 2010 15:54]
Bugs System
Pushed into mysql-5.1 5.1.54 (revid:build@mysql.com-20101118153531-693taxtxyxpt037i) (version source revid:build@mysql.com-20101118153531-693taxtxyxpt037i) (merge vers: 5.1.54) (pib:21)
[5 Dec 2010 12:38]
Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[16 Dec 2010 22:27]
Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)