Bug #29644 | alter table hangs if records locked in share mode by long running transaction | ||
---|---|---|---|
Submitted: | 9 Jul 2007 10:39 | Modified: | 18 Jun 2010 1:01 |
Reporter: | Shane Bester (Platinum Quality Contributor) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | >=5.0.24 | OS: | Windows (XP) |
Assigned to: | Georgi Kodinov | CPU Architecture: | Any |
Tags: | bfsm_2007_07_19 |
[9 Jul 2007 10:39]
Shane Bester
[9 Jul 2007 13:25]
Heikki Tuuri
Shane, what is the stack trace of the second SELECT query? My first guess is that the ALTER sets X-locks on the rows it reads from the table. The question is why it does this? Then the ALTER stops waiting for the SELECT to release its S-locks. I will check inside gdb what happens in an ALTER. --Heikki
[9 Jul 2007 13:34]
MySQL Verification Team
Heikki, I uploaded more info in private comment. I'll upload a .c file later that will reproduce this easily in 1 minute of running.
[9 Jul 2007 13:37]
MySQL Verification Team
Regarding the uploaded info, I have explicitly set innodb lock wait timeout to a much higher value than normal, in order to force the hangup to wait longer before the select times out.
[9 Jul 2007 13:48]
Heikki Tuuri
Shane, ok, the ALTER does set X-locks on the rows of read table! There is no need for that, S-locks would be enough. The SELECTs on the other hand are waiting for a MySQL table READ lock on the read table. Why do not they get it? Maybe the ALTER sets a MySQL WRITE lock on the read table, though a READ lock would be enough? --Heikki
[9 Jul 2007 17:11]
Heikki Tuuri
MySQL does take a WRITE lock on the read table in an ALTER: #0 ha_innobase::store_lock (this=0x1608320, thd=0x15c94b8, to=0x16089a0, lock_type=TL_WRITE_ALLOW_READ) at ha_innodb.cc:6681 #1 0x0000000000607414 in get_lock_data (thd=0x15c94b8, table_ptr=0x1609248, count=1, flags=2, write_lock_used=0x44086978) at lock.cc:720 #2 0x000000000060807a in mysql_lock_tables (thd=0x15c94b8, tables=0x1609248, count=1, flags=0, need_reopen=0x440869e7) at lock.cc:125 #3 0x000000000066506e in open_ltable (thd=0x15c94b8, table_list=0x16091b0, lock_type=TL_WRITE_ALLOW_READ) at sql_base.cc:2953 #4 0x0000000000750f77 in mysql_alter_table (thd=0x15c94b8, new_db=0x1609468 "test", new_name=0x0, create_info=0x44087be0, table_list=0x16091b0, alter_info=0x440882a0, order_num=0, order=0x0, ignore=false) at sql_table.cc:3138 #5 0x000000000062a714 in mysql_execute_command (thd=0x15c94b8) at sql_parse.cc:3256 #6 0x0000000000630d47 in mysql_parse (thd=0x15c94b8, inBuf=0x16090e8 "alter table t engine = innodb", length=29, found_semicolon=0x44088f10) at sql_parse.cc:6052 #7 0x00000000006336ae in dispatch_command (command=COM_QUERY, thd=0x15c94b8, packet=0x1601059 "alter table t engine = innodb", packet_length=30) at sql_parse.cc:1806 #8 0x0000000000634e3b in do_command (thd=0x15c94b8) at sql_parse.cc:1583 #9 0x00000000006361e1 in handle_one_connection (arg=0x15c94b8) at sql_parse.cc:1194 #10 0x00002ab31ff55f1a in start_thread () from /lib/libpthread.so.0 #11 0x00002ab3204fc602 in clone () from /lib/libc.so.6 #12 0x0000000000000000 in ?? () (gdb) (gdb) print *this->table->s $3 = {name_hash = {key_offset = 0, key_length = 0, records = 0, blength = 0, flags = 0, array = {buffer = 0x0, elements = 0, max_element = 0, alloc_increment = 0, size_of_element = 0}, get_key = 0, free = 0, charset = 0x0}, mem_root = {free = 0x0, used = 0x0, pre_alloc = 0x0, min_malloc = 0, block_size = 0, block_num = 0, first_block_usage = 0, error_handler = 0}, keynames = {count = 0, name = 0x0, type_names = 0x1608508, type_lengths = 0x0}, fieldnames = {count = 1, name = 0x0, type_names = 0x16084f8, type_lengths = 0x0}, intervals = 0x0, blob_field = 0x0, default_values = 0x16084c8 "ΓΏ", comment = { str = 0x16084e0 "", length = 0}, table_charset = 0xe4f080, table_cache_key = 0x16085b0 "test", db = 0x16085b0 "test", table_name = 0x16082f8 "t", path = 0x1608300 "./test/t", connect_string = { str = 0x1608318 "", length = 0}, keys_in_use = {map = 0}, keys_for_keyread = {map = 0}, avg_row_length = 0, raid_chunksize = 0, version = 1, flush_version = 1, mysql_version = 50044, timestamp_offset = 0, reclength = 5, min_rows = 0, max_rows = 0, db_type = DB_TYPE_INNODB, row_type = ROW_TYPE_DEFAULT, tmp_table = NO_TMP_TABLE, blob_ptr_size = 8, null_bytes = 1, last_null_bit_pos = 1, key_length = 7, fields = 1, rec_buff_length = 8, keys = 0, key_parts = 0, max_key_length = 0, max_unique_length = 0, total_key_length = 0, uniques = 0, null_fields = 0, blob_fields = 0, varchar_fields = 0, db_create_options = 8, db_options_in_use = 8, db_record_offset = 1, raid_type = 0, raid_chunks = 0, open_count = 0, primary_key = 64, timestamp_field_offset = 0, next_number_index = 0, next_number_key_offset = 0, frm_version = 10 '\n', system = 0 '\0', crypted = 0 '\0', db_low_byte_first = 1 '\001', crashed = 0 '\0', is_view = 0 '\0', name_lock = 0 '\0', replace_with_name_lock = 0 '\0', system_table = 0 '\0'} (gdb) Though a normal READ lock should be enough. Or does TL_WRITE_ALLOW_READ carry some extra information, like that other WRITE lock read requests should be cancelled (since we are building a new table). But I do not see how this could prevent SELECT queries from getting a READ lock on the table? (gdb) bt #0 ha_innobase::store_lock (this=0x1608320, thd=0x15c94b8, to=0x1608740, lock_type=TL_IGNORE) at ha_innodb.cc:6607 #1 0x0000000000607414 in get_lock_data (thd=0x15c94b8, table_ptr=0x44086910, count=1, flags=1, write_lock_used=0x44086928) at lock.cc:720 #2 0x0000000000607a28 in mysql_lock_abort (thd=0x15c94b8, table=0x1605098) at lock.cc:414 #3 0x000000000074b93f in wait_while_table_is_used (thd=0x15c94b8, table=0x1605098, function=HA_EXTRA_PREPARE_FOR_DELETE) at sql_table.cc:1947 #4 0x000000000074bf29 in close_cached_table (thd=0x15c94b8, table=0x1605098) at sql_table.cc:1976 #5 0x00000000007532c1 in mysql_alter_table (thd=0x15c94b8, new_db=0x1609468 "test", new_name=0x1609178 "t", create_info=0x44087be0, table_list=0x16091b0, alter_info=0x440882a0, order_num=0, order=0x0, ignore=false) at sql_table.cc:3850 #6 0x000000000062a714 in mysql_execute_command (thd=0x15c94b8) at sql_parse.cc:3256 #7 0x0000000000630d47 in mysql_parse (thd=0x15c94b8, inBuf=0x16090e8 "alter table t engine = innodb", length=29, found_semicolon=0x44088f10) at sql_parse.cc:6052 #8 0x00000000006336ae in dispatch_command (command=COM_QUERY, thd=0x15c94b8, packet=0x1601059 "alter table t engine = innodb", packet_length=30) at sql_parse.cc:1806 #9 0x0000000000634e3b in do_command (thd=0x15c94b8) at sql_parse.cc:1583 #10 0x00000000006361e1 in handle_one_connection (arg=0x15c94b8) at sql_parse.cc:1194 #11 0x00002ab31ff55f1a in start_thread () from /lib/libpthread.so.0 #12 0x00002ab3204fc602 in clone () from /lib/libc.so.6 #13 0x0000000000000000 in ?? () (gdb) I would classify this as a MySQL bug. MySQL should request a normal READ lock on the read table in an ALTER. --Heikki
[11 Jul 2007 18:39]
MySQL Verification Team
Oddly, I cannot repeat this hangup on my Linux 5.0.46 using same testcase which causes a hangup quickly on Windows box. Going to rebuild both servers now.
[11 Jul 2007 18:46]
MySQL Verification Team
testcase. point to windows box to get lock wait timeouts. see header of file for ip/user/password
Attachment: testcase.c (text/plain), 6.99 KiB.
[11 Jul 2007 19:35]
MySQL Verification Team
Heikki, some more info. I hope it explains this bug completely. 5.0.22 works without lock wait timeout. >= 5.0.24a has hang+lock wait timeouts. This seems windows-specific. Looking at the change logs, I came across the patch and comment for bug #17264 . In ha_innodb.cc #ifdef __WIN__ /* For alter table on win32 for succesful operation completion it is used TL_WRITE(=10) lock instead of TL_WRITE_ALLOW_READ(=6), however here in innodb handler TL_WRITE is lifted to TL_WRITE_ALLOW_WRITE, which causes race condition when several clients do alter table simultaneously (bug #17264). This fix avoids the problem. */ && thd->lex->sql_command != SQLCOM_ALTER_TABLE #endif && thd->lex->sql_command != SQLCOM_CREATE_TABLE) { lock_type = TL_WRITE_ALLOW_WRITE; }
[17 Jul 2007 14:10]
Heikki Tuuri
Shane, ok, now I understand why MySQL uses a WRITE_something type lock on the read table. It is there to stop other, concurrent ALTERs! Though I do not know why this is only relevant for Windows. In the Windows case, I do not understand why the SELECTs are not able to read the read table during the ALTER, that is, why MySQL blocks waiting for a table lock? If the table lock mode is WRITE_ALLOW_READ, why then normal READ locks have to wait? On the InnoDB side, we could fix that WRITE_ALLOW_READ would cause only S-locking of rows if we are doing an ALTER. Though that sounds like a hack. MySQL should actually have a lock type READ_DO_NOT_ALLOW_ALTER for the read table... Regards, Heikki
[20 Jul 2007 10:15]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/31217 ChangeSet@1.2536, 2007-07-20 13:14:57+03:00, gkodinov@magare.gmz +4 -0 Bug #29644: alter table hangs if records locked in share mode by long running transaction On Windows opened files can't be deleted. There was a special upgraded lock mode (TL_WRITE instead of TL_WRITE_ALLOW_READ) in ALTER TABLE to remedy make sure nobody has the table opened when deleting the old table in ALTER TABLE. This special mode was causing ALTER TABLE to hang waiting on a lock inside InnoDB. This special lock is no longer necessary as the server is closing the tables it needs to delete in ALTER TABLE. Fixed by removing the special lock. Note that this also reverses the fix for bug 17264 that deals with another consequence of this special lock mode being used.
[20 Jul 2007 11:17]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/31223 ChangeSet@1.2536, 2007-07-20 14:17:15+03:00, gkodinov@magare.gmz +4 -0 Bug #29644: alter table hangs if records locked in share mode by long running transaction On Windows opened files can't be deleted. There was a special upgraded lock mode (TL_WRITE instead of TL_WRITE_ALLOW_READ) in ALTER TABLE to make sure nobody has the table opened when deleting the old table in ALTER TABLE. This special mode was causing ALTER TABLE to hang waiting on a lock inside InnoDB. This special lock is no longer necessary as the server is closing the tables it needs to delete in ALTER TABLE. Fixed by removing the special lock. Note that this also reverses the fix for bug 17264 that deals with another consequence of this special lock mode being used.
[26 Jul 2007 5:55]
Bugs System
Pushed into 5.1.21-beta
[26 Jul 2007 5:57]
Bugs System
Pushed into 5.0.48
[27 Jul 2007 16:10]
Paul DuBois
Noted in 5.0.48, 5.1.21 changelogs. On Windows, ALTER TABLE hung if records were locked in share mode by a long-running transaction.
[5 May 2010 15:20]
Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 3:01]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 6:03]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:32]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:59]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 19:24]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[17 Jun 2010 12:07]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:53]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:34]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)