MySQL Bugs: #29644: alter table hangs if records locked in share mode by long running transaction

Bug #29644	alter table hangs if records locked in share mode by long running transaction
Submitted:	9 Jul 2007 10:39	Modified:	18 Jun 2010 1:01
Reporter:	Shane Bester (Platinum Quality Contributor)	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server	Severity:	S2 (Serious)
Version:	>=5.0.24	OS:	Windows (XP)
Assigned to:	Georgi Kodinov	CPU Architecture:	Any
Tags:	bfsm_2007_07_19

Description:
Tested on 5.0.42 and 5.0.46BK, and the same hang occurs. There is no disk i/o and no cpu activity.

Problem does not happen in 5.0.22.  

Running in SERIALIZABLE isolation level, we have three transactions running over and over:

1) ALTER TABLE <table> ENGINE=INNODB
2) SELECT * FROM <table> WHERE keycol='val' LOCK IN SHARE MODE
3) SELECT * FROM <table> WHERE keycol='val' LOCK IN SHARE MODE

One of the select statements eventually causes the alter table to hang.
Everything ends up locked, until innodb_lock_wait_timeout expires for the selects. If you KILL a select statement, then the alter continues.

See a processlist example:

mysql> show processlist;                        
------+-------------------+-------------------  
 Time | State             | Info                
------+-------------------+-------------------  
    0 | NULL              | show processlist    
   10 |                   | NULL                
  344 | copy to tmp table | ALTER TABLE t1_     
  344 | Locked            | SELECT * FROM "t1   
  344 | Locked            | SELECT * FROM "t1   
------+-------------------+-------------------  

Here are the reformatted stack traces of the threads from debugger:

ALTER QUERY:
------------
os_event_wait()  Line 351
srv_suspend_mysql_thread()  Line 1477
row_mysql_handle_errors()  Line 507
row_search_for_mysql()  Line 4067
ha_innobase::general_fetch()  Line 3990
ha_innobase::rnd_next()  Line 4183
rr_sequential()  Line 295
copy_data_between_tables()  Line 4163
mysql_alter_table()  Line 3786
mysql_execute_command()  Line 3262
mysql_parse()  Line 6083
dispatch_command()  Line 1812
do_command()  Line 1586
handle_one_connection()  Line 1197
pthread_start()  Line 62
_callthreadstart()  Line 293
_threadstart()  Line 277

SELECT QUERY:
--------------
pthread_cond_timedwait()  Line 124
pthread_cond_wait()  Line 71
wait_for_lock()  Line 415
thr_lock()  Line 679
thr_multi_lock()  Line 929
mysql_lock_tables()  Line 171
lock_tables()  Line 3181
open_and_lock_tables()  Line 3034
mysql_execute_command()  Line 2631
mysql_parse()  Line 6083
dispatch_command()  Line 1812
do_command()  Line 1586
handle_one_connection()  Line 1197
pthread_start()  Line 62
_callthreadstart()  Line 293
_threadstart()  Line 277

How to repeat:
Will upload a testcase later.

Suggested fix:
Not sure. Why doesn't 5.0.22 show the same behavior?

Shane,

what is the stack trace of the second SELECT query?

My first guess is that the ALTER sets X-locks on the rows it reads from the table. The question is why it does this? Then the ALTER stops waiting for the SELECT to release its S-locks.

I will check inside gdb what happens in an ALTER.

--Heikki

Heikki, I uploaded more info in private comment.  I'll upload a .c file later that will reproduce this easily in 1 minute of running.

Regarding the uploaded info, I have explicitly set innodb lock wait timeout to a much higher value than normal, in order to force the hangup to wait longer before the select times out.

Shane,

ok, the ALTER does set X-locks on the rows of read table! There is no need for that, S-locks would be enough.

The SELECTs on the other hand are waiting for a MySQL table READ lock on the read table. Why do not they get it? Maybe the ALTER sets a MySQL WRITE lock on the read table, though a READ lock would be enough?

--Heikki

MySQL does take a WRITE lock on the read table in an ALTER:

#0  ha_innobase::store_lock (this=0x1608320, thd=0x15c94b8, to=0x16089a0,
    lock_type=TL_WRITE_ALLOW_READ) at ha_innodb.cc:6681
#1  0x0000000000607414 in get_lock_data (thd=0x15c94b8, table_ptr=0x1609248,
    count=1, flags=2, write_lock_used=0x44086978) at lock.cc:720
#2  0x000000000060807a in mysql_lock_tables (thd=0x15c94b8, tables=0x1609248,
    count=1, flags=0, need_reopen=0x440869e7) at lock.cc:125
#3  0x000000000066506e in open_ltable (thd=0x15c94b8, table_list=0x16091b0,
    lock_type=TL_WRITE_ALLOW_READ) at sql_base.cc:2953
#4  0x0000000000750f77 in mysql_alter_table (thd=0x15c94b8,
    new_db=0x1609468 "test", new_name=0x0, create_info=0x44087be0,
    table_list=0x16091b0, alter_info=0x440882a0, order_num=0, order=0x0,
    ignore=false) at sql_table.cc:3138
#5  0x000000000062a714 in mysql_execute_command (thd=0x15c94b8)
    at sql_parse.cc:3256
#6  0x0000000000630d47 in mysql_parse (thd=0x15c94b8,
    inBuf=0x16090e8 "alter table t engine = innodb", length=29,
    found_semicolon=0x44088f10) at sql_parse.cc:6052
#7  0x00000000006336ae in dispatch_command (command=COM_QUERY, thd=0x15c94b8,
    packet=0x1601059 "alter table t engine = innodb", packet_length=30)
    at sql_parse.cc:1806
#8  0x0000000000634e3b in do_command (thd=0x15c94b8) at sql_parse.cc:1583
#9  0x00000000006361e1 in handle_one_connection (arg=0x15c94b8)
    at sql_parse.cc:1194
#10 0x00002ab31ff55f1a in start_thread () from /lib/libpthread.so.0
#11 0x00002ab3204fc602 in clone () from /lib/libc.so.6
#12 0x0000000000000000 in ?? ()
(gdb)

(gdb) print *this->table->s
$3 = {name_hash = {key_offset = 0, key_length = 0, records = 0, blength = 0,
    flags = 0, array = {buffer = 0x0, elements = 0, max_element = 0,
      alloc_increment = 0, size_of_element = 0}, get_key = 0, free = 0,
    charset = 0x0}, mem_root = {free = 0x0, used = 0x0, pre_alloc = 0x0,
    min_malloc = 0, block_size = 0, block_num = 0, first_block_usage = 0,
    error_handler = 0}, keynames = {count = 0, name = 0x0,
    type_names = 0x1608508, type_lengths = 0x0}, fieldnames = {count = 1,
    name = 0x0, type_names = 0x16084f8, type_lengths = 0x0}, intervals = 0x0,
  blob_field = 0x0, default_values = 0x16084c8 "ÿ", comment = {
    str = 0x16084e0 "", length = 0}, table_charset = 0xe4f080,
  table_cache_key = 0x16085b0 "test", db = 0x16085b0 "test",
  table_name = 0x16082f8 "t", path = 0x1608300 "./test/t", connect_string = {
    str = 0x1608318 "", length = 0}, keys_in_use = {map = 0},
  keys_for_keyread = {map = 0}, avg_row_length = 0, raid_chunksize = 0,
  version = 1, flush_version = 1, mysql_version = 50044, timestamp_offset = 0,
  reclength = 5, min_rows = 0, max_rows = 0, db_type = DB_TYPE_INNODB,
  row_type = ROW_TYPE_DEFAULT, tmp_table = NO_TMP_TABLE, blob_ptr_size = 8,
  null_bytes = 1, last_null_bit_pos = 1, key_length = 7, fields = 1,
  rec_buff_length = 8, keys = 0, key_parts = 0, max_key_length = 0,
  max_unique_length = 0, total_key_length = 0, uniques = 0, null_fields = 0,
  blob_fields = 0, varchar_fields = 0, db_create_options = 8,
  db_options_in_use = 8, db_record_offset = 1, raid_type = 0, raid_chunks = 0,
  open_count = 0, primary_key = 64, timestamp_field_offset = 0,
  next_number_index = 0, next_number_key_offset = 0, frm_version = 10 '\n',
  system = 0 '\0', crypted = 0 '\0', db_low_byte_first = 1 '\001',
  crashed = 0 '\0', is_view = 0 '\0', name_lock = 0 '\0',
  replace_with_name_lock = 0 '\0', system_table = 0 '\0'}
(gdb)

Though a normal READ lock should be enough. Or does TL_WRITE_ALLOW_READ carry some extra information, like that other WRITE lock read requests should be cancelled (since we are building a new table).

But I do not see how this could prevent SELECT queries from getting a READ lock on the table?

(gdb) bt
#0  ha_innobase::store_lock (this=0x1608320, thd=0x15c94b8, to=0x1608740,
    lock_type=TL_IGNORE) at ha_innodb.cc:6607
#1  0x0000000000607414 in get_lock_data (thd=0x15c94b8, table_ptr=0x44086910,
    count=1, flags=1, write_lock_used=0x44086928) at lock.cc:720
#2  0x0000000000607a28 in mysql_lock_abort (thd=0x15c94b8, table=0x1605098)
    at lock.cc:414
#3  0x000000000074b93f in wait_while_table_is_used (thd=0x15c94b8,
    table=0x1605098, function=HA_EXTRA_PREPARE_FOR_DELETE) at sql_table.cc:1947
#4  0x000000000074bf29 in close_cached_table (thd=0x15c94b8, table=0x1605098)
    at sql_table.cc:1976
#5  0x00000000007532c1 in mysql_alter_table (thd=0x15c94b8,
    new_db=0x1609468 "test", new_name=0x1609178 "t", create_info=0x44087be0,
    table_list=0x16091b0, alter_info=0x440882a0, order_num=0, order=0x0,
    ignore=false) at sql_table.cc:3850
#6  0x000000000062a714 in mysql_execute_command (thd=0x15c94b8)
    at sql_parse.cc:3256
#7  0x0000000000630d47 in mysql_parse (thd=0x15c94b8,
    inBuf=0x16090e8 "alter table t engine = innodb", length=29,
    found_semicolon=0x44088f10) at sql_parse.cc:6052
#8  0x00000000006336ae in dispatch_command (command=COM_QUERY, thd=0x15c94b8,
    packet=0x1601059 "alter table t engine = innodb", packet_length=30)
    at sql_parse.cc:1806
#9  0x0000000000634e3b in do_command (thd=0x15c94b8) at sql_parse.cc:1583
#10 0x00000000006361e1 in handle_one_connection (arg=0x15c94b8)
    at sql_parse.cc:1194
#11 0x00002ab31ff55f1a in start_thread () from /lib/libpthread.so.0
#12 0x00002ab3204fc602 in clone () from /lib/libc.so.6
#13 0x0000000000000000 in ?? ()
(gdb)

I would classify this as a MySQL bug. MySQL should request a normal READ lock on the read table in an ALTER.

--Heikki

Oddly, I cannot repeat this hangup on my Linux 5.0.46 using same testcase which causes a hangup quickly on Windows box.  Going to rebuild both servers now.

testcase. point to windows box to get lock wait timeouts. see header of file for ip/user/password

Attachment: testcase.c (text/plain), 6.99 KiB.

Heikki, some more info.  I hope it explains this bug completely.

5.0.22 works without lock wait timeout.  >= 5.0.24a has hang+lock wait timeouts.
This seems windows-specific.    Looking at the change logs, I came across the patch and comment for bug #17264 .  

In ha_innodb.cc 

#ifdef __WIN__
        /* For alter table on win32 for succesful operation
        completion it is used TL_WRITE(=10) lock instead of
        TL_WRITE_ALLOW_READ(=6), however here in innodb handler
        TL_WRITE is lifted to TL_WRITE_ALLOW_WRITE, which causes
        race condition when several clients do alter table
        simultaneously (bug #17264). This fix avoids the problem. */
	&& thd->lex->sql_command != SQLCOM_ALTER_TABLE
#endif

	&& thd->lex->sql_command != SQLCOM_CREATE_TABLE) {
	lock_type = TL_WRITE_ALLOW_WRITE;
}

Shane,

ok, now I understand why MySQL uses a WRITE_something type lock on the read table. It is there to stop other, concurrent ALTERs! Though I do not know why this is only relevant for Windows.

In the Windows case, I do not understand why the SELECTs are not able to read the read table during the ALTER, that is, why MySQL blocks waiting for a table lock? If the table lock mode is WRITE_ALLOW_READ, why then normal READ locks have to wait?

On the InnoDB side, we could fix that WRITE_ALLOW_READ would cause only S-locking of rows if we are doing an ALTER. Though that sounds like a hack. MySQL should actually have a lock type READ_DO_NOT_ALLOW_ALTER for the read table...

Regards,

Heikki

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31217

ChangeSet@1.2536, 2007-07-20 13:14:57+03:00, gkodinov@magare.gmz +4 -0
  Bug #29644: alter table hangs if records locked in share mode 
  by long running transaction
  
  On Windows opened files can't be deleted. There was a special
  upgraded lock mode (TL_WRITE instead of TL_WRITE_ALLOW_READ) 
  in ALTER TABLE to remedy make sure nobody has the table opened
  when deleting the old table in ALTER TABLE. This special mode
  was causing ALTER TABLE to hang waiting on a lock inside InnoDB.
  This special lock is no longer necessary as the server is 
  closing the tables it needs to delete in ALTER TABLE.
  Fixed by removing the special lock.
  Note that this also reverses the fix for bug 17264 that deals with
  another consequence of this special lock mode being used.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31223

ChangeSet@1.2536, 2007-07-20 14:17:15+03:00, gkodinov@magare.gmz +4 -0
  Bug #29644: alter table hangs if records locked in share mode 
  by long running transaction
  
  On Windows opened files can't be deleted. There was a special
  upgraded lock mode (TL_WRITE instead of TL_WRITE_ALLOW_READ) 
  in ALTER TABLE to make sure nobody has the table opened
  when deleting the old table in ALTER TABLE. This special mode
  was causing ALTER TABLE to hang waiting on a lock inside InnoDB.
  This special lock is no longer necessary as the server is 
  closing the tables it needs to delete in ALTER TABLE.
  Fixed by removing the special lock.
  Note that this also reverses the fix for bug 17264 that deals with
  another consequence of this special lock mode being used.

Pushed into 5.1.21-beta

Pushed into 5.0.48

Noted in 5.0.48, 5.1.21 changelogs.

On Windows, ALTER TABLE hung if records were locked in share mode by
a long-running transaction.

Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.

Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)

Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.

Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)