Bug #67647 ALTER TABLE can block SELECT statements in certain conditions
Submitted: 20 Nov 2012 10:50 Modified: 12 Mar 2013 18:11
Reporter: Ovais Tariq Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.5.28 OS:Linux
Assigned to: Paul Dubois CPU Architecture:Any

[20 Nov 2012 10:50] Ovais Tariq
Description:
The documentation says that ALTER TABLE still allows other sessions to be able to read from the table whose ALTER is in progress, and that the table is only locked for writes. However, in certain conditions the ALTER TABLE blocks other select statements from executing. This has to do with how meta-data locking is implemented.

How to repeat:
This can be repeated as follows:

- Create a table and insert some data
create table t1(i int(11) not null auto_increment primary key, c char(32) not null default 'dummy_text') engine=innodb;
insert into t1(i) select null from t1; <-- execute this multiple times

- Now execute the following statements in different sessions as specified
session1 > start transaction;
session 1 > select * from t1 limit 10;
+----+------------+
| i  | c          |
+----+------------+
|  1 | dummy_text |
|  2 | dummy_text |
|  3 | dummy_text |
|  4 | dummy_text |
|  6 | dummy_text |
|  7 | dummy_text |
|  8 | dummy_text |
|  9 | dummy_text |
| 13 | dummy_text |
| 14 | dummy_text |
+----+------------+

session 2 > alter table t1 add key k1(c); <-- this blocks
Query OK, 0 rows affected (1 min 47.40 sec)
Records: 0  Duplicates: 0  Warnings: 0

session 3 > show processlist;
+----+----------+-----------+------+---------+------+---------------------------------+------------------------------+
| Id | User     | Host      | db   | Command | Time | State                           | Info                         |
+----+----------+-----------+------+---------+------+---------------------------------+------------------------------+
|  4 | msandbox | localhost | test | Sleep   |   49 |                                 | NULL                         |
|  5 | msandbox | localhost | test | Query   |   45 | Waiting for table metadata lock | alter table t1 add key k1(c) |
|  6 | msandbox | localhost | test | Sleep   |    9 |                                 | NULL                         |
|  7 | msandbox | localhost | test | Query   |    0 | NULL                            | show processlist             |
+----+----------+-----------+------+---------+------+---------------------------------+------------------------------+
4 rows in set (0.00 sec)
session 3 > start transaction;
session 3 > select * from t1 limit 5; <-- this blocks

session 4 > show processlist;
+----+----------+-----------+------+---------+------+---------------------------------+------------------------------+
| Id | User     | Host      | db   | Command | Time | State                           | Info                         |
+----+----------+-----------+------+---------+------+---------------------------------+------------------------------+
|  4 | msandbox | localhost | test | Sleep   |   85 |                                 | NULL                         |
|  5 | msandbox | localhost | test | Query   |   81 | Waiting for table metadata lock | alter table t1 add key k1(c) |
|  7 | msandbox | localhost | test | Query   |   30 | Waiting for table metadata lock | select * from t1 limit 5     |
|  8 | msandbox | localhost | test | Query   |    0 | NULL                            | show processlist             |
+----+----------+-----------+------+---------+------+---------------------------------+------------------------------+
4 rows in set (0.00 sec)
session 4 > start transaction;
session 4 > select * from t1 limit 5; <-- this blocks

Suggested fix:
This behaviour looks unexpected to me, the selects should not block. And till the time that this behaviour cannot be changed the documentation of ALTER TABLE (http://dev.mysql.com/doc/refman/5.5/en/alter-table.html) should specify that it can block even selects in some situations as shown above.
[21 Nov 2012 5:12] Ovais Tariq
This behaviour of selects waiting on meta-data locks, specifically happens when a new session opens the table for reading and the table is not already in the table cache
[21 Nov 2012 8:34] Dmitry Lenev
Hello Ovais!

This is an expected behavior and not a bug in code. But I agree that our documentation can be improved to make it more clear.

At certain point ALTER TABLE needs to acquire exclusive lock on table to install a new version of .FRM and to get rid of outdated TABLE/TABLE_SHARE/handler instances in Table and Table Definition caches. At this point it will wait for existing SELECTs to stop and will block any new SELECTs. If we fail to do the latter there is a chance that ALTER TABLE might be starved by concurrent stream of SELECTs and won't finish any time soon.

I am verifying this as a documentation bug.
[21 Nov 2012 8:37] Ovais Tariq
Dmitry,

If you run my test case you will see that the ALTER blocks all the other SELECT queries, while it is still waiting for its turn to execute, so I do not think its correct to say that ALTER only takes the exclusive lock when it has to change the .FRM file. ALTER seems to be blocking the other selects before the ALTER itself has started.
[21 Nov 2012 8:59] Ovais Tariq
Hello Dmitry,

Actually I spoke too soon, ALTER TABLE does actually only block when its about to swap the old .FRM file with the new one, so you are correct on that :)

However, what I have noticed is that ALTER does not only wait for the SELECTs to be completed but rather it waits for a transaction that has the table open to be either committed or rolled-back, before the ALTER can proceed. Another interesting thing is that if I start two transactions and both execute a SELECT on the same table, the ALTER only blocks for as long as one of those transaction is active, which is kind of strange.
[21 Nov 2012 9:05] Dmitry Lenev
Hi Ovais!

Yes, ALTER TABLE will be waiting for not only running SELECTs but for transactions to complete. This is expected (and needed to ensure that ALTER TABLE and transaction in question are properly isolated and written to binary log). Regarding case with two transactions, can you please provide a repeatable test case?
[21 Nov 2012 9:50] Ovais Tariq
A test case for when ALTER only waits for one of the open transaction to finish is simple enough, and can be repeated as follows:

session1 > start transaction;
session1 > select * from t1 limit 10;

session2 > start transaction;
session2 > select * from t1 limit 10;

session3 > alter table t1 add index k1(c); <-- blocks

session 4> select trx_id, trx_state from information_schema.innodb_trx; <-- list of running transactions
+--------+-----------+
| trx_id | trx_state |
+--------+-----------+
| D02    | RUNNING   |
| D01    | RUNNING   |
+--------+-----------+
2 rows in set (0.00 sec)

session 4 > show processlist; <-- processlist showing ALTER waiting on metadata lock
+----+----------+-----------+------+---------+------+---------------------------------+--------------------------------+
| Id | User     | Host      | db   | Command | Time | State                           | Info                           |
+----+----------+-----------+------+---------+------+---------------------------------+--------------------------------+
|  7 | msandbox | localhost | test | Sleep   |  124 |                                 | NULL                           |
|  8 | msandbox | localhost | test | Sleep   |  106 |                                 | NULL                           |
| 10 | msandbox | localhost | test | Query   |   29 | Waiting for table metadata lock | alter table t1 add index k1(c) |
| 11 | msandbox | localhost | test | Query   |    0 | NULL                            | show processlist               |
+----+----------+-----------+------+---------+------+---------------------------------+--------------------------------+

session1 > commit; <-- we commit one of the transactions
Query OK, 0 rows affected (0.00 sec)

session 4 > show processlist; <-- the ALTER has finished successfully
+----+----------+-----------+------+---------+------+-------+------------------+
| Id | User     | Host      | db   | Command | Time | State | Info             |
+----+----------+-----------+------+---------+------+-------+------------------+
|  7 | msandbox | localhost | test | Sleep   |    5 |       | NULL             |
|  8 | msandbox | localhost | test | Sleep   |  120 |       | NULL             |
| 10 | msandbox | localhost | test | Sleep   |   43 |       | NULL             |
| 11 | msandbox | localhost | test | Query   |    0 | NULL  | show processlist |
+----+----------+-----------+------+---------+------+-------+------------------+

session 4 > select trx_id, trx_state from information_schema.innodb_trx; <-- although one of the transactions is still active
+--------+-----------+
| trx_id | trx_state |
+--------+-----------+
| D02    | RUNNING   |
+--------+-----------+
1 row in set (0.00 sec)
[21 Nov 2012 9:59] Dmitry Lenev
Hi Ovais!

Could you please clarify if you have query cache enabled?
[27 Nov 2012 11:54] Ovais Tariq
Hi Dmitry,

Yes the query_cache is enabled.
[29 Nov 2012 15:21] Dmitry Lenev
Hello Ovais!

Thanks for feedback!

After some thinking I have reported issue with second transactions not blocking ALTER TABLE (with a bit different test case) as a separate Bug #67759 "ALTER TABLE breaks REPEATABLE READ when Query Cache is enabled". Please use this new bug to track this problem.
[12 Mar 2013 18:11] Paul Dubois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

Added this to ALTER TABLE discussion of session blocking:

ALTER TABLE blocks reads (not just writes) at the point where it is
ready to install a new version of the table .frm file, discard the
old file, and clear outdated table structures from the table and
table definition caches. At this point, it must acquire an exclusive
lock. To do so, it waits for current readers to finish, and blocks
new reads (and writes).