Bug #59134 DDL hang on ALTER TABLE in 5.5.8, but not in 5.1, 5.0, or 4.1
Submitted: 23 Dec 2010 15:11 Modified: 2 Jun 2015 7:56
Reporter: Justin Cranford Email Updates:
Status: Can't repeat Impact on me:
Category:MySQL Server: DDL Severity:S4 (Feature request)
Version:5.5 OS:Microsoft Windows
Assigned to: CPU Architecture:Any

[23 Dec 2010 15:11] Justin Cranford
I am testing MySQL 5.5.8 compatibility of my company's enterprise-grade Java product (Privileged Password Management & Identify Management). The product has previously been rigorously tested against multiple versions of 4 different databases.

MySQL 4.1/5.0/5.1
Oracle 10.1/10.2/11.1/11.2
DB2 9.7
SQL Server 2005/2008

When I tried MySQL 5.5.8, I found a DDL statement was hanging due to another dangling transaction being accidently left open. I fixed the code bug in my product, but the default transaction isolation level is READ COMMITTED and it should not have blocked in the first place. It does not block in any other version of MySQL tested, or in any other database vendor tested. It only blocks in 5.5.8, so this appears to be a new bug in 5.5.x.

Here is the output of the "SHOW ENGINE INNODB STATUS" command which revealed the problem in the product I was testing. See how "MySQL thread id 42, query id 12212" is blocked by "Waiting for table metadata lock". It is blocked by "MySQL thread id 37, query id 11867".

Query 11867: SELECT * FROM <TABLE> WHERE id=?; (dangling uncommitted TX)

*************************** 1. row ***************************
  Type: InnoDB
Per second averages calculated from the last 21 seconds
srv_master_thread loops: 297 1_second, 297 sleeps, 28 10_second, 31 background,
31 flush
srv_master_thread log flush and writes: 312
OS WAIT ARRAY INFO: reservation count 143, signal count 146
Mutex spin waits 139, rounds 1512, OS waits 46
RW-shared spins 50, rounds 1500, OS waits 50
RW-excl spins 1, rounds 1425, OS waits 47
Spin rounds per wait: 10.88 mutex, 30.00 RW-shared, 1425.00 RW-excl
Trx id counter 28363
Purge done for trx's n:o < 2819D undo n:o < 0
History list length 183
---TRANSACTION 2835C, not started, OS thread id 6104
mysql tables in use 1, locked 1
MySQL thread id 42, query id 12212 caott-dtfac420.domain1.intra cspma
dmin Waiting for table metadata lock
alter table init_properties modify column au_createdate timestamp null
---TRANSACTION 2815C, not started, OS thread id 6108
MySQL thread id 40, query id 11909 localhost cspmadmin
---TRANSACTION 28153, not started, OS thread id 6088
MySQL thread id 15, query id 12216 localhost ::1 root
---TRANSACTION 2815A, ACTIVE 935 sec, OS thread id 7032
MySQL thread id 37, query id 11867 caott-dtfac420.domain1.intra cspmp
Trx read view will not see trx with id >= 2815B, sees < 2815B
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: wait Windows aio (read thread)
I/O thread 3 state: wait Windows aio (read thread)
I/O thread 4 state: wait Windows aio (read thread)
I/O thread 5 state: wait Windows aio (read thread)
I/O thread 6 state: wait Windows aio (write thread)
I/O thread 7 state: wait Windows aio (write thread)
I/O thread 8 state: wait Windows aio (write thread)
I/O thread 9 state: wait Windows aio (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
737 OS file reads, 10432 OS file writes, 3600 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 180647, node heap has 1 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
Log sequence number 31998037
Log flushed up to   31998037
Last checkpoint at  31998037
0 pending log writes, 0 pending chkp writes
3419 log i/o's done, 0.00 log i/o's/second
Total memory allocated 93372416; in additional pool allocated 0
Dictionary memory allocated 313045
Buffer pool size   5568
Free buffers       4959
Database pages     608
Old database pages 210
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 412, created 196, written 6554
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s
LRU len: 608, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
0 queries inside InnoDB, 0 queries in queue
2 read views open inside InnoDB
Main thread id 5948, state: waiting for server activity
Number of rows inserted 1383, updated 406, deleted 228, read 9362
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s

How to repeat:
The blocked DDL issue is easily reproducible on the command line. This only blocks on 5.5.8. It does not block on 5.1, 5.0 or 4.1.

*** Session 1 - create row-level read lock on a table ***

C:\>mysql -uroot
mysql> create database dltest;
mysql> use dltest;
mysql> create table dltable (id int not null primary key) engine=innodb;
mysql> insert into dltable values(1);
mysql> commit; --make sure no transaction is open
mysql> begin;
mysql> select * from dltable where id=1;
| id |
|  1 |

*** Session 2 - add column blocked, even if tx-iso-lvl is READ COMMITTED ***

C:\>mysql -uuser1 -pmypwd
mysql> use dltest;
mysql> ALTER TABLE dltable add column newcol varchar(255) null;

*** Session 1 - release row-level lock ***
mysql> commit;

Note: Releasing the row-level lock in session 1 unblocked the ALTER TABLE statement in session 2 without any interaction with session 2.

*** Session 2 - release all locks ***
mysql> commit;
[23 Dec 2010 16:26] Valeriy Kravchuk
I think this manual page, http://dev.mysql.com/doc/refman/5.5/en/metadata-locking.html, explains change that happened in 5.5.x, and explains the reasons for it. Please, check.
[23 Dec 2010 18:54] Justin Cranford
That new 5.5 manual describes the exact behaviour I found. Thanks.

Based on this new information, I changed the severity from S2 (serious) to S4 (feature request). I still have a concern though, and a suggestion for an enhancement.

Is the table metadata lock configurable? After all, we are talking about a single, row-level, read-only, DML statement blocking *all* DDL statements. What this means is a complete system test will now be required to validation our product on MySQL 5.5. There was nothing in the release notes to indicate this would be necessary, but the devil is in the details.

At the very least, you should consider an enhancement to make the "table metadata locking" more granular. In the example I provided, the two statements are orthagonal (no overlap). It definitely makes sense to block on *some* DDL statements, but not *all*.

My concern stems somewhat from a similar experience with validating DB2 9.1 and 9.5 for the same product. We had to abandon support for those versions because table level locks like this were incompatible with the product. Only the Oracle-style locking in 9.7 allowed us to support any version of DB2 at all.

In that particular case, a row-level UPDATE in DB2 9.1/9.5 would block all read operations in other transactions. In particular, the code was trying to read MAX(id) in a different thread. That value should be in the table's metadata, but the transaction was blocked by another thread holding a row-level exclusive lock. It was not even updating the id column, so the statements were orthagonal. Anyway, I would put this new MySQL 5.5.3 behaviour is in the same category. It could definitely lead to some unexpected problems.

Anyway, getting back to the point about configuration. If this behaviour change is for the binary log only, what about the case where the binary log is turned off. The binary log is only used for replication and restore, but those features are not always required. Perhaps the metadata table lock should only be used if-and-only-if the binary log is turned on?

In summary, please consider the following enhancements for MySQL 5.5...

1) More granular metatable locking to avoid blocking orthagonal access, and
2) Configurable metadata table lock depending (only if binary log turned on)
[24 Dec 2010 9:21] MySQL Verification Team
does it timeout after --lock_wait_timeout expires?
[27 Jan 2011 11:42] Dmitry Lenev
Hello Justin!

Regarding your suggestions:

"1) More granular metatable locking to avoid blocking orthagonal access"

Currently metadata locks have table-level granularity. So open transaction can block only DDL on tables which are used in this transaction. It should not block DDL on unrelated tables.

If you observe behavior when open transaction is blocking other, unrelated DDL then it is definitely a bug.

Could you please confirm if it is or it is not the case?

Thank you in advance!

"2) Configurable metadata table lock depending (only if binary log turned on)"

Metadata locking is necessary not only because of binary logging. It is also simplifies storage engines implementation and InnoDB in particular (it can rely on the fact that table used by some transaction won't be dropped while transaction is active).
[28 Feb 2011 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".