Description:
Manual (https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-concurrency.html) says:
"An online DDL statement for an InnoDB table always waits for currently executing transactions that are accessing the table to commit or roll back, because it requires exclusive access to the table for a brief period while the DDL statement is being prepared. Likewise, it requires exclusive access to the table for a brief time before finishing. Thus, an online DDL statement waits for any transactions that are started while the DDL is in progress, and query or modify the table, to commit or roll back before the DDL completes."
From the above one may conclude that if there is some lock set on the table we want to ALTER at the moment we start ALTER, we do nothing but wait for the transaction to complete. In reality, we sometimes may do proceed with online ALTER and then ALTER is blocked at the very end, when exclusive table level lock is requested, while holding the metadata lock on the table ALTERed. This may cause massive locking of other sessions if innodb_lock_wait timeout is large enough.
See "How to repeat" for specific details.
How to repeat:
Create the following tables and add rows:
create table t1(id int primary key, data int) engine=InnoDB;
create table t2(id int primary key, data int, t1_id int, constraint fk_t1
foreign key(t1_id) references t1(id)) engine=InnoDB;
insert into t1 values (1,1);
insert into t2 values (1,1,1);
Now, in session 1 do:
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t2 values (2,1,1);
Query OK, 1 row affected (0.01 sec)
In session 2 do:
mysql> alter table t1 add column data2 int, algorithm=INPLACE, lock=NONE;
ALTER hangs (as expected), the problem is the stage it hangs on:
mysql> show processlist;
+----+------+-----------------+------+---------+------+-------------------------
-----------------+--------------------------------------------------------------
-----+
| Id | User | Host | db | Command | Time | State
| Info
|
+----+------+-----------------+------+---------+------+-------------------------
-----------------+--------------------------------------------------------------
-----+
| 1 | root | localhost:49835 | test | Sleep | 70 |
| NULL
|
| 2 | root | localhost:49850 | test | Query | 0 | init
| show processlist
|
| 3 | root | localhost:49853 | test | Query | 7 | committing alter table t
o storage engine | alter table t1 add column data2 int, algorithm=INPLACE, lock=
NONE |
+----+------+-----------------+------+---------+------+-------------------------
-----------------+--------------------------------------------------------------
-----+
3 rows in set (0.00 sec)
So, it's final stage and it requires X level table lock. If in the first (or any other) session we try to access the table:
update t1 set data = 2 where id=1;
this will hang waiting for table metadata lock:
mysql> show processlist;
+----+------+-----------------+------+---------+------+-------------------------
-----------------+--------------------------------------------------------------
-----+
| Id | User | Host | db | Command | Time | State
| Info
|
+----+------+-----------------+------+---------+------+-------------------------
-----------------+--------------------------------------------------------------
-----+
| 1 | root | localhost:49835 | test | Query | 3 | Waiting for table metada
ta lock | update t1 set data = 2 where id=1
|
| 2 | root | localhost:49850 | test | Query | 0 | init
| show processlist
|
| 3 | root | localhost:49853 | test | Query | 30 | committing alter table t
o storage engine | alter table t1 add column data2 int, algorithm=INPLACE, lock=
NONE |
+----+------+-----------------+------+---------+------+-------------------------
-----------------+--------------------------------------------------------------
-----+
3 rows in set (0.00 sec)
and these tow session will hang on each other for the time equals to min(lock_wait_timeout, innodb_lock_wait_timeout). The later is smaller by default and eventually ALTER fails:
mysql> alter table t1 add column data2 int, algorithm=INPLACE, lock=NONE;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql>
The locks involved at InnoDB level are:
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 81435, not started
MySQL thread id 3, OS thread handle 0x11d8, query id 18 localhost ::1 root committing alter table to storage engine
alter table t1 add column data2 int, algorithm=INPLACE, lock=NONE
---TRANSACTION 81434, ACTIVE 51 sec setting table lock for creating or dropping index
mysql tables in use 1, locked 1
LOCK WAIT 1 lock struct(s), heap size 360, 0 row lock(s)
MySQL thread id 3, OS thread handle 0x11d8, query id 18 localhost ::1 root committing alter table to storage engine
alter table t1 add column data2 int, algorithm=INPLACE, lock=NONE
Trx read view will not see trx with id >= 81436, sees < 81429
------- TRX HAS BEEN WAITING 50 SEC FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `test`.`t1` trx id 81434 lock mode X waiting
------------------
TABLE LOCK table `test`.`t1` trx id 81434 lock mode X waiting
---TRANSACTION 81429, ACTIVE 114 sec
3 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1
MySQL thread id 1, OS thread handle 0x1224, query id 20 localhost ::1 root Waiting for table metadata lock
update t1 set data = 2 where id=1
TABLE LOCK table `test`.`t2` trx id 81429 lock mode IX
TABLE LOCK table `test`.`t1` trx id 81429 lock mode IS
RECORD LOCKS space id 531 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 81429 lock mode S locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 6; hex 000000013e12; asc > ;;
2: len 7; hex 8f000001c20110; asc ;;
3: len 4; hex 80000001; asc ;;
This case is really hard to debug and it would be nice to explain all kinds of InnoDB and metadata locks set by online ALTER, in order.
Suggested fix:
Please, explain all metadata and InnoDB locks set by online ALTER, with examples and details enough to explain non-trivial cases like the one above. From current manual content
"An online DDL statement for an InnoDB table always waits for currently executing transactions that are accessing the table to commit or roll back, because it requires exclusive access to the table for a brief period while the DDL statement is being prepared"
one may expect that ALTER is locked at the initial stage and thus can NOT hold any blocking metadata lock.