Bug #93761 optimize table cause the slave MTS deadlock!
Submitted: 29 Dec 2018 2:45 Modified: 9 Mar 23:35
Reporter: yoga yoga Email Updates:
Status: Verified Impact on me:
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7.22 OS:Red Hat (6.5)
Assigned to: CPU Architecture:Any
Tags: MTS, optimize table

[29 Dec 2018 2:45] yoga yoga
version: mysql5.7.22
Our replication configuration is as follows:

we find the slave sql thread does not worker, the show processlist is:
mysql> show processlist;
| Id | User        | Host      | db   | Command | Time | State                                       | Info                          |
| 37 | root        | localhost | NULL | Query   |    0 | starting                                    | show processlist              |
| 38 | system user |           | NULL | Connect | 3044 | Waiting for master to send event            | NULL                          |
| 39 | system user |           | NULL | Connect | 2695 | Waiting for dependent transaction to commit | NULL                          |
| 40 | system user |           |      | Connect | 2749 | Waiting for table metadata lock             | optimize table sbtest.sbtest1 |
| 41 | system user |           | NULL | Connect | 2652 | Waiting for preceding transaction to commit | NULL                          |
| 42 | system user |           | NULL | Connect | 2652 | Waiting for preceding transaction to commit | NULL 

the master:
we find the optimize table binlog has the same last_commited num with the same table dml sql.
last_committed=8101     sequence_number=8102          optimize table sbtest.sbtest1
last_committed=8101     sequence_number=8103
last_committed=8101     sequence_number=8104
last_committed=8101     sequence_number=8105
last_committed=8101     sequence_number=8106
last_committed=8101     sequence_number=8107
last_committed=8101     sequence_number=8108
last_committed=8101     sequence_number=8109
last_committed=8101     sequence_number=8110          update the sbtest.sbtest1   
the slave:
the seq_num in [8102, 8110] can work in parallel, but if the follow happens, it will deadlock:
[sequence_number=8110, wait for the 8102 to do first, and alse have the mdl_share lock of table sbtest.sbtest1]
[sequence_number=8102, wait for the mdl_share of sbtest.sbtest1 upgraded to mdl_exclusive lock]
MDL_context::upgrade_shared_lock (this=0x10ff60a8, mdl_ticket=0x13db7a90, new_type=MDL_EXCLUSIVE,lock_wait_timeout=31536000)
wait_while_table_is_used (thd=0x10ff6000, table=0x132ac020, function=HA_EXTRA_PREPARE_FOR_RENAME)

1. how to fix?
2. we verify alter table sbtest.sbtest1 engine=innodb does not has this problem, why?
3. other online ddl has the same problems?

How to repeat:
1: we should have the same master-slave mysql.

2: do the sysbench test in the master.
(you should prepare the 16 tables of sbtest before)
   ./sysbench --test=/home/sysbench/db/oltp.lua --oltp-table-size=10000000 --mysql-table-engine=innodb --mysql-user=root  --mysql-password=123456 --mysql-port=3306 --mysql-host= --mysql-db=sbtest --oltp-tables-count=16 --max-requests=0 --max-time=20000 --num-threads=50 --report-interval=10 run

3:exec the follow shell in the master:
while true
  echo 'optimize table sbtest.sbtest1;'
  mysql -uroot -p123456 -e'optimize table sbtest.sbtest1;'
  sleep 1
  echo 'optimize table sbtest.sbtest2;'
  mysql -uroot -p123456 -e'optimize table sbtest.sbtest2;'
  sleep 2

4: we execute show processlist in slave to find whether the MTS deadlock has happened?
[21 Jan 6:58] Jean-François Gagné
I suggest the following workaround: replace your optimize table with an ALTER TABLE ENGINE INNODB.  Another work-arount could be to disable slave_preserve_commit_order, but you might not want to do that.

It is not actually the optimize table that cause the deadlock, it is a combinasion of enabling slave_preserve_commit_order and of the optimize not being considered as a barrier when generating the logical clocks (last_committed and sequence_number), but I think an ALTER TABLE is.

Long story made short, DDL should not be run in parallel on slaves.  When written to the binlogs, a DDL (including an ALTER TABLE or a CREATE TABLE) should force all following last_committed to be greater or equal to the sequence_number of this DDL.  Think of it this way: you cannot run an INSER in a TABLE before the CREATE TABLE is run.
[7 Mar 4:14] Bogdan Kecman

Not a bug, a design limitation maybe but a bug - no. 
Jean-François already explained why (Thanks!!). 


p.s. this type of issues are handled differently (more safely) in 8.x due to DDL's being transactional
[9 Mar 23:35] Bogdan Kecman

I discussed this a little bit with colleagues and I change my mind, I think this actually is a bug. Probbly only that's not going to be fixed since 8.x fixes it and to fix in 5.7 we'd need a change too big to allow to enter into 5.7 but still I'll verify it.

all best