Bug #88964 binlog deadlock appears while backing up MTS slave
Submitted: 19 Dec 2017 8:10 Modified: 26 Dec 2017 10:21
Reporter: lisa lisa Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.7.20 OS:CentOS (6)
Assigned to: MySQL Verification Team CPU Architecture:Any

[19 Dec 2017 8:10] lisa lisa
Description:
variables :slave_preserve_commit_order=1
innobackupex:2.4.5 Linux (x86_64) (revision id: e41c0be)
command:sudo ${innodbackup}  --defaults-file=${mycnf} --user=${user} --password=${password}  --slave-info --lock-wait-threshold=5 --lock-wait-query-type=all --lock-wait-timeout=60 --ftwrl-wait-timeout=60 --ftwrl-wait-threshold=60 --stream=tar /opt/tmp --no-timestamp  --socket=${socket} | lz4 -B4 | pv -q -L${netlimit} | nc $localhost $port
we have two problem, one list:
the host show processlist info
2017-12-15 09:55:28	297531	system user	 	 	Connect	1961580	Waiting for an event from Coordinator	 	1961580684	0
2017-12-15 09:55:28	297528	system user	 	 	Connect	1961580	Waiting for an event from Coordinator	 	1961580686	0
2017-12-15 09:55:28	297529	system user	 	 	Connect	1961580	Waiting for an event from Coordinator	 	1961580685	0
2017-12-15 09:55:28	297526	system user	 	 	Connect	13826	Waiting for an event from Coordinator	 	13826667	0
2017-12-15 09:55:28	297527	system user	 	 	Connect	26859	Waiting for an event from Coordinator	 	26858667	0
2017-12-15 09:55:28	297524	system user	 	 	Connect	10047	Waiting for backup lock	insert into sbtest      10047471	0
2017-12-15 09:55:28	297525	system user	 	 	Connect	10047	Waiting for preceding transaction to commit	 	10047470	0
2017-12-15 09:55:28	2006785	backup	localhost	 	Query	9440	Waiting for binlog lock	LOCK BINLOG FOR BACKUP	9440487	0
2017-12-15 09:55:28	297522	system user	 	 	Connect	1961580	Waiting for master to send event	 	1961580688	0
2017-12-15 09:55:28	297523	system user	 	 	Connect	10047	Waiting for dependent transaction to commit	 	10047468	0
it happen deadlock, lock long time util we kill thread no 2006785.sbtest is MyiSAM,backup execute "lock table" before "lock binlog".the slave thread had lock binlog.

another one:
the host show processlist info
2017-12-14 06:24:18	588	system user	 	 	Connect	3750333	Waiting for an event from Coordinator	 	3750332324	0
2017-12-14 06:24:18	586	system user	 	 	Connect	3750333	Waiting for an event from Coordinator	 	3750332325	0
2017-12-14 06:24:18	587	system user	 	 	Connect	3750333	Waiting for an event from Coordinator	 	3750332324	0
2017-12-14 06:24:18	584	system user	 	 	Connect	3750333	Waiting for an event from Coordinator	 	3750332325	0
2017-12-14 06:24:18	585	system user	 	 	Connect	3750333	Waiting for an event from Coordinator	 	3750332325	0
2017-12-14 06:24:18	582	system user	 	 	Connect	2515123	Waiting for an event from Coordinator	 	2515122828	0
2017-12-14 06:24:18	583	system user	 	 	Connect	3750333	Waiting for an event from Coordinator	 	3750332326	0
2017-12-14 06:24:18	580	system user	 	 	Connect	1839	Waiting for an event from Coordinator	 	1838976	0
2017-12-14 06:24:18	581	system user	 	 	Connect	9613	Waiting for an event from Coordinator	 	9612453	0
2017-12-14 06:24:18	578	system user	 	 	Connect	1304	Waiting for an event from Coordinator	 	1303357	0
2017-12-14 06:24:18	579	system user	 	 	Connect	1308	Waiting for an event from Coordinator	 	1308041	0
2017-12-14 06:24:18	576	system user	 	 	Connect	1292	Waiting for preceding transaction to commit	 	1291873	0
2017-12-14 06:24:18	577	system user	 	 	Connect	1294	Waiting for an event from Coordinator	 	1293441	0
2017-12-14 06:24:18	574	system user	 	 	Connect	1292	Waiting for preceding transaction to commit	 	1291873	0
2017-12-14 06:24:18	575	system user	 	 	Connect	1292	Waiting for preceding transaction to commit	 	1291873	0
2017-12-14 06:24:18	573	system user	 	 	Connect	1292	Waiting for binlog lock	 	1291704	0
2017-12-14 06:24:18	9412666	backup	localhost	 	Query	1292	Waiting for binlog lock	LOCK BINLOG FOR BACKUP	1291869	0
2017-12-14 06:24:18	571	system user	 	 	Connect	3750333	Waiting for master to send event	 	3750332332	0
2017-12-14 06:24:18	572	system user	 	 	Connect	1292	Waiting for dependent transaction to commit

it seem,the thread that Waiting for preceding transaction to commit had lock binlog。it 
 lead to deadlock while backing up MTS slave

How to repeat:
master variables:binlog_group_commit_sync_delay=1000000 (max)
slave  variables:slave_preserve_commit_order=1
                        slave_parallel_workers=4
                        slave_parallel_type='LOGICAL_CLOCK'
you insert a innodb table and a MyiSaM table.
table:
CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `pad` char(60) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8
CREATE TABLE `sbtest2` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `pad` char(60) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k` (`k`)
) ENGINE=MyISAMAUTO_INCREMENT=40 DEFAULT CHARSET=utf8
slave session 1:lock table for backup;
slave session 2:insert sbtest2(k) values(1),(2),(3),(4),(5),(6);
slave session 3:insert sbtest1(k) values(1);
show processlist can see a thread in Waiting for preceding transaction to commit。
slave session 1:lock binlog for backup;
[22 Dec 2017 10:12] MySQL Verification Team
Hi,

What backup binary are you using? Is this mysql enterprise backup or we are talking about percona/maria backup binary?

Can you reproduce this without MyISAM tables?

thanks
Bogdan
[26 Dec 2017 6:19] lisa lisa
percona backup
[26 Dec 2017 6:25] lisa lisa
it which reproduce this without MyISAM tables is not easy.it happen when replicating  a slow commit in group commit,other commit in group commit must waiting,the backup want lock binlog .
[26 Dec 2017 10:21] MySQL Verification Team
I cannot reproduce this with MySQL Enterprise Backup. As for percona backup app I can't say if it works or not, you have to report a bug to Percona.

all best
Bogdan