Bug #87235 | "STOP GROUP_REPLICATION" cause deadlock | ||
---|---|---|---|
Submitted: | 28 Jul 2017 7:33 | Modified: | 26 Sep 2017 14:23 |
Reporter: | Seunguck Lee | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
Version: | 5.7.19 | OS: | CentOS (Linux localhost 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86) |
Assigned to: | CPU Architecture: | Any |
[28 Jul 2017 7:33]
Seunguck Lee
[28 Jul 2017 7:34]
Seunguck Lee
Stack trace
Attachment: GroupReplication-Deadlock.txt (text/plain), 61.58 KiB.
[2 Aug 2017 1:46]
jianxin chen
meet the same situation : - single primary mode - stop group replication on primary node - schedule an event to insert row into one table - dead lock occur processlist: mysql> select * from information_schema.processlist order by time desc; +------+-----------------+-----------+------+---------+------+------------------------------+-----------------------------------------------------------------+ | ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO | +------+-----------------+-----------+------+---------+------+------------------------------+-----------------------------------------------------------------+ | 5283 | root | localhost | NULL | Sleep | 1341 | | NULL | | 5316 | root | localhost | NULL | Query | 152 | starting | stop group_replication | | 5657 | root | localhost | test | Connect | 145 | query end | insert into test.t3 values(null) | | 5658 | mysql.session | localhost | NULL | Query | 144 | Waiting for global read lock | PLUGIN: SET GLOBAL super_read_only= 1 | | 5291 | event_scheduler | localhost | NULL | Daemon | 135 | Waiting for global read lock | NULL | | 5661 | root | localhost | NULL | Query | 0 | executing | select * from information_schema.processlist order by time desc | +------+-----------------+-----------+------+---------+------+------------------------------+-----------------------------------------------------------------+ 6 rows in set (0.00 sec) when I use "kill 5658" to kill the internal query on MGR "PLUGIN: SET GLOBAL super_read_only= 1 ", deadlock disappear...
[2 Aug 2017 1:48]
jianxin chen
pstack when deadlock occur
Attachment: pstack.log (application/octet-stream, text), 24.43 KiB.
[2 Aug 2017 1:56]
jianxin chen
what's worse, when kill the session of "set global super_read_only = 1", data inconsistency occur when I try to rejoin this node back to the group
[21 Aug 2017 18:23]
MySQL Verification Team
Hi, Thanks for the report. I cannot reliably reproduce this issue but I did manage to reproduce it few times. Looks like the freeze happens when doing stop replication while few sessions are doing insert and few sessions are doing select. That is a "normal" state of any db system (sessions running read/write), so I was not able to reliably find a 100% reproduction case. If any of you is able to offer an easy to reproduce test case I'd very much appreciate (that would for sure speed up fixing of this bug). Thanks Bogdan
[22 Aug 2017 1:13]
jianxin chen
Hi Bogdan Kecman, maybe you can use sysbench transaction write test to reproduce this situation.
[26 Sep 2017 11:37]
MySQL Verification Team
not reproducible any more with 5.7.20
[26 Sep 2017 14:23]
Anibal Pinto
Posted by developer: This bug was fixed on wl#10960 Backport wl#10611 to MySQL 5.7.20.