Bug #86691 all threads are hanged in group replication
Submitted: 14 Jun 2017 7:38
Modified: 28 Jul 2017 8:13
Reporter: zte zte
Status: No Feedback
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.7.18
OS: CentOS
Assigned to: MySQL Verification Team
CPU Architecture: Any
Tags: group replication, thread hanged

[14 Jun 2017 7:38] zte zte
Description:
Symptom:
With MySQL 5.7.18 Group Replication, 100 threads insert data on the master node. After some time, all threads hang, and no further data is inserted.

Stack trace shown by gdb:
#0  0x000000376500b5bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007eff487b552e in Wait_ticket<unsigned int>::waitTicket(unsigned int const&) () from /home/rdb/lib/plugin/group_replication.so

show processlist\G
| 17711 | root        | centos177:56649 | per  | Query   |  4216 | starting                                               | commit           |
| 17712 | root        | centos177:56651 | per  | Query   |  4214 | starting                                               | commit           |
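
The processlist rows above show sessions that have been sitting in `commit` for more than 4,000 seconds. A minimal Python sketch for picking such sessions out of pasted output follows. It assumes the pipe-delimited, eight-column layout shown above (Id, User, Host, db, Command, Time, State, Info); the function name `stuck_commits` and the 60-second default threshold are illustrative choices, not part of MySQL.

```python
def stuck_commits(processlist_text, min_seconds=60):
    """Return (thread_id, seconds) for rows whose Info column is 'commit'
    and whose Time column is at least min_seconds."""
    stuck = []
    for line in processlist_text.splitlines():
        # Drop the outer pipes, then split the row into its cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip headers, separators, blank lines, and malformed rows.
        if len(cells) != 8 or not cells[0].isdigit() or not cells[5].isdigit():
            continue
        thread_id, seconds, info = int(cells[0]), int(cells[5]), cells[7]
        if info.lower() == "commit" and seconds >= min_seconds:
            stuck.append((thread_id, seconds))
    return stuck


# Rows copied from the report above (State column padding shortened).
sample = """
| 17711 | root | centos177:56649 | per | Query | 4216 | starting | commit |
| 17712 | root | centos177:56651 | per | Query | 4214 | starting | commit |
"""
print(stuck_commits(sample))
```

This only parses text that has already been captured; a statement whose Info column itself contains a pipe character would be skipped by the column-count check.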

show engine innodb status\G
MySQL thread id 17802, OS thread handle 139636232890112, query id 3138503 centos177 10.46.178.177 root starting
commit
---TRANSACTION 4516, ACTIVE (PREPARED) 4243 sec
1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 1000
MySQL thread id 17799, OS thread handle 139635293886208, query id 3138359 centos177 10.46.178.177 root starting
commit
---TRANSACTION 4515, ACTIVE (PREPARED) 4243 sec
1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 1000
MySQL thread id 17798, OS thread handle 139634896103168, query id 3138407 centos177 10.46.178.177 root starting
commit
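
The InnoDB status excerpt above lists transactions that have stayed in the PREPARED state for 4243 seconds. As a rough sketch, assuming the `---TRANSACTION <id>, ACTIVE (PREPARED) <n> sec` line format shown above, the following Python snippet extracts those transactions from pasted output; `prepared_transactions` is a hypothetical helper name.

```python
import re

# Matches lines such as "---TRANSACTION 4516, ACTIVE (PREPARED) 4243 sec".
PREPARED_LINE = re.compile(r"^---TRANSACTION (\d+), ACTIVE \(PREPARED\) (\d+) sec")


def prepared_transactions(status_text):
    """Return (transaction_id, active_seconds) for each PREPARED transaction."""
    found = []
    for line in status_text.splitlines():
        match = PREPARED_LINE.match(line.strip())
        if match:
            found.append((int(match.group(1)), int(match.group(2))))
    return found


# Lines copied from the report above.
excerpt = """
commit
---TRANSACTION 4516, ACTIVE (PREPARED) 4243 sec
1 lock struct(s), heap size 1136, 0 row lock(s), undo log entries 1000
commit
---TRANSACTION 4515, ACTIVE (PREPARED) 4243 sec
"""
print(prepared_transactions(excerpt))
```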

How to repeat:
I don't know

Suggested fix:
Detect the deadlock and roll back those transactions.
[14 Jun 2017 17:34] MySQL Verification Team
Hi,

Were all 100 threads locked "indefinitely", or did they follow through after a timeout with an aborted transaction?

Can you share your config?

best regards
Bogdan
[19 Jun 2017 8:47] Nuno Carvalho
Hi,

Can you please check whether you are facing a network partition?
https://dev.mysql.com/doc/refman/5.7/en/group-replication-detecting-partitions.html
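
One way to approach the partition check is to read `performance_schema.replication_group_members` on each member and count how many members that node can still reach. The Python sketch below is a simplified illustration, not an official procedure: it assumes the `MEMBER_STATE` values have already been fetched with the query in `MEMBERS_QUERY`, and it treats every state other than `UNREACHABLE` as reachable. The helper name `has_majority` is hypothetical.

```python
# Query to run on each group member; the column names are those of the
# performance_schema.replication_group_members table in MySQL 5.7.
MEMBERS_QUERY = (
    "SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE "
    "FROM performance_schema.replication_group_members"
)


def has_majority(member_states):
    """Return True if this member still sees a majority of the group.

    member_states: MEMBER_STATE strings as reported by one member.
    Assumption: anything other than UNREACHABLE counts as reachable.
    """
    total = len(member_states)
    reachable = sum(1 for s in member_states if s.upper() != "UNREACHABLE")
    return total > 0 and reachable > total // 2


# Healthy three-node group versus a node cut off from its two peers.
print(has_majority(["ONLINE", "ONLINE", "ONLINE"]))
print(has_majority(["ONLINE", "UNREACHABLE", "UNREACHABLE"]))
```

If a member reports that it cannot reach a majority, writes on that member are expected to block, which would be consistent with sessions waiting in `commit`.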

In a future version, we will provide a mechanism to deal with this automatically.
Please check BUG#84727: GR: Partitioned Node Should Get Updated Status and not accept writes.

Best regards,
Nuno Carvalho
[28 Jun 2017 7:27] zte zte
All 3 nodes are online.
[29 Jul 2017 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".