Bug #89042 | START GR HANGS AND WITH THAT NO NEW CONNECTION IS ALLOWED ON THAT SERVER | | |
---|---|---|---|
Submitted: | 23 Dec 2017 6:47 | Modified: | 22 Feb 2018 11:05 |
Reporter: | Narendra Singh Chauhan | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
Version: | 8.0.5 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[23 Dec 2017 6:47]
Narendra Singh Chauhan
[3 Jan 2018 17:22]
Anibal Pinto
Posted by developer:

In rpl_group_replication.cc, function group_replication_start, we have:

    103     In order to prevent a concurrent client from executing SET
    104     GTID_MODE=ON_PERMISSIVE between 1 and 2, we must hold
    105     gtid_mode_lock.
    106   */
    107   gtid_mode_lock->rdlock();

When the statement SET GLOBAL GTID_MODE=ON is executed, it blocks while waiting to acquire gtid_mode_lock. Part of the stack trace:

    #4 Sys_var_gtid_mode::global_update at sql/sys_vars.cc:4007
    #5 sys_var::update at sql/set_var.cc:254

In sys_vars.cc we have:

    4007 gtid_mode_lock->wrlock();

That lock is already held by group_replication_start. As a consequence, SET GLOBAL read_only=0 blocks in sql/set_var.cc:

    252 AutoWLock lock1(&PLock_global_system_variables);

because PLock_global_system_variables is already held by the statement SET GLOBAL GTID_MODE=ON, as can be seen in the stack trace.

The best approach seems to be to change SET GLOBAL GTID_MODE=ON so that it returns an error if it is not able to obtain the lock.
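To illustrate the proposed approach, here is a minimal sketch of a setter that fails fast instead of waiting for gtid_mode_lock. It is not the server's code: std::shared_mutex stands in for the server's read/write locks, and the names SetResult, global_sysvar_lock, set_gtid_mode and set_during_group_replication_start are hypothetical, chosen only for this example.

```cpp
#include <future>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical stand-ins for the two server locks named in the analysis.
std::shared_mutex gtid_mode_lock;      // read-locked by group_replication_start
std::shared_mutex global_sysvar_lock;  // models PLock_global_system_variables

enum class SetResult { kOk, kLockBusy };

// Sketch of SET GLOBAL GTID_MODE that returns an error when gtid_mode_lock
// is busy, so the system-variables lock is released again right away.
SetResult set_gtid_mode(std::string &gtid_mode, const std::string &value) {
  std::unique_lock<std::shared_mutex> sysvars(global_sysvar_lock);
  // try_to_lock never waits; owns_lock() reports whether we got the lock.
  std::unique_lock<std::shared_mutex> gtid(gtid_mode_lock, std::try_to_lock);
  if (!gtid.owns_lock()) return SetResult::kLockBusy;
  gtid_mode = value;
  return SetResult::kOk;
}

// Replays the reported scenario: one session holds the read lock (as
// group_replication_start does) while another session runs the SET.
SetResult set_during_group_replication_start(std::string &gtid_mode) {
  std::shared_lock<std::shared_mutex> start_gr(gtid_mode_lock);
  auto session = std::async(std::launch::async, [&gtid_mode] {
    return set_gtid_mode(gtid_mode, "ON");
  });
  return session.get();  // completes because the SET does not block
}
```

With a blocking wrlock() in place of the try-lock, the second session would keep holding the system-variables lock while it waits, which is the condition under which SET GLOBAL read_only=0 stalls in the analysis above.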
[22 Feb 2018 11:05]
David Moss
Posted by developer:

Thank you for your feedback. This has been fixed in upcoming versions, and the following was added to the 8.0.11 changelog:

After issuing START GROUP_REPLICATION the GTID_MODE is locked to prevent any modification to its value until the group is online. Any attempt to try and change GTID_MODE during this time is blocked. As part of the process of starting Group Replication the server needs to set super_read_only=off, which has dependencies on locks acquired by SET GTID_MODE. This could result in Group Replication hanging and there was no possibility to connect to the server to resolve the situation. To prevent this situation, when it is not possible to acquire the locks needed by SET GTID_MODE the operation aborts.
[31 May 2018 14:53]
David Moss
Reclosing.