| Bug #89370 | semi-sync replication doesn't work for minutes after restart replication | ||
|---|---|---|---|
| Submitted: | 24 Jan 2018 6:47 | Modified: | 2 Jul 2018 13:55 |
| Reporter: | 黄 炎-爱可生 (OCA) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
| Version: | 5.7.16, 5.7.17 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
| Tags: | Contribution | ||
[24 Jan 2018 6:49]
黄 炎-爱可生
patch
Attachment: mysql_89370.diff (application/octet-stream, text), 974 bytes.
[9 Feb 2018 2:45]
miaoxia cao
Anybody respond?
[28 Feb 2018 6:13]
MySQL Verification Team
Hello Yan Huang, Thank you for the report and contribution. Please ensure to re-send the patch via "Contributions" tab. Otherwise we would not be able to accept it. Thanks, Umesh
[28 Feb 2018 6:15]
黄 炎-爱可生
patch (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: mysql_89370.diff (application/octet-stream, text), 974 bytes.
[28 Feb 2018 6:16]
黄 炎-爱可生
Hello Umesh,
Patch is resent to Contributions tab.
Thanks,
Yan Huang
[28 Feb 2018 8:16]
MySQL Verification Team
test results - 5.7.17
Attachment: 89370.results (application/octet-stream, text), 39.90 KiB.
[2 Jul 2018 13:55]
Margaret Fisher
Posted by developer: Changelog entry updated to actual releases 5.7.24 and 8.0.13.

Description:
Set up semi-sync replication with 1 master and 2 slaves. After replication is restarted on one slave, no data is replicated for minutes.

It seems that the ack_receiver main loop starves other threads (dump threads) that are waiting on ack_receiver.m_mutex. The ack_receiver main loop looks like this:

```
void Ack_receiver::run()
{
  ...
  while(1)
  {
    mysql_mutex_lock(&m_mutex);
    ...
    select(...);
    ...
    mysql_mutex_unlock(&m_mutex);
  }
  ...
}
```

How to repeat:
1. Set up semi-sync replication with 1 master and 2 slaves. Keep inserting data on master.
2. On master, `set global rpl_semi_sync_master_wait_for_slave_count=2`
3. On master, check that `Rpl_semi_sync_master_status` is "ON"
4. On master, `set global rpl_semi_sync_master_wait_for_slave_count=1`
5. On slave1, `stop slave; start slave;`
6. On slave1, keep checking master status; the GTID will not change for minutes

Suggested fix:
The attached patch adds a sleep(1us) every second, giving other threads a chance to get the mutex.

```
void Ack_receiver::run()
{
  ...
  while(1)
  {
    mysql_mutex_lock(&m_mutex);
    ...
    select(...);
    ...
    mysql_mutex_unlock(&m_mutex);
    //sleep 1us every second here
  }
  ...
}
```

It should use a monotonic clock, but I found no helper function for that yet. Sleeping 1us per second won't impact performance.
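
The monotonic-clock variant mentioned in the suggested fix could be sketched roughly as follows. This is a minimal standalone illustration, not the attached patch: `PeriodicYield`, `due` and `maybe_sleep` are hypothetical names, and it assumes `std::chrono::steady_clock` from the C++ standard library is acceptable in place of a MySQL helper.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper (not part of MySQL): tracks when the loop last
// paused, so it can briefly sleep once per interval outside the mutex.
class PeriodicYield {
 public:
  // steady_clock is monotonic: it is not affected by wall-clock changes.
  using Clock = std::chrono::steady_clock;

  explicit PeriodicYield(Clock::duration interval,
                         Clock::time_point start = Clock::now())
      : interval_(interval), last_(start) {
    assert(interval > Clock::duration::zero());
  }

  // Returns true if at least one interval has passed since the last
  // pause, and records `now` as the new reference point.
  bool due(Clock::time_point now) {
    if (now - last_ < interval_) return false;
    last_ = now;
    return true;
  }

  // Intended to be called right after the mutex is unlocked:
  // sleeps for about 1us when a pause is due, otherwise returns at once.
  void maybe_sleep() {
    if (due(Clock::now()))
      std::this_thread::sleep_for(std::chrono::microseconds(1));
  }

 private:
  Clock::duration interval_;
  Clock::time_point last_;
};
```

In a loop shaped like the one in the description, a `PeriodicYield` constructed with `std::chrono::seconds(1)` would have `maybe_sleep()` called immediately after `mysql_mutex_unlock(&m_mutex)`, so the pause happens while the mutex is free and waiting dump threads can acquire it.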