Bug #91055 commit will hang if binlog_group_commit_sync_delay is not a multiple of 10
Submitted: 29 May 2018 6:18
Modified: 15 Jun 2018 13:48
Reporter: Yan Huang (OCA)
Status: Closed
Category: MySQL Server: Options
Severity: S2 (Serious)
Version: 5.7.21, 5.7.19, 5.7.22
OS: Any
Assigned to:
CPU Architecture: Any
Tags: Contribution

[29 May 2018 6:18] Yan Huang
Description:
A commit hangs if binlog_group_commit_sync_delay is not a multiple of 10.

How to repeat:
1. Prepare a mysqld with binary logging enabled.
2. Execute:
> set global binlog_group_commit_sync_delay=32;
> set global binlog_group_commit_sync_no_delay_count=0;
> create table test.t(a int); -- hangs

Suggested fix:
In `Stage_manager::wait_count_or_timeout`, `to_wait` could be changed from ulong to long. It will not exceed LONG_MAX, because the maximum value of binlog_group_commit_sync_delay is 1000000.
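
To illustrate why the type change helps, here is a minimal sketch of the countdown, not the server code itself. The helper name `steps_until_timeout`, the `max_steps` cap, and the exact delta formula (one tenth of the delay, truncated, with a floor of 1) are assumptions made for this example. The loop never sleeps; it only counts polling intervals.

```cpp
#include <algorithm>

// Hypothetical helper, not taken from the server source. It counts how many
// polling intervals pass before the remaining wait reaches zero, and returns
// -1 if the loop is still running after max_steps (the "hang" case).
template <typename Remaining>
long steps_until_timeout(unsigned long usec, long max_steps = 1000) {
  Remaining to_wait = static_cast<Remaining>(usec);
  // Assumed delta: one tenth of the delay, truncated, at least 1 microsecond.
  const Remaining delta =
      static_cast<Remaining>(std::max<unsigned long>(1, usec / 10));
  long steps = 0;
  while (to_wait > 0) {
    if (steps == max_steps) return -1;  // give up instead of spinning forever
    to_wait -= delta;  // wraps to a huge value if Remaining is unsigned
    ++steps;
  }
  return steps;
}
```

Under these assumptions, `steps_until_timeout<unsigned long>(32)` returns -1 because the remaining wait goes from 2 to a wrapped value near ULONG_MAX, while `steps_until_timeout<long>(32)` returns 11 because the remaining wait becomes -1 and the loop ends.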
[29 May 2018 6:19] Yan Huang
patch

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 91055.diff (application/octet-stream, text), 442 bytes.

[29 May 2018 7:38] MySQL Verification Team
Hello Yan Huang,

Thank you for the report and contribution.

Thanks,
Umesh
[15 Jun 2018 13:48] Margaret Fisher
Posted by developer:
 
Changelog entry added for MySQL 5.7.24 and 8.0.13:

When the binlog_group_commit_sync_delay system variable is set to a wait time to delay synchronization of transactions to disk, and the binlog_group_commit_sync_no_delay_count system variable is also set to a number of transactions, the MySQL server exits the wait procedure if the specified number of transactions is reached before the specified wait time is reached. The server manages this process by checking on the transaction count after a delta of one tenth of the time specified by binlog_group_commit_sync_delay has elapsed, then subtracting that interval from the remaining wait time.

If rounding during calculation of the delta meant that the wait time was not a multiple of the delta, the final subtraction of the delta from the remaining wait time would cause the value to be negative, and therefore to wrap to the maximum wait time, making the commit hang. The data type for the remaining wait time has now been changed so that the value does not wrap in this situation, and the commit can proceed when the original wait time has elapsed. Thanks to Yan Huang for the contribution.
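
The condition described in the changelog entry can be sketched as a small predicate. This is an illustration only: the function name `delay_overshoots_zero` is hypothetical, and the delta formula (one tenth of the delay, truncated to whole microseconds, with a floor of 1) is an assumption rather than something stated in this report.

```cpp
#include <algorithm>

// Hypothetical predicate, not part of the server. Returns true when counting
// down from usec in steps of the assumed delta would step past zero, which
// is the situation in which an unsigned remaining wait time wraps.
inline bool delay_overshoots_zero(unsigned long usec) {
  // Assumed delta: usec / 10 with integer truncation, never less than 1.
  const unsigned long delta = std::max<unsigned long>(1, usec / 10);
  return usec % delta != 0;
}
```

Under this model, a delay of 32 overshoots (the delta is 3 and 32 is not a multiple of 3), while delays of 30, 33, and 5 do not, because each is an exact multiple of its own delta.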