| Bug #80652 | binary group commit hangs when binlog_group_commit_sync_delay is set to 1~9 | ||
|---|---|---|---|
| Submitted: | 8 Mar 2016 5:36 | Modified: | 15 Dec 2016 12:32 |
| Reporter: | Yang Lyn | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
| Version: | 5.7.5~5.7.11, 5.7.11 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | binlog_group_commit_sync_delay, Group Commit, hang | ||
[8 Mar 2016 9:13]
MySQL Verification Team
Hello Yang Lyn, Thank you for the report and test case. Verified as described with 5.7.11 build. Thanks, Umesh
[25 May 2016 16:59]
MySQL Verification Team
I hit this today on 5.7.12. The thread stacks during hang.. mysqld!Stage_manager::wait_count_or_timeout+0x68 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\binlog.cc @ 1943] mysqld!MYSQL_BIN_LOG::ordered_commit+0x277 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\binlog.cc @ 8851] mysqld!MYSQL_BIN_LOG::commit+0x4e2 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\binlog.cc @ 8066] mysqld!ha_commit_trans+0x1b5 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\handler.cc @ 1739] mysqld!trans_commit_stmt+0x3a [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\transaction.cc @ 459] mysqld!mysql_execute_command+0x392a [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 4958] mysqld!mysql_parse+0x25a [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 5525] mysqld!dispatch_command+0x63d [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 1432] mysqld!do_command+0x1ea [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 999] --------- ntdll!ZwWaitForKeyedEvent+0xa ntdll!RtlSleepConditionVariableCS+0xd4 kernel32!SleepConditionVariableCS+0x28 mysqld!Stage_manager::enroll_for+0x15e [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\binlog.cc @ 1894] mysqld!MYSQL_BIN_LOG::change_stage+0x21 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\binlog.cc @ 8381] mysqld!MYSQL_BIN_LOG::ordered_commit+0x250 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\binlog.cc @ 8839] mysqld!MYSQL_BIN_LOG::commit+0x4e2 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\binlog.cc @ 8066] mysqld!ha_commit_trans+0x1b5 [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\handler.cc @ 1739] mysqld!trans_commit_stmt+0x3a [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\transaction.cc @ 459] mysqld!mysql_execute_command+0x392a [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 4958] mysqld!mysql_parse+0x25a [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 5525] mysqld!dispatch_command+0x63d [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 1432] mysqld!do_command+0x1ea [g:\ade\build\sb_0-18410765-1459191343.73\mysql-5.7.12\sql\sql_parse.cc @ 999] mysql> show processlist; +-----+------+------+------+-------------+------+---------------------------------------------------------------+---------------------------- | Id | User | Host | db | Command | Time | State | Info +-----+------+------+------+-------------+------+---------------------------------------------------------------+---------------------------- | 491 | root | | test | Query | 379 | query end | delete from t limit 1 | 500 | root | | test | Query | 379 | query end | delete from s limit 1 | 503 | root | | test | Query | 379 | query end | insert ignore into `test`.` | 506 | ODBC | | test | Query | 0 | starting | show processlist | 515 | root | | NULL | Binlog Dump | 16 | Master has sent all binlog to slave; waiting for more updates | NULL +-----+------+------+------+-------------+------+---------------------------------------------------------------+----------------------------
[15 Dec 2016 12:32]
Erlend Dahl
[9 Dec 2016 6:30] David Moss Thank you for your feedback, this has been fixed in upcoming versions and the following was added to the 5.7.17 and 8.0.1 changelog: When binlog_group_commit_sync_delay was set to a value between 1 and 9, if binlog_group_commit_sync_no_delay_count was set to a value greater than 1, and the number of transaction commits was less than binlog_group_commit_sync_no_delay_count, these commits hung forever if no more commits were received; and if binlog_group_commit_sync_no_delay_count was set to 0, all transaction commits hung forever.

Description: When binlog_group_commit_sync_delay is set to 1~9: 1) if binlog_group_commit_sync_no_delay_count > 1, and the number of transaction commits is less than binlog_group_commit_sync_no_delay_count, these commits will hang forever if no more commits come in; 2) if binlog_group_commit_sync_no_delay_count = 0, all transaction commits will hang forever; How to repeat: CREATE TABLE t( id INT PRIMARY KEY, v INT NOT NULL) ENGINE INNODB; INSERT INTO t VALUES(1, 100), (2, 100), (3, 100), (4, 100); ##################### begin of case 1 ###################### SET GLOBAL binlog_group_commit_sync_delay = 8; SET GLOBAL binlog_group_commit_sync_no_delay_count = 2; # session 1 BEGIN; UPDATE t set v = v + 100 WHERE id = 1; COMMIT; # hangs here forever if no more commits come in # session 2: execute after session 1 hanging for a while BEGIN; UPDATE t set v = v + 100 WHERE id = 2; COMMIT; # both session 2 and session 1 will commit ###################### end of case 1 ####################### ##################### begin of case 2 ###################### SET GLOBAL binlog_group_commit_sync_delay = 8; SET GLOBAL binlog_group_commit_sync_no_delay_count = 0; # session 1 BEGIN; UPDATE t set v = v + 100 WHERE id = 3; COMMIT; # hangs here forever # session 2 BEGIN; UPDATE t set v = v + 100 WHERE id = 4; COMMIT; # hangs here forever ###################### end of case 2 ####################### Suggested fix: This bug is caused by function Stage_manager::wait_count_or_timeout in sql/binlig.cc; time_t Stage_manager::wait_count_or_timeout(ulong count, time_t usec, StageID stage) { time_t to_wait= DBUG_EVALUATE_IF("bgc_set_infinite_delay", LONG_MAX, usec); /* For testing purposes while waiting for inifinity to arrive, we keep checking the queue size at regular, small intervals. Otherwise, waiting 0.1 * infinite is too long. */ time_t delta= DBUG_EVALUATE_IF("bgc_set_infinite_delay", 100000, static_cast<time_t>(to_wait * 0.1)); while (to_wait > 0 && (count == 0 || static_cast<ulong>(m_queue[stage].get_size()) < count)) { #ifndef DBUG_OFF if (current_thd) DEBUG_SYNC(current_thd, "bgc_wait_count_or_timeout"); #endif my_sleep(delta); to_wait -= delta; } return to_wait; } when binlog_group_commit_sync_delay is set to 1~9, the "delta" will be evaluated to 0, and thus the while loop won't terminate properly. BUG fix suggestion: when binlog_group_commit_sync_delay is set to 1~9, the "delta" should be set to 1 or binlog_group_commit_sync_delay