Bug #84592 Hang in query end state
Submitted: 21 Jan 2017 0:44 Modified: 14 Apr 2017 8:28
Reporter: Roel Van de Paar Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.7.16 OS:Any
Assigned to: CPU Architecture:Any
Tags: hang

[21 Jan 2017 0:44] Roel Van de Paar
Description:
mysql> SHOW FULL PROCESSLIST;
+----+------+-----------+------+---------+------+-----------+-------------------------------------------------------------------------------------------------------+
| Id | User | Host      | db   | Command | Time | State     | Info                                                                                                  |
+----+------+-----------+------+---------+------+-----------+-------------------------------------------------------------------------------------------------------+
|  2 | root | localhost | test | Query   |   42 | query end | CREATE EVENT a ON SCHEDULE EVERY 30 SECOND DO SET @id:=(SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST) |
|  3 | root | localhost | test | Query   |    0 | starting  | SHOW FULL PROCESSLIST                                                                                 |
+----+------+-----------+------+---------+------+-----------+-------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

[...]

mysql> SHOW FULL PROCESSLIST;
+----+------+-----------+------+---------+------+-----------+-------------------------------------------------------------------------------------------------------+
| Id | User | Host      | db   | Command | Time | State     | Info                                                                                                  |
+----+------+-----------+------+---------+------+-----------+-------------------------------------------------------------------------------------------------------+
|  2 | root | localhost | test | Query   |  192 | query end | CREATE EVENT a ON SCHEDULE EVERY 30 SECOND DO SET @id:=(SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST) |
|  4 | root | localhost | test | Query   |    0 | starting  | SHOW FULL PROCESSLIST                                                                                 |
+----+------+-----------+------+---------+------+-----------+-------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

How to repeat:
1) Start mysqld with these options;

--no-defaults --plugin-load-add=tokudb=ha_tokudb.so --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --sql_mode=ONLY_FULL_GROUP_BY --log-bin --server-id=0 --binlog-group-commit-sync-delay=2

2) Execute the following at the command line (mysql>)

CREATE EVENT a ON SCHEDULE EVERY 30 SECOND DO SET @id:=(SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST);

The 30 seconds is arbitrary/not specific; it can be any value. CREATE EVENT may or may not be related. Note event scheduler (--event-scheduler) is not active/used! It could be that queries will trigger the same hang.

3) mysql client will hang
[21 Jan 2017 0:55] Roel Van de Paar
The plugin-load-add is not required (and not available in MySQL server)
[21 Jan 2017 0:57] Roel Van de Paar
http://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html#sysvar_binlog_g...
[21 Jan 2017 1:06] Roel Van de Paar
DROP DATABASE test;

As a query works just as well;

mysql> show full processlist;
+----+------+-----------+------+---------+------+-----------+-----------------------+
| Id | User | Host      | db   | Command | Time | State     | Info                  |
+----+------+-----------+------+---------+------+-----------+-----------------------+
|  2 | root | localhost | NULL | Query   |   43 | query end | DROP DATABASE test    |
|  5 | root | localhost | NULL | Query   |    0 | starting  | show full processlist |
+----+------+-----------+------+---------+------+-----------+-----------------------+
2 rows in set (0.00 sec)
[21 Jan 2017 1:07] Roel Van de Paar
mysql> show full processlist;
+----+------+-----------+------+---------+------+-----------+-----------------------+
| Id | User | Host      | db   | Command | Time | State     | Info                  |
+----+------+-----------+------+---------+------+-----------+-----------------------+
|  2 | root | localhost | NULL | Query   |  427 | query end | DROP DATABASE test    |
|  5 | root | localhost | NULL | Query   |    0 | starting  | show full processlist |
+----+------+-----------+------+---------+------+-----------+-----------------------+
2 rows in set (0.00 sec)
[21 Jan 2017 1:55] Roel Van de Paar
--no-defaults --log-bin --server-id=0 --binlog-group-commit-sync-delay=1

and 

DROP DATABASE test;

Is the minimal testcase.
[21 Jan 2017 7:56] MySQL Verification Team
Hello Roel,

Thank you for the report.
I confirmed reported issue with 5.7.16 but it is not repeatable with 5.7.17 build. Could you please confirm with 5.7.17?

-- 5.7.16

[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.7.16: bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.16-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@localhost [(none)]> create database test;

^C^C -- query aborted
^C^C -- query aborted
^Z
[2]+  Stopped                 bin/mysql -uroot -S /tmp/mysql_ushastry.sock

-- 5.7.17

[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.7.17: bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@localhost [(none)]>  create database test;
Query OK, 1 row affected (0.00 sec)

root@localhost [(none)]> show processlist;
+----+------+-----------+------+---------+------+----------+------------------+
| Id | User | Host      | db   | Command | Time | State    | Info             |
+----+------+-----------+------+---------+------+----------+------------------+
|  3 | root | localhost | NULL | Query   |    0 | starting | show processlist |
+----+------+-----------+------+---------+------+----------+------------------+
1 row in set (0.00 sec)

Thanks,
Umesh
[23 Jan 2017 13:27] MySQL Verification Team
https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-17.html

"Replication: When binlog_group_commit_sync_delay was set to a value between 1 and 9, if binlog_group_commit_sync_no_delay_count was set to a value greater than 1, and the number of transaction commits was less than binlog_group_commit_sync_no_delay_count, these commits hung forever if no more commits were received; and if binlog_group_commit_sync_no_delay_count was set to 0, all transaction commits hung forever. (Bug #80652, Bug #22891628)"
[31 Jan 2017 19:10] Roel Van de Paar
Thanks Shane. Please set this bug to duplicate.
[6 Apr 2017 11:11] Aaron Greenspan
I have just upgraded from MySQL 5.5 to 5.7.17, and suddenly extremely simple queries such as...

UPDATE LOW_PRIORITY `dockets` SET `hits`=`hits`+1, `timestamp`=`timestamp` WHERE `docketid`='9088741' LIMIT 1

...that affect one field in one row by adding 1 take about a minute to execute. And about 100 of these are running at any given time, clogging up the server. They tend to get stuck in the "query end" state.

I'm also seeing error log messages such as:

2017-04-06T10:46:15.382240Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 1239860ms. The settings might not be optimal. (flushed=201 and evicted=0, during the time.)
2017-04-06T11:01:09.878822Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 59888ms. The settings might not be optimal. (flushed=201 and evicted=0, during the time.)
2017-04-06T11:02:29.721566Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 60891ms. The settings might not be optimal. (flushed=201 and evicted=0, during the time.)

The settings started off exactly the same as they were in MySQL 5.5, and even after trying to further optimize them, I've had no luck.

I'm not familiar enough with MySQL's internals to debug this in depth, but this is a serious problem for my server (and site).
[14 Apr 2017 8:28] Roel Van de Paar
Aaron, your comment is not related to this ticket.
[7 Dec 2017 15:28] Martin Arrieta
I have a similar issue on 5.7.18. 

Attaching some information:

mysql> \s
--------------
mysql  Ver 14.14 Distrib 5.7.18, for Linux (x86_64) using  EditLine wrapper

Connection id:          936226804
Current database:       db1
Current user:           user@localhost
SSL:                    Not in use
Current pager:          grep "query end"
Using outfile:          ''
Using delimiter:        ;
Server version:         5.7.18-log MySQL Community Server (GPL)
Protocol version:       10
Connection:             Localhost via UNIX socket
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
UNIX socket:            /var/lib/mysql/mysql.sock
Uptime:                 85 days 10 hours 25 min 53 sec

Threads: 68  Questions: 53562107154  Slow queries: 548680  Opens: 1247090  Flush tables: 1  Open tables: 4096  Queries per second avg: 7256.211
--------------

mysql> pager grep "query end"
PAGER set to 'grep "query end"'
mysql> show processlist;
| 111419382 | user1 | 192...... | db1 | Query  | 6346207 | query end | SELECT `table1`               |
| 235409559 | user1 | 10.1..... | db1 | Killed | 5280385 | query end | SELECT `table1`               |
| 270337879 | user1 | 192...... | db1 | Killed | 5003621 | query end | SELECT * FROM                 |
| 427280680 | user1 | 10.1..... | db1 | Killed | 3832668 | query end | SELECT *                      |
| 459581197 | user1 | 192...... | db1 | Killed | 3624365 | query end | SELECT * FROM category        |
| 464837749 | user1 | 10.1..... | db1 | Killed | 3587469 | query end | SELECT `table1`               |
| 642524118 | user1 | 192...... | db1 | Query  | 2115045 | query end | SELECT id,                    |
| 705051730 | user1 | 192...... | db1 | Query  | 1542095 | query end | SELECT *                      |
| 719941394 | user1 | 10.1..... | db1 | Query  | 1421614 | query end | SELECT `table1`               |
| 829353249 | user2 | 192...... | db1 | Query  |  866291 | query end | SELECT field1, field2, field3 |

mysql> show global variables like 'binlog_group_%';
+-----------------------------------------+-------+
| Variable_name                           | Value |
+-----------------------------------------+-------+
| binlog_group_commit_sync_delay          | 0     |
| binlog_group_commit_sync_no_delay_count | 0     |
+-----------------------------------------+-------+
2 rows in set (0.00 sec)

mysql> show global variables like '%bin%';
+-----------------------------------------+-----------------------------------+
| Variable_name                           | Value                             |
+-----------------------------------------+-----------------------------------+
| bind_address                            | 192.168.1.19                      |
| binlog_cache_size                       | 32768                             |
| binlog_checksum                         | CRC32                             |
| binlog_direct_non_transactional_updates | OFF                               |
| binlog_error_action                     | ABORT_SERVER                      |
| binlog_format                           | MIXED                             |
| binlog_group_commit_sync_delay          | 0                                 |
| binlog_group_commit_sync_no_delay_count | 0                                 |
| binlog_gtid_simple_recovery             | ON                                |
| binlog_max_flush_queue_time             | 0                                 |
| binlog_order_commits                    | ON                                |
| binlog_row_image                        | FULL                              |
| binlog_rows_query_log_events            | OFF                               |
| binlog_stmt_cache_size                  | 32768                             |
| innodb_api_enable_binlog                | OFF                               |
| innodb_locks_unsafe_for_binlog          | OFF                               |
| log_bin                                 | ON                                |
| log_bin_basename                        | /var/lib/mysqllogs/bin-log        |
| log_bin_index                           | /var/lib/mysqllogs/bin-log.index  |
| log_bin_trust_function_creators         | OFF                               |
| log_bin_use_v1_row_events               | OFF                               |
| log_statements_unsafe_for_binlog        | ON                                |
| max_binlog_cache_size                   | 18446744073709547520              |
| max_binlog_size                         | 1073741824                        |
| max_binlog_stmt_cache_size              | 18446744073709547520              |
| sync_binlog                             | 0                                 |
+-----------------------------------------+-----------------------------------+
26 rows in set (0.00 sec)