Bug #100112 Improve MY-013417 error to include username, host:port or socket and wait settin
Submitted: 4 Jul 2020 16:49 Modified: 9 Feb 2021 16:22
Reporter: Simon Mudd (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Logging Severity:S4 (Feature request)
Version:8.0.20 OS:Any
Assigned to: CPU Architecture:Any
Tags: improve logging, wait_timeout

[4 Jul 2020 16:49] Simon Mudd
Description:
I noticed some logging on 8.0.20 (current latest GA version):

2020-07-04T01:51:27.722184Z 1759865 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T02:41:24.597967Z 1784064 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T02:41:24.607277Z 1784063 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T02:41:24.623014Z 1784062 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T02:41:24.918346Z 1784066 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T02:41:24.918964Z 1784065 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T02:41:24.930946Z 1784067 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T02:51:27.778690Z 1789041 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T03:51:27.765479Z 1817078 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T04:51:27.782954Z 1844806 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T05:51:27.778717Z 1873709 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T06:51:27.616311Z 1901958 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T07:51:27.679497Z 1931050 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T08:51:27.807794Z 1959014 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T09:51:27.812973Z 1987022 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T10:51:27.808995Z 2015025 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T11:42:43.962762Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T11:51:27.718695Z 2043212 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T12:24:39.802516Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T12:51:27.812983Z 2070566 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T13:29:18.143102Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T13:38:05.351251Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T13:51:27.749967Z 2098577 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T13:54:48.472934Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T14:51:27.720980Z 2126725 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T15:00:06.729568Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T15:51:27.804901Z 2155848 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T16:12:19.353699Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T16:12:19.357967Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.
2020-07-04T16:12:19.364931Z 0 [Note] [MY-013417] [Server] The wait_timeout period was exceeded, the idle time since last command was too long.

This clearly indicates a problem, though it may not be critical.

However, the error misses important information such as indicating the username / host:port or socket where this happened (and maybe the wait_timeout value as it would be the session value not necessarily the global value.

This logging is with: log_error_verbosity = 3

How to repeat:
I guess gets some apps to connect to the backend database and fail to communicate within the [session] wait_timeout sessions.

Suggested fix:
Improve the error message to include client information:
* username
* hostname:port or socket path
* session wait_timeout setting of the session that failed.

Suggested replacement message would be something like:

2020-07-04T01:51:27.722184Z 1759865 [Note] [MY-013417] [Server] wait_timeout period of 300 seconds was exceeded for user@10.20.30.40:12345, the idle time since last command was too long.

This would allow me to identify the client which is connecting and potentially the application that is triggering this from the logging.
[4 Jul 2020 16:52] Simon Mudd
I assume the 0 values are server internal threads and the others are processlist id column values.  So it's not too clear what these 0 values exactly are, but guess they could be the dump threads as the server I saw this on was a master.
[4 Jul 2020 18:03] Frederic Descamps
Hi Simon, 

Meanwhile, I case you want to know which accounts are not closing their connection properly, you can use the following query:

SELECT ess.user, ess.host,
    (a.total_connections - a.current_connections) - ess.count_star as not_closed,
    ((a.total_connections - a.current_connections) - ess.count_star) * 100 /
    (a.total_connections - a.current_connections) as pct_not_closed
FROM performance_schema.events_statements_summary_by_account_by_event_name ess
JOIN performance_schema.accounts a on (ess.user = a.user and ess.host = a.host)
WHERE ess.event_name = 'statement/com/quit'
  AND (a.total_connections - a.current_connections) > ess.count_star;
[6 Jul 2020 5:53] MySQL Verification Team
Hello Simon,

Thank you for the reasonable feature request!

regards,
Umesh
[9 Feb 2021 16:22] Paul DuBois
Posted by developer:
 
Fixed in 8.0.25.

Information written to the server error log for client timeouts now
includes (if available) the timeout value, and client user and host.
[24 Apr 2021 14:21] Paul DuBois
Posted by developer:
 
Fixed in 8.0.26, not 8.0.25.