MySQL Bugs: #96942: net_read_timeout and net_write

Bug #96942	net_read_timeout and net_write_timeout do not work
Submitted:	19 Sep 2019 9:03	Modified:	30 Sep 2019 12:51
Reporter:	Iwo P
Status:	Verified
Category:	MySQL Server: Connection Handling	Severity:	S1 (Critical)
Version:	5.7, 8.0, 8.0.17	OS:	Linux
Assigned to:		CPU Architecture:	Any

Description:
The net_read_timeout variable is defined as below:

The number of seconds to wait for more data from a connection before aborting the read. When the server is reading from the client, net_read_timeout is the timeout value controlling when to abort. 

However, MySQL does not abort connections if the read (waiting for data) takes more than the value specified in this variable.

I have not tested any operating system other than Linux.

How to repeat:
1. Prepare a large query:

$ echo "INSERT INTO b SET a = '" > query;
$ for i in $(seq 1 1024); do echo -ne 'a' >> 1kb; done;
$ for i in $(seq 1 1024); do cat 1kb >> 1mb; done;
$ for i in $(seq 1 512); do cat 1mb >> query; done;
$ echo "';" >> query
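The same file can be built in one pass. The following is a minimal sketch, assuming `head -c` and `tr` are available (both are in GNU coreutils); `make_query` and `query.sample` are hypothetical names, not part of the original report. Unlike the `echo`-based steps above, it does not put a newline right after the opening quote. The demonstration uses a small size so it runs quickly; pass 536870912 (512 * 1024 * 1024) to match the report.

```shell
#!/bin/sh
# make_query FILE BYTES
# Write an INSERT statement whose string literal consists of BYTES 'a' characters.
make_query() {
    {
        printf "INSERT INTO b SET a = '"
        # /dev/zero yields NUL bytes; tr turns each of them into 'a'.
        head -c "$2" /dev/zero | tr '\0' 'a'
        printf "';\n"
    } > "$1"
}

# Small demonstration size; the report uses 512 * 1024 * 1024 bytes.
make_query query.sample 1024
wc -c < query.sample
```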

2. Set max_allowed_packet and net_read_timeout on the instance:

mysql> SET GLOBAL max_allowed_packet=1073741824;
mysql> SET GLOBAL net_read_timeout=1;

3. Run the query from outside of the MySQL instance:

$ cat query | mysql -u user -ppassword -h mysql-host --ssl-mode=disabled test

(ssl-mode=disabled is not required, but helps a lot in analyzing the traffic with tcpdump)

4. While the query is running, interrupt the connection ungracefully, using either of the following:

(On the instance the query is being sent from, drop the traffic):
iptables -I OUTPUT -p tcp --dport 3306 -j DROP && iptables -I INPUT -p tcp --sport 3306 -j DROP

(Or, on the instance the query is being sent from, force a reboot):
echo b > /proc/sysrq-trigger

Note that the second command reboots the instance *immediately*. Use it only on test instances that can be recreated.
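If the iptables variant is used, the rules stay in place until they are deleted. A minimal sketch of a helper that inserts or removes the same two rules follows; `mysql_port_rules` and `DRY_RUN` are hypothetical names introduced here for illustration. It relies only on `iptables -I` (insert) and `iptables -D` (delete a rule matching the same specification). By default it only prints the commands; run it as root with `DRY_RUN=0` to apply them.

```shell
#!/bin/sh
# Hypothetical helper: insert ("block") or delete ("unblock") the two DROP
# rules from step 4. With DRY_RUN=1 (the default here) it only prints them.
DRY_RUN=${DRY_RUN:-1}

mysql_port_rules() {
    # $1 = block | unblock
    case "$1" in
        block)   flag=-I ;;
        unblock) flag=-D ;;
        *) echo "usage: mysql_port_rules block|unblock" >&2; return 1 ;;
    esac
    for rule in "OUTPUT -p tcp --dport 3306 -j DROP" \
                "INPUT -p tcp --sport 3306 -j DROP"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "iptables $flag $rule"
        else
            # Word splitting of $rule is intentional here.
            iptables $flag $rule
        fi
    done
}

# Show the commands that would remove the rules after the test.
mysql_port_rules unblock
```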

5. MySQL should abort the connection after 1 second (according to the value of net_read_timeout), but instead it waits for the kernel to terminate the connection based on the settings of:
net.ipv4.tcp_keepalive_intvl
net.ipv4.tcp_keepalive_probes
net.ipv4.tcp_keepalive_time
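To see how long that wait can be, the three settings can be combined. The sketch below assumes the server socket has keepalive enabled and the connection is otherwise idle, in which case the kernel gives up after roughly tcp_keepalive_time + tcp_keepalive_intvl * tcp_keepalive_probes seconds. The function names are hypothetical, and the fallback values (7200, 75, 9) are the usual Linux defaults, used only when /proc is not readable.

```shell
#!/bin/sh
# Approximate seconds before the kernel declares an idle peer dead:
# idle time before the first probe, plus one interval per unanswered probe.
keepalive_detect_seconds() {
    # $1 = tcp_keepalive_time, $2 = tcp_keepalive_intvl, $3 = tcp_keepalive_probes
    echo $(( $1 + $2 * $3 ))
}

# Read a sysctl from /proc; fall back to the given default if it is unreadable.
read_sysctl() {
    cat "/proc/sys/net/ipv4/$1" 2>/dev/null || echo "$2"
}

t=$(read_sysctl tcp_keepalive_time 7200)
i=$(read_sysctl tcp_keepalive_intvl 75)
p=$(read_sysctl tcp_keepalive_probes 9)
echo "kernel keepalive detection: $(keepalive_detect_seconds "$t" "$i" "$p") s"
```

With the usual defaults this comes to 7200 + 75 * 9 = 7875 seconds, compared with the 1 second configured in net_read_timeout.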

Suggested fix:
Connections that write data to MySQL (the ones MySQL reads from) should be aborted after the time defined in net_read_timeout.

The above also applies to net_write_timeout and master-slave replication.

Steps to reproduce:
1. Create a Master-Slave replication with net_write_timeout set to 1 and net_retry_count set to 1 on master.
2. Block communication (on slave) with
iptables -I INPUT -ptcp --sport 3306 -j DROP; iptables -I OUTPUT -ptcp --dport 3306 -j DROP
3. Create an event on master (create database something;)

According to the documentation: "If sufficient time elapses on the master side without activity on the Binlog Dump thread, the master determines that the slave is no longer connected. As for any other client connection, the timeouts for this depend on the values of net_write_timeout and net_retry_count".

However, the slave is considered dead/disconnected only after the operating system decides to close the socket.

I'm not creating a new bug report for this, as the underlying issue is the same here.

Hello Iwo P,

Thank you for the report and test case.
Verified as described with 8.0.17 build.

regards,
Umesh

Please note that net_read_timeout is the timeout that applies while the server waits for data to become available on the socket fd. The above test case doesn't simulate such a scenario.
