Bug #108945 Killed connection will reconnect and retry query
Submitted: 31 Oct 2022 19:12 Modified: 8 Dec 2022 19:53
Reporter: Marc Reilly Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Command-line Clients Severity:S3 (Non-critical)
Version:8.0.31 OS:Any
Assigned to: CPU Architecture:Any
Tags: KILL, mysql client

[31 Oct 2022 19:12] Marc Reilly
Description:
By default, when kill connection is called the MySQL client will automatically reconnect. This is expected behavior when `--skip-reconnect` is not used.

However, based on my investigation when the MySQL client reconnects it will attempt to re-execute the query running on the killed thread

https://dev.mysql.com/doc/refman/8.0/en/kill.html
https://dev.mysql.com/doc/refman/8.0/en/mysql-command-options.html#option_mysql_reconnect

How to repeat:
load data into 2 sysbench tables, this will be used to simulate a long running query:

../sysbench ~/sysbench/src/lua/oltp_read_write.lua --db-driver=mysql --mysql-user=root --mysql-password=msandbox --mysql-db=sysbench --threads=2 --rand-type=uniform --db-ps-mode=disable --tables=2 --table-size=20000000 --mysql-socket=/tmp/mysql_sandbox8031.sock --mysql-port=8031 --time=0 --report-interval=1 prepare

Session 1:
Get connection ID and Execute long running query. You will see each corresponding KILL the client will reconnect as expected. However, the client also attempts to re-execute the query.
mysql [localhost:8026] {msandbox} (sysbench) > select * from sbtest1 a,sbtest2 b where b.pad=a.pad for update;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id:    28
Current database: sysbench

ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id:    29
Current database: sysbench

ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql [localhost:8026] {msandbox} (sysbench) >

Session 2:
After executing kill 3 times you will see the query finally stops retrying:
mysql [localhost:8026] {msandbox} (sysbench) > kill 27;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost:8026] {msandbox} (sysbench) > show processlist;
+----+-----------------+-----------+----------+---------+------+------------------------+----------------------------------------------------------------+
| Id | User            | Host      | db       | Command | Time | State                  | Info                                                           |
+----+-----------------+-----------+----------+---------+------+------------------------+----------------------------------------------------------------+
|  5 | event_scheduler | localhost | NULL     | Daemon  | 5944 | Waiting on empty queue | NULL                                                           |
| 23 | msandbox        | localhost | sysbench | Query   |    0 | init                   | show processlist                                               |
| 28 | msandbox        | localhost | sysbench | Query   |    5 | executing              | select * from sbtest1 a,sbtest2 b where b.pad=a.pad for update |
+----+-----------------+-----------+----------+---------+------+------------------------+----------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql [localhost:8026] {msandbox} (sysbench) > kill 28;
Query OK, 0 rows affected (0.03 sec)

mysql [localhost:8026] {msandbox} (sysbench) > show processlist;
+----+-----------------+-----------+----------+---------+------+------------------------+----------------------------------------------------------------+
| Id | User            | Host      | db       | Command | Time | State                  | Info                                                           |
+----+-----------------+-----------+----------+---------+------+------------------------+----------------------------------------------------------------+
|  5 | event_scheduler | localhost | NULL     | Daemon  | 5950 | Waiting on empty queue | NULL                                                           |
| 23 | msandbox        | localhost | sysbench | Query   |    0 | init                   | show processlist                                               |
| 29 | msandbox        | localhost | sysbench | Query   |    1 | executing              | select * from sbtest1 a,sbtest2 b where b.pad=a.pad for update |
+----+-----------------+-----------+----------+---------+------+------------------------+----------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql [localhost:8026] {msandbox} (sysbench) > kill 29;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost:8026] {msandbox} (sysbench) > show processlist;
+----+-----------------+-----------+----------+---------+------+------------------------+------------------+
| Id | User            | Host      | db       | Command | Time | State                  | Info             |
+----+-----------------+-----------+----------+---------+------+------------------------+------------------+
|  5 | event_scheduler | localhost | NULL     | Daemon  | 5960 | Waiting on empty queue | NULL             |
| 23 | msandbox        | localhost | sysbench | Query   |    0 | init                   | show processlist |
+----+-----------------+-----------+----------+---------+------+------------------------+------------------+
2 rows in set (0.00 sec)

When --no-reconnect is used, the client is successfully killed and query does not retry.

Suggested fix:
When KILL is called and client has reconnect enabled, the client should reconnect but not retry the query.
[31 Oct 2022 19:22] Marc Reilly
Note: I verified the same behavior on versions 8026-->8031
[1 Nov 2022 13:23] MySQL Verification Team
Hi Mr. Reilly,

Thank you for your report.

However, what you describe is exactly how our client is designed to work.

If you wish that it behaves differently, then this can be a feature request. However, you are supposed to propose a detailed feature request. This will require a new option and a way of how would it function.

If your proposal is sound and detailed, we could then verify it as a feature request.

Waiting on your feedback.
[1 Nov 2022 16:13] Marc Reilly
Hi,

If this is the new "expected behavior" I think this is a dangerous setting to be enabled by default. If a query is being executed from inside a transaction and that connection is explicitly killed, the query will be re-driven by the client in a NEW transaction. This can lead to incorrect results.

In the below scenario only the insert is re-driven(not the rest of the transaction), in a NEW transaction which will explicitly commit by default. This is only a simple demonstration, in real world scenarios these queries can be followed by, and/or preceded with by other DML.
 
begin;
update sbtest1 set pad='uncommitted data' where id < 1000000;
insert blah blah blah
update blah blah blah
insert into sbtest_test k,c,pad select k,c,pad from sbtest1 where id < 1000000; # <--- Kill connection while this query is executing.
mysql [localhost:8031] {msandbox} (sysbench) > update sbtest1 set pad='uncommitted data' where id < 1000000;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id:    12
Current database: sysbench

Query OK, 999999 rows affected (25.25 sec)
Rows matched: 999999  Changed: 999999  Warnings: 0

mysql [localhost:8031] {msandbox} (sysbench) > rollback; <--- For demonstration
Query OK, 0 rows affected (0.00 sec)

> select count(*) from sbtest1 where pad='uncommitted data';   <---- We can see data has been committed
+----------+
| count(*) |
+----------+
|   999999 |
+----------+
1 row in set (7.73 sec)

==========================

Also, out of interest- what release did this behavior change in? This only happens for 80 clients from what I can see. Using a 5740 client against a 8031 server explicitly killed connections reconnect but do not retry queries. Is it documented that this behavior changed in 8.0? 

Session 1: 5740 client
$ /opt/mysql/5.7.40/bin/mysql --reconnect
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 16

mysql [localhost:8031] {msandbox} (sysbench) > \s
--------------
/opt/mysql/5.7.40/bin/mysql  Ver 14.14 Distrib 5.7.40, for linux-glibc2.12 (x86_64) using  EditLine wrapper <---- Client version

Connection id:		16
.
.
Server version:		8.0.31 MySQL Community Server - GPL <---- Server version
Protocol version:	10
Connection:		Localhost via UNIX socket
.
.
Threads: 3  Questions: 85  Slow queries: 0  Opens: 695  Flush tables: 4  Open tables: 134  Queries per second avg: 0.001
--------------
# Kill in session 2 kills the connection and 57 client does not re-drive the query:
mysql [localhost:8031] {msandbox} (sysbench) > select * from sbtest1 a,sbtest2 b where b.pad=a.pad for update;
ERROR 2013 (HY000): Lost connection to MySQL server during query

# Next time client runs query, we reconnect as per the `--reconnect` flag
mysql [localhost:8031] {msandbox} (sysbench) > show processlist;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    17
Current database: sysbench

Session 2: 8031 client:
# Kill query executing in session 1:
mysql [localhost:8031] {msandbox} (sysbench) > kill 16;
Query OK, 0 rows affected (0.00 sec)

# From processlist we can see the query is killed successfully.
+----+-----------------+-----------+----------+---------+-------+------------------------+------------------+
| Id | User            | Host      | db       | Command | Time  | State                  | Info             |
+----+-----------------+-----------+----------+---------+-------+------------------------+------------------+
|  7 | event_scheduler | localhost | NULL     | Daemon  | 72432 | Waiting on empty queue | NULL             |
| 14 | msandbox        | localhost | sysbench | Sleep   |    10 |                        | NULL             |
| 17 | msandbox        | localhost | sysbench | Query   |     0 | init                   | show processlist |
+----+-----------------+-----------+----------+---------+-------+------------------------+------------------+
3 rows in set (0.00 sec)

mysql [localhost:8031] {msandbox} (sysbench) >
=============

Thanks!
Marc
[1 Nov 2022 16:28] Marc Reilly
Just a correction to my previous for the explicit transaction example, same result:

begin;
update sbtest1 set pad='uncommitted data 1' where id=1000001;
# As expected we see this change from inside trx
select * from sbtest1 where id=1000001;
+---------+----------+-------------------------------------------------------------------------------------------------------------------------+--------------------+
| id      | k        | c                                                                                                                       | pad                |
+---------+----------+-------------------------------------------------------------------------------------------------------------------------+--------------------+
| 1000001 | 18377069 | 53883811315-66575355910-60942048168-89250940350-15465994604-79656049822-39766943500-96543286246-48830621456-66965790282 | uncommitted data 1 |
+---------+----------+-------------------------------------------------------------------------------------------------------------------------+--------------------+
1 row in set (0.00 sec)

# Run update and kill its parent connection in session 2. MySQL client auto retries the query in a new, autocommit transaction
update sbtest1 set pad='uncommitted data' where id < 1000000;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id:    15
Current database: sysbench

Query OK, 999999 rows affected (12.43 sec)
Rows matched: 999999  Changed: 999999  Warnings: 0

# For demonstration, this does nothing since its a new autocommit trx
mysql [localhost:8031] {msandbox} (sysbench) > rollback;
Query OK, 0 rows affected (0.00 sec)

# First insert in transaction was rolled back 
mysql [localhost:8031] {msandbox} (sysbench) > select * from sbtest1 where id=1000001;
+---------+----------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id      | k        | c                                                                                                                       | pad                                                         |
+---------+----------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 1000001 | 18377069 | 53883811315-66575355910-60942048168-89250940350-15465994604-79656049822-39766943500-96543286246-48830621456-66965790282 | 32895769618-45868072163-98451269245-23357398240-47063479192 |
+---------+----------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.00 sec)

# Re-driven update auto commited in a new transaction:
mysql [localhost:8031] {msandbox} (sysbench) > select count(*) from sbtest1 where pad='uncommitted data';
+----------+
| count(*) |
+----------+
|   999999 |
+----------+
1 row in set (7.97 sec)

Session 2:
mysql [localhost:8031] {msandbox} (sysbench) > select count(*) from sbtest1 where pad='uncommitted data';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (7.50 sec)

mysql [localhost:8031] {msandbox} (sysbench) > kill 14;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost:8031] {msandbox} (sysbench) > select count(*) from sbtest1 where pad='uncommitted data';
+----------+
| count(*) |
+----------+
|   999999 |
+----------+
1 row in set (8.02 sec)
[2 Nov 2022 13:01] MySQL Verification Team
HI Mr. Reilly,

We fully understand what you are writing about.

However, millions of our customers and users rely on this feature, which is fully documented in our Reference Manual.

Hence, what can be done is adding an option that would disable this feature. That would solve your problem, but that is a feature request.

If you wish to proceed, define precisely the entire new feature and how would you like it to behave and we shall be happy to verify it.
[8 Dec 2022 19:53] Marc Reilly
Hi,

In the documentation --reconnect is defined as the following which is incorrect for MySQL 8. In MySQL 8 clients, they will reconnect(as per definition) but also retry in-flight queries - reconnect != reconnect and retry query. :
>If the connection to the server is lost, automatically try to reconnect. A single reconnect attempt is made each time the connection is lost. To suppress reconnection behavior, use --skip-reconnect.
https://dev.mysql.com/doc/refman/8.0/en/mysql-command-options.html#option_mysql_reconnect

Feature requests:
1) Update documentation to note that `--reconnect` is not just reconnect for MySQL 8 clients, it is also retry inflight query. For non-mysql 8 clients, the above, existing definition is correct.
2) Reconnect and retry should be configurable independently using a new client knob or a new option for the current knob. A connection which is disconnected should not automatically re-execute a query as in some cases it is dangerous, especially when using explicit transaction blocks since it can lead to data inconsistencies(see above example).

Thanks,
Marc
[9 Dec 2022 12:31] MySQL Verification Team
Hi Mr. Reilly,

First of all, documentation request is not a feature request. It is something separate. However, it is already fully documented in the chapter "Disabling mysql Auto-Reconnect".

Next, regarding MySQL Client, we do not have knobs in our C API, nor in MySQL CLI, nor in MySQL Shell, so that feature request is impossible to implement.

Last , but not least, our documentation clearly directs all those that do not like the retry of the last fail statements, to disable auto-reconnect feature !!!

Hence, this is truly "Not a bug".