Bug #32148 killing a query may be ineffective
Submitted: 6 Nov 2007 17:59
Modified: 18 Dec 2007 4:06
Reporter: Andrei Elkin
Status: Closed
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.0.38, 5.1
OS: Microsoft Windows
Assigned to: Alexander Nozdrin
CPU Architecture: Any
Tags: kill_query
Triage: D2 (Serious)

[6 Nov 2007 17:59] Andrei Elkin
Description:
The following snippet from the start of dispatch_command()

bool dispatch_command(enum enum_server_command command, THD *thd,
		      char* packet, uint packet_length)
{
  NET *net= &thd->net;
  bool error= 0;
  DBUG_ENTER("dispatch_command");

  if (thd->killed == THD::KILL_QUERY || thd->killed == THD::KILL_BAD_DATA)
  {
    thd->killed= THD::NOT_KILLED;
    thd->mysys_var->abort= 0;
  }

shows that thd->killed is reset, presumably with the intention of clearing
a killed status left over from a previous statement.
However, the connection handler may instead wipe out a killed status that belongs to
the current, just-started statement, not to the previous one (!), because `dispatch_command' is
handed the query by its reader, `do_command':

#0  dispatch_command (command=COM_QUERY, thd ...) 
#1  do_command (thd=0x8cb12a0)
#2  handle_one_connection (arg=0x8cb12a0)

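That is, a KILL QUERY that arrives after do_command has read the packet but
before this check runs would be silently discarded.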
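
The lost kill can be replayed with a small single-threaded model. This is only a sketch under stated assumptions: KillState, Session, reset_at_statement_start, execute_statement and statement_completes are hypothetical names invented for illustration, not server code, and the interleaving is chosen by a flag instead of real threads.

```cpp
#include <cassert>

// Hypothetical stand-in for the kill flag in THD; not the server's type.
enum class KillState { NotKilled, KillQuery };

// Hypothetical, stripped-down session: only the kill flag is modeled.
struct Session {
  KillState killed = KillState::NotKilled;
};

// Mirrors the reset at the top of dispatch_command() quoted above.
void reset_at_statement_start(Session &s) {
  if (s.killed == KillState::KillQuery)
    s.killed = KillState::NotKilled;
}

// Stand-in for the statement body.
// Returns true if the statement ran to completion, false if it saw the kill.
bool execute_statement(const Session &s) {
  return s.killed != KillState::KillQuery;
}

// Replays one interleaving of the killer and the connection handler.
// kill_before_reset == true: KILL QUERY lands after the packet was read
// but before the reset runs. Otherwise it lands right after the reset.
bool statement_completes(bool kill_before_reset) {
  Session s;  // the packet has been read; the statement has logically started
  if (kill_before_reset)
    s.killed = KillState::KillQuery;
  reset_at_statement_start(s);
  if (!kill_before_reset)
    s.killed = KillState::KillQuery;
  return execute_statement(s);
}

// Checks both interleavings of the model.
void check_interleavings() {
  assert(statement_completes(true));    // kill wiped out, statement completes
  assert(!statement_completes(false));  // kill observed, statement interrupted
}
```

In this model the early kill is erased by the reset and the statement completes, which matches the symptom reported here; the later kill is honored.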

There is evidence that this happens in practice.

from:
https://intranet.mysql.com/secure/pushbuild/getlog.pl?dir=mysql-5.0-rpl&entry=aelkin@koti....

binlog_killed                  [ fail ]

--- C:/cygwin/home/pushbuild/pb3/pb/mysql-5.0-rpl/178/mysql-5.0.52-pb178/mysql-test/r/binlog_killed.result	2007-11-06 18:21:07.000000000 +0300
+++ C:\cygwin\home\pushbuild\pb3\pb\mysql-5.0-rpl\178\mysql-5.0.52-pb178\mysql-test\r\binlog_killed.reject	2007-11-06 19:08:57.453125000 +0300
@@ -32,7 +32,6 @@
 select * from t1 order by a /* must be the same as before (1,1),(2,2) */;
 a	b
 1	1
-2	2
 drop table if exists t4;
 create table t4 (a int, b int) engine=innodb;
 insert into t4 values (3, 3);
@@ -46,7 +45,6 @@
 select * from t1 /* must be the same as before (1,1),(2,2) */;
 a	b
 1	1
-2	2

This happens when the following snippet of the test runs:

connection con1;
begin; delete from t1 where a=2;

connection con2;
let $ID= `select connection_id()`;
send delete from t1 where a=2;

connection con1;
--replace_result $ID ID
eval kill query $ID;
rollback;

connection con2;
--error 0,ER_QUERY_INTERRUPTED
reap;
select * from t1 order by a /* must be the same as before (1,1),(2,2) */;

The assertion in the comment on the last select does not hold: row (2,2) is
missing, which means con2's delete ran to completion despite the kill.

How to repeat:
Run binlog_killed, or the part of it shown above, in a slow environment.

Suggested fix:
Clean up the killed status of a statement at the end of that statement, not at the beginning of the next one.
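
One possible shape of this suggestion is a scope guard that clears the flag when the statement ends. This is a minimal sketch, not the patch that was pushed: KillFlag, Connection, KillResetGuard and run_statement are hypothetical names, and the statement body is reduced to a single check of the flag.

```cpp
#include <cassert>

// Hypothetical stand-in for the kill flag; not the server's type.
enum class KillFlag { None, Query };

// Hypothetical, stripped-down connection state.
struct Connection {
  KillFlag killed = KillFlag::None;
};

// Clears a pending query-level kill when the statement scope ends,
// so that the flag never leaks into the next statement.
class KillResetGuard {
 public:
  explicit KillResetGuard(Connection &c) : conn_(c) {}
  ~KillResetGuard() {
    if (conn_.killed == KillFlag::Query)
      conn_.killed = KillFlag::None;
  }
  KillResetGuard(const KillResetGuard &) = delete;
  KillResetGuard &operator=(const KillResetGuard &) = delete;

 private:
  Connection &conn_;
};

// Runs one statement. There is no reset at the start, so a kill that
// arrived just before execution is still visible to the statement.
// Returns true if the statement ran to completion.
bool run_statement(Connection &c) {
  KillResetGuard guard(c);
  return c.killed != KillFlag::Query;
}

// Checks that a kill interrupts one statement and does not reach the next.
void check_reset_at_end() {
  Connection c;
  c.killed = KillFlag::Query;  // kill lands before the statement body runs
  assert(!run_statement(c));   // the statement observes the kill
  assert(c.killed == KillFlag::None);
  assert(run_statement(c));    // the next statement is unaffected
}
```

With the reset moved to the end, the kill interrupts the statement it was aimed at and is cleared before the following statement starts. A kill that arrives while the connection is idle would need separate handling, which this sketch does not model.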
[30 Nov 2007 13:16] Alexander Nozdrin
Pushed into 5.1-runtime.
[6 Dec 2007 9:59] Bugs System
Pushed into 5.1.23-rc
[6 Dec 2007 10:01] Bugs System
Pushed into 6.0.5-alpha
[18 Dec 2007 4:06] Paul Dubois
Noted in 5.1.23, 6.0.5 changelogs.

Killing a statement could lead to a race condition in the server.