Bug #51920 Innodb connections in row lock wait ignore KILL until lock wait timeout
Submitted: 10 Mar 2010 16:14 Modified: 30 Jun 2010 19:50
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.0.84, 5.1.45, 5.5 OS:Any
Assigned to: John Russell CPU Architecture:Any
Tags: innodb, KILL, lock, row, wait

[10 Mar 2010 16:14] Mark Callaghan
Description:
Connections waiting on an InnoDB row lock ignore KILL until the row lock wait ends.

How to repeat:
create table foo(i int) engine=innodb;
insert into f values (1);

1) connection A
begin; select * from f for update;

2) connection B
begin; update f set i=2 where i=1;

3) connection B is now blocked on innodb row lock

4) connection A
KILL <connection_id_for B>

5) connection B does not respond until row lock wait timeout occurs:

|  2 | root | localhost | test | Killed  |   28 | Updating | update f set i=4 where i = 3 | 
...
|  2 | root | localhost | test | Killed  |   31 | Updating | update f set i=4 where i = 3 |
...
|  2 | root | localhost | test | Killed  |   38 | Updating | update f set i=4 where i = 3 | 

Suggested fix:
change srv_suspend_mysql_thread to let MySQL know a connection is blocked. Something like THD::enter_cond might help.
[10 Mar 2010 16:25] Valeriy Kravchuk
It is easy to verify this on both 5.0 and 5.1. Neither KILL nor KILL QUERY works. 

But be ready for this to be declared a feature request eventually, as our manual (http://dev.mysql.com/doc/refman/5.1/en/kill.html) does NOT list "Updating" status (even if it is waiting for lock actually) among cases when KILL flag is checked.
[25 Mar 2010 8:26] James Day
KILL is a core server feature and needs to work reasonably promptly so that it can be used to escape from trouble. Standard practice for the last four to five years has been to fix any discovered breakages in KILL operation expeditiously.

James Day, MySQL Senior Support Engineer, Oracle UK
[30 Mar 2010 13:23] Marko Mäkelä
Checking trx_is_interrupted() in addition to thd_lock_wait_timeout() in srv0srv.c seems to do the trick: the KILL seems to be honoured in less than a second.
[31 Mar 2010 8:12] Marko Mäkelä
Fix approved for 5.1 (both built-in InnoDB and InnoDB Plugin).
[5 May 2010 15:27] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[8 May 2010 17:41] Paul DuBois
Noted in 5.1.47 changelog.

Connections waiting for an InnoDB row lock ignored KILL until the row
lock wait ended.
[13 May 2010 7:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/108217

3094 Vasil Dimov	2010-05-13
      Followup to Bug#51920, fix binlog.binlog_killed
      
      This is a followup to the fix of
      
      Bug#51920 Innodb connections in row lock wait ignore KILL until lock wait
      timeout
      
      in that fix (rb://279) the behavior was changed to honor when a trx is
      interrupted during lock wait, but the returned error code was still
      "lock wait timeout" when it should be "interrupted".
      
      This change fixes the non-deterministically failing test binlog.binlog_killed,
      that failed like this:
      
      binlog.binlog_killed 'stmt'              [ fail ]
              Test ended at 2010-05-12 11:39:08
      
      CURRENT_TEST: binlog.binlog_killed
      mysqltest: At line 208: query 'reap' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 0...
      
      Approved by:	Sunny Bains (rb://344)
[14 May 2010 6:59] Marko Mäkelä
Vasil Dimov pushed a further change in behaviour to mysql-trunk-innodb revno:3094. There, the KILL during lock wait would result in "query interrupted" instead of "lock wait timeout exceeded". I think that this change needs to be documented.
[14 May 2010 13:50] Mark Callaghan
The latest commit (http://lists.mysql.com/commits/108217) has changes for storage/innobase but not for storage/innodb_plugin. Why doesn't storage/innodb_plugin/row/row0mysql.c have a case for DB_INTERRUPTED in row_mysql_handle_errors?
[14 May 2010 15:30] Vasil Dimov
Mark, first the original fix was also present and fixed in MySQL 5.5/InnoDB 1.1.

But I only made this additional change in mysql-trunk-innodb (MySQL 5.5). There is no storage/innodb_plugin in that tree, since storage/innobase is the next branch of the plugin, called "InnoDB 1.1" (no plugin word since there is only one InnoDB).

We will probably port this post-fix to 5.1/storage/innobase and 5.1/storage/innodb_plugin too.

Thanks!
[14 May 2010 17:54] Paul DuBois
Revised changelog entry:

Connections waiting for an InnoDB row lock ignored KILL until the row
lock wait ended. Now, KILL during lock wait results in "query
interrupted" instead of "lock wait timeout exceeded".
[14 May 2010 19:08] Mark Callaghan
After backporting this to 5.1, a test that we added fails and I think the test should not fail. This occurs on 5.1 with backports by us. I think the test is from us too. Do you have tests for this behavior?

Run this:
./mysql-test-run.pl innodb.innodb_bug51920

And get this result
>>>
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
innodb.innodb_bug51920                   [ fail ]  Found warnings/errors in server log file!
        Test ended at 2010-05-14 11:47:26
line
100514 21:47:24 [ERROR] Got error -1 when reading table './test/lockt'
^ Found warnings in /s/bld/5146dev/mysql-test/var/log/mysqld.1.err
ok
>>>

This is the cause. This returns -1 on interrupt and a message is then written to the db error log. I don't want more spam in that log.

/********************************************************************//**
Converts an InnoDB error code to a MySQL error code and also tells to MySQL
about a possible transaction rollback inside InnoDB caused by a lock wait
timeout or a deadlock.
@return MySQL error code */
extern "C" UNIV_INTERN
int
convert_error_code_to_mysql(
/*========================*/
        int     error,  /*!< in: InnoDB error code */
        ulint   flags,  /*!< in: InnoDB table flags, or 0 */
        THD*    thd)    /*!< in: user thread handle or NULL */
{
        switch (error) {
        case DB_SUCCESS:
                return(0);

        case DB_INTERRUPTED:
                my_error(ER_QUERY_INTERRUPTED, MYF(0));
                /* fall through */
        case DB_ERROR:
        default:
                return(-1); /* unspecified error */

--------------

The test file: mysql-test/suite/innodb/t/innodb_bug51920.test
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# Test kill query terminates innodb queries waiting on a row lock

-- source include/have_innodb.inc

#
# setup
#
--disable_warnings
drop table if exists lockt;
--enable_warnings

create table lockt(i int primary key) engine=innodb;
insert into lockt values (1), (2);

connect (con1,localhost,root,,);
connect (con2,localhost,root,,);

#
# verify killing terminates lock
#
connection con1;
begin;
select * from lockt where i=1 for update;

connection con2;
let $cid=`select connection_id()`
begin;
send select * from lockt where i=1 for update;

connection con1;
sleep 1;
--disable_query_log
eval kill query $cid;
--enable_query_log
sleep 2;
# if innodb correctly responded to kill request then there should be 0 waits
show status like 'Innodb_row_lock_current_waits';
rollback;

connection con2;
--error ER_QUERY_INTERRUPTED
reap;

connection default;
disconnect con1;
disconnect con2;

#
# cleanup
#
drop table lockt;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

The file mysql-test/suite/innodb/t/innodb_bug51920-master.opt 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
--innodb_lock_wait_timeout=10
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

The result file mysql-test/suite/innodb/r/innodb_bug51920.result
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
drop table if exists lockt;
create table lockt(i int primary key) engine=innodb;
insert into lockt values (1), (2);
begin;
select * from lockt where i=1 for update;
i
1
select * from lockt where i=1 for update;
show status like 'Innodb_row_lock_current_waits';
Variable_name	Value
Innodb_row_lock_current_waits	0
rollback;
ERROR 70100: Query execution was interrupted
drop table lockt;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

This might fix the problem:

diff --git a/storage/innodb_plugin/handler/ha_innodb.cc b/storage/innodb_plugin/handler/ha_innodb.cc
index 06c16a3..0fad4d4 100644
--- a/storage/innodb_plugin/handler/ha_innodb.cc
+++ b/storage/innodb_plugin/handler/ha_innodb.cc
@@ -1031,7 +1031,8 @@ convert_error_code_to_mysql(
 
        case DB_INTERRUPTED:
                my_error(ER_QUERY_INTERRUPTED, MYF(0));
-               /* fall through */
+               return(HA_ERR_GENERIC);
+
        case DB_ERROR:
        default:
                return(-1); /* unspecified error */
[14 May 2010 19:16] Calvin Sun
Re-open for porting the post-fix to 5.1.
[20 May 2010 7:58] Vasil Dimov
http://lists.mysql.com/commits/108731
http://lists.mysql.com/commits/108732
[20 May 2010 8:20] Marko Mäkelä
Backporting the changes from 5.5 to 5.1 (builtin InnoDB and InnoDB plugin) looks OK to me. There are two changes in behaviour that have to be documented:

* We roll back killed transactions.
* When a transaction is killed during lock wait, it will report "query interrupted" rather than "lock wait timeout".
[20 May 2010 8:22] Vasil Dimov
Mark, I do not see anything printed to the mysqld stdout/stderr when I return -1.
[28 May 2010 6:14] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:42] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:09] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[15 Jun 2010 8:22] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:marko.makela@oracle.com-20100514130815-ym7j7cfu88ro6km4) (merge vers: 5.1.48) (pib:16)
[15 Jun 2010 8:40] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:marko.makela@oracle.com-20100514130815-ym7j7cfu88ro6km4) (pib:16)
[17 Jun 2010 12:02] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:43] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:29] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[30 Jun 2010 19:50] John Russell
Updated change log that this is now fixed in 5.5, and added a note:

The corresponding transaction is rolled back.