Bug #47745 innodb.innodb-timeout fails sporadically
Submitted: 30 Sep 2009 13:23 Modified: 10 Jan 2011 15:35
Reporter: Alexander Nozdrin Email Updates:
Status: Closed Impact on me:
None 
Category:Tests: Engine Severity:S3 (Non-critical)
Version:5.4.5 (mysql-trunk) OS:Any
Assigned to: Matthias Leich CPU Architecture:Any
Tags: pb2, sporadic, test failure
Triage: Triaged: D3 (Medium)

[30 Sep 2009 13:23] Alexander Nozdrin
Description:
mysqltest: At line 56: query 'reap' succeeded - should have failed with errno 1205...

How to repeat:
http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=&test_ou...
[7 Oct 2009 6:50] Sveta Smirnova
Not repeatable on test box for me, although trying to verify I found bug #47873

Results from pushbuild:

innodb.innodb-timeout                    w6 [ fail ]
        Test ended at 2009-09-29 12:16:14

CURRENT_TEST: innodb.innodb-timeout
==5841== Memcheck, a memory error detector.
==5841== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==5841== Using LibVEX rev 1658, a library for dynamic binary translation.
==5841== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==5841== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==5841== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==5841== For more details, rerun with: -v
==5841== 
mysqltest: At line 56: query 'reap' succeeded - should have failed with errno 1205...

The result from queries just before the failure was:
< snip >
set innodb_lock_wait_timeout=1;
select @@innodb_lock_wait_timeout;
@@innodb_lock_wait_timeout
1
select @@innodb_lock_wait_timeout;
@@innodb_lock_wait_timeout
347
create table t1(a int primary key)engine=innodb;
begin;
insert into t1 values(1),(2),(3);
select * from t1 for update;
commit;
a
1
2
3
begin;
insert into t1 values(4);
select * from t1 for update;
commit;

Error 1205 is "Lock wait timeout". So this can be either timing problem or innodb_lock_wait_timeout was set to incorrect value. Can not say more without access to the box where test failed.
[9 Dec 2010 14:43] Matthias Leich
IMHO the test itself has too optimistic expectations about timing.
Thinkable negative scenario:
1. connection a runs
   begin;
   insert into t1 values(4);
   which causes a lock
2. connection b sends
   select * from t1 for update;
   to the server
   The execution of this command starts delayed
   (caused by send) after some time span which
   depends on OS/MySQL scheduling and specially
   on temporary high parallel CPU load on the testing box.
   Experiments showed that
   - maximum time span of 2 seconds
   - it could happen that the next SQL statement
     issued by another connection gets executed
     before the execution of statement issued
     via "--send" started.at all
   Of course than the statement of connection b
   will block in case the lock held by connection
   a is not removed.
time in seconds    event
0                  connection a causes the lock
0.1                connection b sends its statement
                   to the server
0.2 till 2.2       connection a waits two seconds
2.0                server starts execution of the statement
                   issued via "--send" and collides with the
                   lock held by connection a
2.3                connection a issues commit
2.4                lock held by connection a is removed
2.5                lock wait time out of 1 second has not
                   passed for connection a
                   server detects the removed lock and finishes
                   the statement wit success instead of
                   ER_LOCK_WAIT_TIMEOUT
I propose to increase the sleep time to 3.
[9 Dec 2010 14:52] Matthias Leich
I also recommend to ensure that the disconnects
are finished before the test ends.
This part of the fix has no connection to the
bug report, but it drastically reduces the likelihood
that this test harms succeeding tests when we have
high parallel load on the testing box.
[10 Dec 2010 14:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/126526

3173 Matthias Leich	2010-12-10
      Fix for Bug#47745 innodb.innodb-timeout fails sporadically
      - Second scenario checked:
        Ensure via wait routines that the commit comes after the
        processing of the statement which should get finally
        the ER_LOCK_WAIT_TIMEOUT
        --> This should prevent the current bug.
      - First scenario checked:
        Ensure via wait routines that the statement is already waiting
        for getting the lock before the commit is given.
        --> No effect on the current bug, but ensure that the right
            scenario is reached.
      - Take care that disconnects are finished before the test ends.
        --> Reduce the potential to harm succeeding tests.
      - "Mangle" the printout of the current default innodb_lock_wait_timeout value
        --> No need to adjust the test in case the default gets changed in future.
[13 Dec 2010 15:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/126673

3173 Matthias Leich	2010-12-13
      Fix for Bug#47745 innodb.innodb-timeout fails sporadically
      - Second scenario checked:
        Ensure via wait routines that the commit comes after the
        processing of the statement which should get finally
        the ER_LOCK_WAIT_TIMEOUT
        --> This should prevent the current bug.
      - First scenario checked:
        Ensure via wait routines that the statement is already waiting
        for getting the lock before the commit is given.
        --> No effect on the current bug, but ensure that the right
            scenario is reached.
      - Take care that disconnects are finished before the test ends.
        --> Reduce the potential to harm succeeding tests.
      - "Mangle" the printout of the current default innodb_lock_wait_timeout value
        --> No need to adjust the test in case the default gets changed in future.
[17 Dec 2010 15:14] John Embretsen
Matthias,

I tried out the patch on Solaris (x64) and Mac OS X (10.6, 64-bit), and saw no issues. I did not attempt to generate particularly high parallel load, as I understand you have tested that.

The patch looks fine to me. The only comment I found slightly strange was:

+# High parallel load especially on the disk containing the vardir can delay the
+#   The server starts the execution of connection b: select * from t1 for update
+#   and begins to wait for the removal of the lock held by connection a.
+# The maximum delay found in other tests was two seconds.

There seems to be something missing after the "the" in the first sentence, or maybe it is just not natural language to me. Feel free to change this before pushing, but I won't force you. I think I understood what you meant anyway.
[7 Jan 2011 13:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/128166

3220 Matthias Leich	2011-01-07
      Fix for Bug#47745 innodb.innodb-timeout fails sporadically
      - Second scenario checked:
        Ensure via wait routines that the commit comes after the
        processing of the statement which should get finally
        the ER_LOCK_WAIT_TIMEOUT
        --> This should prevent the current bug.
      - First scenario checked:
        Ensure via wait routines that the statement is already waiting
        for getting the lock before the commit is given.
        --> No effect on the current bug, but ensure that the right
            scenario is reached.
      - Take care that disconnects are finished before the test ends.
        --> Reduce the potential to harm succeeding tests.
      - "Mangle" the printout of the current default innodb_lock_wait_timeout value
        --> No need to adjust the test in case the default gets changed in future.
[7 Jan 2011 16:40] Bugs System
Pushed into mysql-trunk 5.6.2 (revid:matthias.leich@oracle.com-20110107161048-6x3vunwcyt3jl4pe) (version source revid:matthias.leich@oracle.com-20110107132442-7cyrhzi5nz6rh5pd) (merge vers: 5.6.2) (pib:24)
[7 Jan 2011 16:40] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:matthias.leich@oracle.com-20110107133746-6j11qtmifhjq0mw2) (version source revid:matthias.leich@oracle.com-20110107131628-j793lomvqeh55x8k) (merge vers: 5.5.9) (pib:24)
[10 Jan 2011 15:35] Paul Dubois
Changes to test suite. No changelog entry needed.