Bug #37994 InnoDb waits for lock when DELETE from a table previously in a interrupted trans
Submitted: 9 Jul 2008 16:46 Modified: 24 Jan 9:51
Reporter: Fredrik Wallroth
Status: No Feedback
Category:Server: InnoDB Severity:S2 (Serious)
Version:5.0.51a-community OS:Any (seen on windows server 2003 and linux)
Assigned to: Target Version:

[9 Jul 2008 16:46] Fredrik Wallroth
Description:
An insert that runs in a transaction is interrupted by network problems just before
commit.

Afterwards the following command:
DELETE FROM test;

(where test is table beeing inserted into) times out with the following error message:
"Lock wait timeout exceeded; try restarting transaction"

* It does not matter if the delete command is run from another machine not previously
connected to the database server.
* It does not help to have a false where clause on the delete query either.
* Selects against the table works fine.
* The problems seem to linger until the database is restarted, but it also seems to go
away by itself if waiting for several hours.

Workarounds:
* Don't use InnoDb.
* Don't use transactions.
* Restart database server every time the problem occurs.
* Drop the and recreate the table every time the problem occurs.

How to repeat:
Follow the steps to reproduce bug 37991:
http://bugs.mysql.com/37991

After the last step, open up a db client, connect to the database and enter the query:

delete from test;

Expected results:
* The query executes immediately and deletes 0 rows (since the table shall be empty). 
* The query is not waiting for any locks from an insert running on a connection that has
been broken.

Actual results:
* The query hangs for about 50 seconds (the timeout param amount of time) and the fails
with:
"Lock wait timeout exceeded; try restarting transaction"

Suggested fix:
Rollback transactions from broken connections.

If this is "as designed" and the broken connection (and it's transaction) is suppesed to
be kept alive very long, this is instead an error of the Connector/NET where it is not
possible get hold of and reset the "broken connection" which seem to remain in the pool
after network problems has occured during commit. See related bug 37991:
http://bugs.mysql.com/37991.
[10 Jul 2008 16:39] Susanne Ebrecht
This looks more like a .net bug.
[28 Jul 2008 18:19] Ken Jacobs
Can someone please test again, with code that contains the fix for
http://bugs.mysql.com/bug.php?id=37991?

That fix addresses issues with the driver code, and not InnoDB.  It seems this related
problem may also not be InnoDB-specific.
[30 Oct 2008 12:20] Fredrik Wallroth
I'v now tested this on the connector v 5.2.3.0 and get exactly the same problem.

(It would be strange if a fix in the connecter that fixes something that occurs after the
network connection to the db server has been broken could affect the server in any way).
[15 Dec 2008 12:53] Axel Schwenke
Fredrik,

this looks like normal behavior. I think what happens is this:

1. client connects to MySQL, adds rows to the table. The client gets locks on the new
records.

2. before the client commits the transaction (and thus releases the locks) the connection
between MySQL server and client breaks down.

3. new client connects, tries to delete all records from table.

4. client gives up after waiting unsuccessfully for the necessary locks.

The MySQL server has two measures to cope with that kind of problem:

wait_timeout: if a client is inactive for longer than wait_timeout seconds, the
connection is closed. Locks will be released, uncommitted transactions rolled back. This
timeout defaults to 8 hours which is probably longer than necessary for most users.
Timeout can be set globally and also per session.

TCP keepalive: the MySQL server enables TCP keepalive on the socket. This will make the
TCP stack send probe packets after a certain inactivity period and spot the broken
detection after not receiving answers. Timeout parameters are taken from the operating
system and are typically way too relaxed. Initial timeout is in the order of 2 hours. Can
be tuned in the Windows registry or in Linux sysfs.

You can check the above theory by running SHOW INNODB STATUS while the client is waiting
for the locks. It should show the delete waiting for locks and another transaction
holding the locks. SHOW PROCESSLIST will show the respective client connections. If the
dead connection is KILLed, the delete will succeed.
[15 Dec 2008 13:58] Fredrik Wallroth
Thank you Axel, we will investigate your proposals which probably is what we need to
handle our problem!

BR

Fredrik Wallroth
[24 Dec 2008 9:51] Valeriy Kravchuk
Please, inform about any results of your investigation.
[25 Jan 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".