Bug #40990 Maria: failure of maria.test & maria_notembedded in deadlock detection
Submitted: 24 Nov 2008 18:52 Modified: 10 Mar 2009 13:52
Reporter: Guilhem Bichot
Status: Closed
Category: MySQL Server: Maria storage engine Severity: S3 (Non-critical)
Version: 5.1-maria, 6.0-maria OS: Solaris (sparc64)
Assigned to: Sergei Golubchik CPU Architecture:Any
Tags: pushbuild, sporadic, test failure

[24 Nov 2008 18:52] Guilhem Bichot
Description:
I have not checked 5.1-maria.
Solaris 10 Sparc 64 debug_max build:

Running:
mysql-test-run.pl --timer --force --comment=ps_stm_threadpool --ps-protocol --mysqld=--binlog-format=statement --mysqld=--thread-handling=pool-of-threads
...
maria.maria [ fail ]

mysqltest: At line 1493: query 'reap' failed with wrong errno 1213: 'Deadlock found when trying to get lock; try restarting transaction', instead of 1062...

How to repeat:
Log in to the failing machine, I guess.
[6 Dec 2008 12:30] Guilhem Bichot
Now that the relevant piece of maria.test moved to maria_notembedded.test, it's that test which fails:
test-max-sol10-sparc64
guilhem@mysql.co...
2008-12-05 22:42:49 
maria.maria_notembedded [ fail ]

mysqltest: At line 50: query 'reap' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 1062...
[16 Dec 2008 9:26] Guilhem Bichot
Sanja ran the test on the failing machine in a loop for 14 hours with no failure, using pushbuild2 binaries of Dec 14.
According to xref, all 5 failures (4 in 6.0-maria, 1 in 5.1-maria, all on Solaris 10 sparc64) occurred between Nov 22 and Dec 5. Shortly after the last failing push, Monty and Serg pushed fixes for memory corruption bugs related to versioning and the transaction manager, which may be why the problem is gone.
We are closing this as "can't repeat" and will reopen it if it fails again.
[16 Dec 2008 12:10] Alexander Nozdrin
Happened again:

https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0&order=107

Symptoms:
mysqltest: At line 46: query 'insert t1 values (3)' failed with wrong errno 1205: 'Lock wait timeout exceeded; try restarting transaction', instead of 1213...
[17 Dec 2008 22:23] Guilhem Bichot
Sent some ideas to Sanja and Serg.
[22 Dec 2008 18:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/62218

2707 Sergei Golubchik	2008-12-22
      Bug#40990 Maria: failure of maria.test & maria_notemebedded in deadlock detection
      detect a case when a blocker has removed itself and signalled after the condition timed out
      but before it (cond_wait) acquired the mutex back
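
The race described in the commit message can be illustrated with a small sketch. This is not the actual patch, and it is written in Python rather than in the server's C code. All names here (`WaitState`, `blocker_present`, `naive_wait`, `checked_wait`, `make_racy_state`) are hypothetical. The idea, as far as the commit message suggests, is that a waiter should not trust the "timed out" result of a condition wait alone, because the blocker may have gone away and signalled between the timeout firing and the waiter reacquiring the mutex. Rechecking the shared state under the mutex avoids reporting a spurious lock wait timeout.

```python
import threading
import time


class WaitState:
    """Hypothetical stand-in for a lock-wait record (not Maria's real structure)."""

    def __init__(self):
        self.cond = threading.Condition(threading.Lock())
        self.blocker_present = True


def release_blocker(state):
    """What the blocker does when it goes away: update state, then signal."""
    with state.cond:
        state.blocker_present = False
        state.cond.notify_all()


def naive_wait(state, timeout):
    """Trusts the return value of wait(): this is the racy pattern."""
    with state.cond:
        if state.blocker_present and not state.cond.wait(timeout):
            return "timeout"
        return "granted"


def checked_wait(state, timeout):
    """Decides from the shared state, rechecked with the mutex held."""
    deadline = time.monotonic() + timeout
    with state.cond:
        while state.blocker_present:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            state.cond.wait(remaining)
        # Even if the last wait() timed out, the blocker may be gone by now.
        return "timeout" if state.blocker_present else "granted"


def make_racy_state():
    """Builds a state whose wait() simulates the race deterministically:
    the wait reports a timeout, but the blocker removed itself before the
    waiter got the mutex back."""
    state = WaitState()

    class RacyCondition(threading.Condition):
        def wait(self, timeout=None):
            state.blocker_present = False  # blocker left during the gap
            return False                   # wait still reports "timed out"

    state.cond = RacyCondition(threading.Lock())
    return state


if __name__ == "__main__":
    print(naive_wait(make_racy_state(), 0.05))
    print(checked_wait(make_racy_state(), 0.05))
```

With the simulated race, `naive_wait` reports a timeout even though the blocker is gone, while `checked_wait` notices that the blocker has been removed. Whether this is exactly the path that produced errno 1205 in the failing tests is not established by this report.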
[7 Jan 2009 21:53] Guilhem Bichot
We cannot be sure that the problem fixed by this patch is the cause of the observed symptoms (those symptoms are rare, timing-dependent: hard to repeat), but it's quite possible.
[17 Feb 2009 11:47] Bugs System
Pushed into 6.0.10-alpha (revid:serg@mysql.com-20090217113558-vpsqsyjule7nz0gk) (version source revid:guilhem@mysql.com-20090213163054-rsg204z5qzcekbfe) (merge vers: 6.0.10-alpha) (pib:6)