Bug #21800 scan timeouts do not release scan records causing cluster to hang
Submitted: 23 Aug 2006 21:13 Modified: 6 Sep 2006 12:01
Reporter: Jonathan Miller
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S1 (Critical)
Version: 4.1, 5.0, 5.1 OS: Linux (Linux)
Assigned to: Jonas Oreland CPU Architecture: Any

[23 Aug 2006 21:13] Jonathan Miller
Description:
REF: http://bugs.mysql.com/bug.php?id=21124

Transaction scans would get deadlocked and would time out after 2 min.
When the scans time out, they do not release their scan records. After a
while, the cluster runs out of scan records and stops processing. Basically the
cluster is useless at this point until restarted.

How to repeat:
REF: http://bugs.mysql.com/bug.php?id=21124

Suggested fix:
Make sure records are released when a scan times out.
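
To illustrate the failure mode and the suggested fix, here is a minimal toy model in Python. It is not NDB code: ScanRecordPool, ScanTimeout, leaky_scan, safe_scan and run_scans are all hypothetical names made up for this sketch, and every scan is simply assumed to time out. It only shows why a fixed pool drains when the timeout path skips the release, and how releasing on every exit path avoids that.

```python
class ScanRecordPool:
    """Toy fixed-size pool; a stand-in for the cluster's scan records."""

    def __init__(self, size):
        self.free = size

    def acquire(self):
        if self.free == 0:
            raise RuntimeError("out of scan records")
        self.free -= 1

    def release(self):
        self.free += 1


class ScanTimeout(Exception):
    """Raised when a (simulated) scan times out."""


def leaky_scan(pool):
    # Behaviour described in the report: the timeout path never
    # hands the record back to the pool.
    pool.acquire()
    raise ScanTimeout()


def safe_scan(pool):
    # Suggested fix: release the record on every exit path,
    # including the timeout.
    pool.acquire()
    try:
        raise ScanTimeout()
    finally:
        pool.release()


def run_scans(scan, pool, attempts):
    """Return how many scans could start before the pool ran dry."""
    started = 0
    for _ in range(attempts):
        try:
            scan(pool)
        except ScanTimeout:
            started += 1  # the scan started, then timed out
        except RuntimeError:
            break  # pool exhausted; nothing else can start
    return started
```

With a pool of 3 records, run_scans(leaky_scan, ScanRecordPool(3), 10) stops after 3 scans, while run_scans(safe_scan, ScanRecordPool(3), 10) completes all 10 attempts.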
[23 Aug 2006 21:15] Jonathan Miller
This may be downgraded to a P2, as there is a workaround of increasing MaxNoOfConcurrentScans and decreasing TransactionDeadlockDetectionTimeout, but since it does cause a hang, we wanted to start out with P1.
[24 Aug 2006 5:05] Jonas Oreland
The bug will only surface if TransactionDeadlockDetectionTimeout is > 120000 (ms),
which is insane in the first place.

lowering prio
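
A quick way to check whether a given cluster configuration falls into the affected range is sketched below. This is a hedged example, not an official tool: is_affected, SAMPLE_CONFIG and the values in it are illustrative assumptions, and it assumes the parameters live in an [ndbd default] section of a config.ini that Python's configparser can read. The 120000 threshold is the one quoted in this comment.

```python
import configparser

# Threshold quoted in this comment, in milliseconds (2 minutes).
SCAN_TIMEOUT_THRESHOLD_MS = 120000

# Illustrative config.ini fragment; the values are made up for this sketch.
SAMPLE_CONFIG = """
[ndbd default]
MaxNoOfConcurrentScans = 500
TransactionDeadlockDetectionTimeout = 150000
"""


def is_affected(config_text, section="ndbd default"):
    """Return True if the configured timeout exceeds the threshold."""
    # strict=False tolerates repeated sections, which a cluster
    # config.ini may contain (for example several [ndbd] blocks).
    parser = configparser.ConfigParser(strict=False)
    parser.read_string(config_text)
    timeout_ms = parser.getint(
        section, "TransactionDeadlockDetectionTimeout", fallback=None
    )
    # A missing setting means the default applies, which this sketch
    # treats as not affected.
    return timeout_ms is not None and timeout_ms > SCAN_TIMEOUT_THRESHOLD_MS
```

Calling is_affected(SAMPLE_CONFIG) returns True for the sample above; lowering the timeout to 120000 or less, as in the workaround mentioned earlier, makes it return False.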
[24 Aug 2006 5:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10808

ChangeSet@1.2543, 2006-08-24 07:14:46+02:00, jonas@perch.ndb.mysql.com +1 -0
  ndb - bug#21800
    read TransactionDeadlockTimeout (for scans) to cater for insane settings
[24 Aug 2006 5:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10811

ChangeSet@1.2282, 2006-08-24 07:25:54+02:00, jonas@perch.ndb.mysql.com +1 -0
  ndb - 
    bug#21800 - 5.0 -> 5.1 merge
[24 Aug 2006 16:59] Omer Barnir
This change is more of an admin note, since the issue is fixed.

Not sure if the 2 minute config is 'insane'. I agree that this is somewhat of a corner case, so although it leads to a hang it shouldn't be a P1, but because of the hang it should not go down to a P3.
[1 Sep 2006 8:05] Jonas Oreland
pushed to 5.1.12
[6 Sep 2006 7:07] Jonas Oreland
pushed into 4.1.22 and 5.0.25
[6 Sep 2006 12:01] Jon Stephens
Documented bugfix in 4.1.22/5.0.25/5.1.12 changelogs.