Bug #32040 Failed ndbrequire in DbtuxScan.cpp causing NDB to shutdown
Submitted: 2 Nov 2007 1:26
Modified: 9 Dec 2013 12:05
Reporter: Steve Kurzeja
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: mysql-5.0
OS: Linux (2.6.17-10)
Assigned to: Assigned Account
CPU Architecture: Any
Tags: 5.0.45, cluster

[2 Nov 2007 1:26] Steve Kurzeja
Description:
I have a recurring error on a production MySQL Cluster. A recent example from the ndb error log:

Time: Thursday 1 November 2007 - 17:04:45
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtuxScan.cpp
Error object: DBTUX (Line: 817) 0x0000000a
Program: /usr/local/mysql/bin/ndbd
Pid: 17041
Trace: /var/lib/mysql/data/ndb_2_trace.log.20
Version: Version 5.0.45
***EOM***

Trace file attached.

The failing source line in DbtuxScan.cpp is inside the Dbtux::scanNext method:

  // cannot be moved away from tuple we have locked
  ndbrequire(scan.m_state != ScanOp::Locked);
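
As a minimal sketch of what this check enforces, the following self-contained model refuses to advance a scan while its state is Locked. Only ndbrequire, scan.m_state and ScanOp::Locked come from the report; the other enum values, requireOrThrow, and the simplified scanNext are hypothetical stand-ins, and the real ndbrequire shuts the data node down rather than throwing.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the scan record. Only `Locked` and `m_state`
// appear in the report; the other states are assumptions for illustration.
struct ScanOp {
  enum State { Current, Locked, Next };
  State m_state = Current;
};

// Hypothetical model of a failed ndbrequire. The real macro stops the data
// node; here an exception is thrown so the failure can be observed.
inline void requireOrThrow(bool cond, const char* what) {
  if (!cond) throw std::logic_error(std::string("failed require: ") + what);
}

// Simplified scanNext: the scan may not move while a tuple is locked.
inline void scanNext(ScanOp& scan) {
  // cannot be moved away from tuple we have locked
  requireOrThrow(scan.m_state != ScanOp::Locked,
                 "scan.m_state != ScanOp::Locked");
  scan.m_state = ScanOp::Next;
}

// Small self-check: an unlocked scan advances, a locked scan trips the check.
inline bool lockedScanTripsCheck() {
  ScanOp scan;
  scan.m_state = ScanOp::Locked;
  try {
    scanNext(scan);
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

In this model, calling scanNext on a scan whose m_state is Locked corresponds to the crash shown in the error log above.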

The cluster configuration is two data nodes, four active SQL nodes and one management node. ndb_mgmd.cnf attached.

The cluster has been running for about 6 months. This error used to occur less frequently (maybe once every 3 weeks) but now it occurs daily. There has been no change to the nature of the load. In some instances both data nodes shut down, resulting in complete loss of service.

The cluster does a lot of ordered index queries, so the error looks to be related to the number of concurrent range scans, especially given that it is failing in DbtuxScan.cpp. I can provide more detail about the nature of the queries as required.

How to repeat:
Unfortunately I cannot reproduce this myself. It occurs intermittently in production, at least once a day.
[16 Mar 2009 18:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69338

2774 Pekka Nousiainen	2009-03-16
      bug#32040 01_dbtux.diff
      scanNext: if Locked, try to unlock instead of crashing
      committed for easy access but not pushed at this time
      not tested since cannot reproduce the bug
      modified:
        ndb/src/kernel/blocks/dbtux/DbtuxScan.cpp
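
The approach described in the commit message ("if Locked, try to unlock instead of crashing") can be sketched roughly as below. This is not the content of 01_dbtux.diff, which is not reproduced in this report; the ScanOp fields other than m_state, the unlockScan helper, and the boolean return value are all hypothetical and exist only to illustrate the idea of recovering from the Locked state rather than failing a hard check.

```cpp
#include <cassert>

// Hypothetical stand-in for the scan record. Only `Locked` and `m_state`
// come from the report; everything else is assumed for illustration.
struct ScanOp {
  enum State { Current, Locked, Next };
  State m_state = Current;
  bool m_lockHeld = false;  // assumed flag: true while a tuple lock is held
};

// Hypothetical helper: release the lock held by the scan and return the
// scan to an unlocked state.
inline void unlockScan(ScanOp& scan) {
  scan.m_lockHeld = false;
  scan.m_state = ScanOp::Current;
}

// Simplified scanNext: where the original code failed a hard check on the
// Locked state, this version unlocks first and then continues.
// Returns true if it had to recover from the Locked state.
inline bool scanNext(ScanOp& scan) {
  bool recovered = false;
  if (scan.m_state == ScanOp::Locked) {
    unlockScan(scan);  // recover instead of crashing the node
    recovered = true;
  }
  scan.m_state = ScanOp::Next;
  return recovered;
}
```

The trade-off in such a design is that the node stays up, at the cost of masking whatever sequence of events left the scan in the Locked state in the first place.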
[9 Dec 2013 12:05] Jon Stephens
Per discussion with developer, it appears this was fixed in NDB 6.3.47, 7.0.28, 7.1.17.

Closed.