Bug #25286 NDB data node crashed in DBLQH, Line 2483
Submitted: 26 Dec 2006 21:10
Modified: 14 Mar 2007 10:08
Reporter: Anatoly Pidruchny (Candidate Quality Contributor)
Status: Can't repeat
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 5.1.14
OS: Linux (Linux x86_64)
Assigned to: Jonas Oreland
CPU Architecture: Any
Tags: qc

[26 Dec 2006 21:10] Anatoly Pidruchny
Description:
The ndbd process crashed after about 12 days of uptime. The error log is:

Current byte-offset of file-pointer is: 568                       

Time: Tuesday 26 December 2006 - 10:18:18
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: dblqh/DblqhMain.cpp
Error object: DBLQH (Line: 2483) 0x0000000e
Program: ndbd
Pid: 15041
Trace: /sm/mysql/ndb_data/ndb_3_trace.log.1
Version: Version 5.1.14 (beta)
***EOM***

How to repeat:
It is unknown what exactly triggered the crash or whether it can be reproduced. Hopefully the attached config.ini file and logs will help to find the root cause of the problem.
[26 Dec 2006 21:17] Anatoly Pidruchny
Cluster configuration file

Attachment: config.ini (application/octet-stream, text), 861 bytes.

[26 Dec 2006 21:17] Anatoly Pidruchny
Trace log of the crashed data node

Attachment: ndb_3_trace.log.1.gz (application/x-gzip, text), 64.58 KiB.

[26 Dec 2006 21:18] Anatoly Pidruchny
Out log of the crashed data node

Attachment: ndb_3_out.log.gz (application/x-gzip, text), 20.89 KiB.

[26 Dec 2006 21:18] Anatoly Pidruchny
Cluster log file

Attachment: ndb_1_cluster.log.gz (application/x-gzip, text), 67.47 KiB.

[22 Jan 2007 21:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18583

ChangeSet@1.2375, 2007-01-22 22:05:56+01:00, jonas@eel.(none) +2 -0
  ndb - bug#25286
    - add some sanity check to marker/hash code to see that element isnt inserted twice into hashtable
      (if defined VM_TRACE or ERROR_INSERT)
  
    - allow REMOVE_MARKER_ORD to fail(dont find record) in release
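
The two changes described in this commit message can be illustrated with a minimal sketch. This is not the NDB code: `Marker`, `MarkerTable`, and the use of `std::unordered_map` are hypothetical stand-ins chosen for illustration. Only the `VM_TRACE` and `ERROR_INSERT` macro names come from the commit message. The sketch also assumes that debug builds still treat a missing record on removal as an error, which the commit message implies but does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a marker record; not the actual NDB type.
struct Marker {
    std::uint32_t transid1;
    std::uint32_t transid2;
};

// Hypothetical hash table wrapper; the real code uses NDB's own hash table.
class MarkerTable {
public:
    // Returns false if a marker with this key is already present.
    bool insert(std::uint64_t key, const Marker& m) {
#if defined(VM_TRACE) || defined(ERROR_INSERT)
        // Debug builds only: inserting the same element twice is a bug.
        assert(table_.find(key) == table_.end() && "marker inserted twice");
#endif
        return table_.emplace(key, m).second;
    }

    // Returns true if the marker was found and removed.
    bool remove(std::uint64_t key) {
        auto it = table_.find(key);
        if (it == table_.end()) {
#if defined(VM_TRACE) || defined(ERROR_INSERT)
            // Assumption: debug builds still flag a missing record.
            assert(false && "marker to remove not found");
#endif
            return false;  // Release builds: tolerate the missing record.
        }
        table_.erase(it);
        return true;
    }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_map<std::uint64_t, Marker> table_;
};
```

Built without either macro, a second `remove` of the same key simply returns `false` instead of bringing the process down. Built with `-DERROR_INSERT`, both the double insert and the failed removal trip an assertion, so the faulty caller is caught at the point of the mistake.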
[22 Jan 2007 21:11] Jonas Oreland
Hi,

sorry for the late reply on this bug report.
i analyzed the logs carefully without finding the bug :-(

the bug is related to api 5 disconnecting... and the code for this has not changed since before the 4.1 release...

in the patch attached to this bug, I added some extra checks
that will be run if VM_TRACE or ERROR_INSERT is defined.

if you could test this using NDB_EXTRA_FLAGS="-DERROR_INSERT" that would be great.

i'll run this in our internal tests,
but I don't find it likely that I'll find something... as I have never seen this crash before...

---

How is your cluster going overall?

/Jonas
[22 Jan 2007 21:43] Anatoly Pidruchny
Hi, Jonas,

we are using the MySQL/NDB version 5.1.14 with two patches for bug 24664 applied (these patches should be included in 5.1.15). The Cluster is doing very well! We practically do not have any problems with NDB any more. There were no crashes for almost a month.

I think it does not make sense for me to build the software with NDB_EXTRA_FLAGS="-DERROR_INSERT", because the crash is not reproducible. I do not mind if you just close this bug report. I just hoped that the information in the logs would be enough to find the bug in the code.

Thanks,

/Anatoly.
[7 Feb 2007 17:11] Tomas Ulin
pushed to 5.1.16
[14 Mar 2007 10:08] Jonas Oreland
i'll close this as can't repeat...
as I can't repeat it...

please reopen if you get more info/problems

/Jonas