Bug #47966 Failure at LGMAN line 1342 when using ndbmtd with (too) small undo-buffer-size
Submitted: 10 Oct 2009 4:50
Modified: 14 Oct 2009 14:43
Reporter: John David Duncan
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: mysql-5.1-telco-7.0
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any
Tags: 7.0.8a

[10 Oct 2009 4:50] John David Duncan
Description:

2009-10-09 
  20:16:57  Node 9 fails Error 2341  LGMAN (Line: 1342) thr 3
  20:16:57  Node 12 fails Error 2341  LGMAN (Line: 1342) thr 2
  20:16:57  Node 8  fails Error 2341  LGMAN (Line: 1342) thr 3

  20:17:07  Detected GCP stop
  20:17:07  Node 11 killed due to GCP stop

  20:17:08  Arbitrator shuts down nodes 5/6/7/10 

To unpack the attached crash logs, concatenate the two parts first:

cat test8.tgz.aa test8.tgz.ab > test8.tar.gz
tar xzf test8.tar.gz

How to repeat:
unknown
[10 Oct 2009 4:54] John David Duncan
First part of crash logs

Attachment: test8.tgz.aa (application/octet-stream, text), 500.00 KiB.

[10 Oct 2009 4:56] John David Duncan
Second part of crash logs.  (cat together to make test8.tar.gz)

Attachment: test8.tgz.ab (application/octet-stream, text), 430.00 KiB.

[13 Oct 2009 12:52] Andy Lintner
Experienced the same crash myself this morning.

Time: Tuesday 13 October 2009 - 08:34:23
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: lgman.cpp
Error object: LGMAN (Line: 1342) 0x0000000a
Program: /usr/local/mysql//mysql/bin//ndbmtd
Pid: 3647 thr: 3
Version: mysql-5.1.37 ndb-7.0.8
Trace: /data/mysqlcluster//ndb_3_trace.log.10 /data/mysqlcluster//ndb_3_trace.log.10_t1 /data/mysqlcluster//ndb_3
[13 Oct 2009 12:53] Andy Lintner
Trace files from the crash

Attachment: trace.tar.gz (application/x-gzip, text), 285.62 KiB.

[14 Oct 2009 11:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86789

3123 Jonas Oreland	2009-10-14
      ndb - bug#47966
        ndbmtd can over allocate undo-buffer, prevent this by keeping track
          of how much has been promised, but not yet consumed
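
The idea in the commit message above (track what has been promised but not yet consumed) can be illustrated with a minimal sketch. This is not the committed patch and not LGMAN code: the class name UndoBufferAccount, its methods, and the use of "words" as the unit are all hypothetical, and the sketch is single-threaded, so it leaves out whatever synchronization ndbmtd itself uses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; names and units are assumptions,
// not taken from lgman.cpp or from the patch for this bug.
class UndoBufferAccount {
public:
    explicit UndoBufferAccount(std::size_t total_words)
        : free_words_(total_words) {}

    // Promise `words` to a caller only if that much space remains
    // after every earlier, still-unconsumed promise is honored.
    bool reserve(std::size_t words) {
        if (words > available()) {
            return false;  // would over-allocate the buffer
        }
        promised_words_ += words;
        return true;
    }

    // The caller actually writes `words` that it reserved earlier.
    void consume(std::size_t words) {
        assert(words <= promised_words_);
        promised_words_ -= words;
        free_words_ -= words;
    }

    // Space that has been written out returns to the free pool.
    void release(std::size_t words) { free_words_ += words; }

    // Free space minus outstanding promises. reserve() keeps
    // promised_words_ <= free_words_, so this cannot underflow.
    std::size_t available() const { return free_words_ - promised_words_; }

private:
    std::size_t free_words_;
    std::size_t promised_words_ = 0;
};
```

With a 100-word buffer, two callers that each ask for 60 words would both succeed if the check looked only at free space, because neither has written anything yet. With the promised counter, the second reserve(60) is refused while the first promise is outstanding.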
[14 Oct 2009 13:13] Jonas Oreland
pushed to 7.0.9 and 7.1
[14 Oct 2009 14:43] Jon Stephens
Documented bugfix in the NDB-7.0.9 changelog as follows:

        In some cases, ndbmtd could allocate more space for the undo 
        buffer than was actually available, leading to a failure in 
        the LGMAN kernel block and subsequent failure of the data node.

Closed.
[15 Oct 2009 18:36] John David Duncan
After this fix is applied, what will the behavior be when a tablespace is configured with an UNDO buffer that is too small?
[16 Oct 2009 5:03] Jonas Oreland
Most likely a GCP stop (if running a sufficiently big transaction).
[26 Apr 2011 12:45] Jon Stephens
See also BUG#60946.