MySQL Bugs: #47966: Failure at LGMAN line 1342 when using ndbmtd with (too) small undo-buffer-size

Bug #47966	Failure at LGMAN line 1342 when using ndbmtd with (too) small undo-buffer-size
Submitted:	10 Oct 2009 4:50	Modified:	14 Oct 2009 14:43
Reporter:	John David Duncan
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	mysql-5.1-telco-7.0	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any
Tags:	7.0.8a

Description:

2009-10-09 
  20:16:57  Node 9 fails Error 2341  LGMAN (Line: 1342) thr 3
  20:16:57  Node 12 fails Error 2341  LGMAN (Line: 1342) thr 2
  20:16:57  Node 8  fails Error 2341  LGMAN (Line: 1342) thr 3

  20:17:07  Detected GCP stop
  20:17:07  Node 11 killed due to GCP stop

  20:17:08  Arbitrator shuts down nodes 5/6/7/10 

To reassemble the attached crash logs:

cat test8.tgz.aa test8.tgz.ab > test8.tar.gz
tar xzf test8.tar.gz

How to repeat:
Unknown.

First part of crash logs

Attachment: test8.tgz.aa (application/octet-stream, text), 500.00 KiB.

Second part of crash logs.  (cat together to make test8.tar.gz)

Attachment: test8.tgz.ab (application/octet-stream, text), 430.00 KiB.

Experienced the same crash myself this morning.

Time: Tuesday 13 October 2009 - 08:34:23
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: lgman.cpp
Error object: LGMAN (Line: 1342) 0x0000000a
Program: /usr/local/mysql//mysql/bin//ndbmtd
Pid: 3647 thr: 3
Version: mysql-5.1.37 ndb-7.0.8
Trace: /data/mysqlcluster//ndb_3_trace.log.10 /data/mysqlcluster//ndb_3_trace.log.10_t1 /data/mysqlcluster//ndb_3

Trace files from the crash

Attachment: trace.tar.gz (application/x-gzip, text), 285.62 KiB.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86789

3123 Jonas Oreland	2009-10-14
      ndb - bug#47966
        ndbmtd can over allocate undo-buffer, prevent this by keeping track
          of how much has been promised, but not yet consumed
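
The idea in the commit message above (tracking space that has been promised but not yet consumed) can be sketched roughly as follows. This is only an illustration under assumptions: the class name UndoBufferBudget, its methods, and the use of "words" as the unit are hypothetical, and nothing here is taken from lgman.cpp or from the committed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical sketch: NOT the LGMAN code and NOT the committed patch.
// A request is granted only if it fits next to both the space already
// in use and the space already promised to other callers.
class UndoBufferBudget {
public:
    explicit UndoBufferBudget(std::size_t capacity_words)
        : capacity_(capacity_words) {}

    // Promise space to a caller. Without counting promised_, several
    // threads could each be told "there is room" for the same free space.
    bool try_reserve(std::size_t words) {
        std::lock_guard<std::mutex> guard(lock_);
        if (words > capacity_ - used_ - promised_) return false;
        promised_ += words;
        check();
        return true;
    }

    // The caller writes into space it was promised: promised -> used.
    bool consume(std::size_t words) {
        std::lock_guard<std::mutex> guard(lock_);
        if (words > promised_) return false;
        promised_ -= words;
        used_ += words;
        check();
        return true;
    }

    // Used space has been flushed and becomes free again.
    bool release(std::size_t words) {
        std::lock_guard<std::mutex> guard(lock_);
        if (words > used_) return false;
        used_ -= words;
        check();
        return true;
    }

    // Space that is neither in use nor promised.
    std::size_t free_words() const {
        std::lock_guard<std::mutex> guard(lock_);
        return capacity_ - used_ - promised_;
    }

private:
    // Invariant that keeps the subtractions above from underflowing.
    void check() const { assert(used_ + promised_ <= capacity_); }

    mutable std::mutex lock_;
    std::size_t capacity_;
    std::size_t used_ = 0;
    std::size_t promised_ = 0;
};
```

With a budget of 100 words, a reservation of 60 succeeds and a following reservation of 50 is refused, even though nothing has been written yet. A caller whose reservation is refused has to wait or fail rather than overrun the buffer.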

Pushed to 7.0.9 and 7.1.

Documented bugfix in the NDB-7.0.9 changelog as follows:

        In some cases, ndbmtd could allocate more space for the undo 
        buffer than was actually available, leading to a failure in 
        the LGMAN kernel block and subsequent failure of the data node.

Closed.

After this fix is applied, what will the behavior be when a tablespace is configured with an UNDO buffer that is too small?

Most likely a GCP stop
(if running a sufficiently big transaction).

See also BUG#60946.