| Bug #47966 | Failure at LGMAN line 1342 when using ndbmtd with (too) small undo-buffer-size | ||
|---|---|---|---|
| Submitted: | 10 Oct 6:50 | Modified: | 14 Oct 16:43 |
| Reporter: | John David Duncan | ||
| Status: | Closed | ||
| Category: | Server: Cluster | Severity: | S2 (Serious) |
| Version: | mysql-5.1-telco-7.0 | OS: | Any |
| Assigned to: | Jonas Oreland | Target Version: | |
| Tags: | 7.0.8a | ||
| Triage: | Triaged: D1 (Critical) / R6 (Needs Assessment) / E6 (Needs Assessment) | ||
[10 Oct 6:50]
John David Duncan
[10 Oct 6:54]
John David Duncan
First part of crash logs
Attachment: test8.tgz.aa (application/octet-stream, text), 500.00 KiB.
[10 Oct 6:56]
John David Duncan
Second part of crash logs. (cat together to make test8.tar.gz)
Attachment: test8.tgz.ab (application/octet-stream, text), 430.00 KiB.
[13 Oct 14:52]
Andy Lintner
Experienced the same crash myself this morning.
Time: Tuesday 13 October 2009 - 08:34:23
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: lgman.cpp
Error object: LGMAN (Line: 1342) 0x0000000a
Program: /usr/local/mysql//mysql/bin//ndbmtd
Pid: 3647 thr: 3
Version: mysql-5.1.37 ndb-7.0.8
Trace: /data/mysqlcluster//ndb_3_trace.log.10 /data/mysqlcluster//ndb_3_trace.log.10_t1 /data/mysqlcluster//ndb_3
[13 Oct 14:53]
Andy Lintner
Trace files from the crash
Attachment: trace.tar.gz (application/x-gzip, text), 285.62 KiB.
[14 Oct 13:45]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version.
You can access the patch from: http://lists.mysql.com/commits/86789
3123 Jonas Oreland 2009-10-14
ndb - bug#47966 ndbmtd can over allocate undo-buffer, prevent this by keeping track of how much has been promised, but not yet consumed
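The idea in the commit message (track what has been promised but not yet consumed) can be illustrated with a small toy model. This is only a sketch of the general reservation-accounting technique, not the actual LGMAN code: the class `UndoBuffer`, its methods, and the word-based sizes are all hypothetical names chosen for illustration.

```python
class UndoBuffer:
    """Toy model of a fixed-size undo buffer (hypothetical, not NDB code)."""

    def __init__(self, total_words):
        self.total = total_words
        self.consumed = 0  # words already written into the buffer
        self.promised = 0  # words reserved for callers but not yet written

    def free_words(self):
        # Space that is neither written nor already promised to someone.
        return self.total - self.consumed - self.promised

    def reserve(self, words):
        """Promise space to a caller; refuse if it would over-allocate."""
        if words > self.free_words():
            return False
        self.promised += words
        return True

    def consume(self, words):
        """Turn previously promised space into written log data."""
        if words > self.promised:
            raise ValueError("consuming more than was reserved")
        self.promised -= words
        self.consumed += words

    def flush(self, words):
        """Model log data draining to disk, which frees buffer space."""
        words = min(words, self.consumed)
        self.consumed -= words
        return words


buf = UndoBuffer(100)
first = buf.reserve(60)   # accepted: 100 words are free
second = buf.reserve(60)  # refused: only 40 words are left unpromised
buf.consume(60)           # the first caller writes its log records
buf.flush(60)             # the records reach disk, space is released
third = buf.reserve(60)   # accepted again
```

If `free_words` ignored `promised`, the second reservation would also be accepted, and the two callers together would try to write 120 words into a 100-word buffer. That is the kind of over-allocation the commit message describes.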
[14 Oct 15:13]
Jonas Oreland
pushed to 7.0.9 and 7.1
[14 Oct 16:43]
Jon Stephens
Documented bugfix in the NDB-7.0.9 changelog as follows:
In some cases, ndbmtd could allocate more space for the undo
buffer than was actually available, leading to a failure in
the LGMAN kernel block and subsequent failure of the data node.
Closed.
[15 Oct 20:36]
John David Duncan
After this fix is applied, what will the behavior be when a tablespace is configured with an UNDO buffer that is too small?
[16 Oct 7:03]
Jonas Oreland
Most likely gcp-stop (if running a sufficiently big transaction).
