Bug #43068 data node fails on startup with error 2306: pointer too large
Submitted: 20 Feb 2009 21:47 Modified: 8 Apr 2009 23:00
Reporter: Robin McMillon
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S1 (Critical)
Version: 6.3.20 OS: Solaris (10 x86)
Assigned to: Assigned Account CPU Architecture: Any

[20 Feb 2009 21:47] Robin McMillon
This bug is related to Bug 38871 (http://bugs.mysql.com/bug.php?id=38871)

Cluster is comprised of: 6 Sun x4150
Each is running:
OS: OpenSolaris x86
xVM w/ one domain - cluster is the only app running on these machines
MySQL Cluster: 6.3.20 (MySQL built tarball)

Cluster setup:
2 x4150 as management and SQL nodes (Nodes 1 & 2: management, Nodes 7 & 8: SQL)
4 x4150 as data nodes (Nodes 3, 4, 5, 6)

On startup, node 5 fails with the following error:

Time: Thursday 19 February 2009 - 15:01:02
Status: Temporary error, restart node
Message: Pointer too large (Internal error, programming error or missing error message, please report a bug)
Error: 2306
Error data: dbdih/DbdihMain.cpp
Error object: DBDIH (Line: 14764) 0x0000000a
Program: ndbd
Pid: 9806
Trace: /z1/mysql/DATA/6.3-cluster/ndb_5_trace.log.3
Version: mysql-5.1.30 ndb-6.3.20-GA
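
When the same failure shows up across several restarts, it can help to pull the key fields out of each error log entry and compare them. The following is only a minimal sketch, assuming entries keep the "Key: value" layout shown above; the helper names `parse_entry` and `summarize` are hypothetical and not part of any MySQL Cluster tool.

```python
import re

# Sample entry copied verbatim from the error log excerpt in this report.
ENTRY = """\
Time: Thursday 19 February 2009 - 15:01:02
Status: Temporary error, restart node
Message: Pointer too large (Internal error, programming error or missing error message, please report a bug)
Error: 2306
Error data: dbdih/DbdihMain.cpp
Error object: DBDIH (Line: 14764) 0x0000000a
Program: ndbd
Pid: 9806
Trace: /z1/mysql/DATA/6.3-cluster/ndb_5_trace.log.3
Version: mysql-5.1.30 ndb-6.3.20-GA
"""


def parse_entry(text):
    """Split 'Key: value' lines into a dict, splitting on the first ': ' only."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that do not follow the assumed layout
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Return (error code, kernel block, source line) for quick comparison."""
    match = re.match(r"(\w+) \(Line: (\d+)\)", fields["Error object"])
    block, line_no = (match.group(1), int(match.group(2))) if match else (None, None)
    return int(fields["Error"]), block, line_no


fields = parse_entry(ENTRY)
print(summarize(fields))  # (2306, 'DBDIH', 14764)
```

Two entries with the same summary tuple most likely hit the same check in the same kernel block, which is what the repeated failure after the second restart attempt suggests here.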

Nothing had changed in Node 5's configuration between the last successful restart and this failure, so I attempted to restart the cluster. No luck: same error.

I then checked the MySQL bug system.  This appears to be a problem that has been reported in various versions since 4.1.x with no resolution.  The reporter of Bug 38871 said that restarting the failing node with --initial fixed the problem, so I attempted that and it worked.

Workaround: --initial restart of Node 3 allows the startup.

How to repeat:
Relevant trace file is attached.

Suggested fix:

[20 Feb 2009 21:51] Robin McMillon
first time this error came up

Attachment: ndb_5_trace.log.2.gz (application/x-gzip, text), 32.74 KiB.

[20 Feb 2009 21:52] Robin McMillon
Tried to restart, got the same error (before trying --initial)

Attachment: ndb_5_trace.log.3.gz (application/x-gzip, text), 46.19 KiB.

[23 Feb 2009 11:07] Jonas Oreland

To proceed on this, we need the cluster log and config.ini
(the error log wouldn't hurt either).

From what I can see in the trace file, there doesn't seem to be a bug,
but rather that the nodes have been restarted in a sequence that made it impossible
for the node to come up.

Setting status to need feedback

[6 Mar 2009 16:08] Robin McMillon
I've uploaded the error log for node 5 and the config.ini for the cluster.  Unfortunately the cluster log has aged out.  If I see it again, I will add *all* the relevant files.

Also, "Workaround: --initial restart of Node 3 allows the startup." should have said Node 5 instead.  When you see the cluster come back up, that was after I performed the 'ndbd --initial' on Node 5.
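
For anyone scripting the same workaround, here is a rough sketch of the command line involved. It assumes the management hosts are reachable under the placeholder names `mgm1.example` and `mgm2.example` (not taken from this cluster's config.ini), and the helper `initial_restart_argv` is hypothetical. Note that `--initial` discards the data node's local NDB file system, so the node has to copy its data back from the other node in its node group; it should only be used while that partner is up.

```python
import shlex


def initial_restart_argv(node_id, mgm_hosts):
    """Build the argv for starting one data node with --initial.

    node_id   -- NDB node id of the failing data node (5 in this report)
    mgm_hosts -- management server hosts (placeholder names, an assumption)
    """
    return [
        "ndbd",
        "--initial",  # wipe this node's on-disk NDB file system before starting
        f"--ndb-nodeid={node_id}",
        f"--ndb-connectstring={','.join(mgm_hosts)}",
    ]


argv = initial_restart_argv(5, ["mgm1.example", "mgm2.example"])
# Print the command for review instead of running it; pass argv to
# subprocess.run(argv, check=True) on the data node host to execute it.
print(shlex.join(argv))
```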

[8 Mar 2009 8:08] Jonas Oreland
Setting back to waiting on feedback,
waiting for more logs if the problem occurs again.

[8 Apr 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".