Bug #70037 ndbd crash by error code 2341
Submitted: 14 Aug 2013 17:29
Modified: 16 Sep 2014 11:08
Reporter: Bill Boatman
Status: Not a Bug
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 7.2.12
OS: Linux (RedHat 5.9)
Assigned to: Gustaf Thorslund
CPU Architecture: Any

[14 Aug 2013 17:29] Bill Boatman
Description:
We recently implemented a MySQL cluster consisting of 2 management nodes, 2 API nodes, and 1 data node, with a second data node in progress. All nodes are installed on physical servers running RedHat 5.9. The physical systems have 8 cores and 16 GB RAM each.

The cluster ran fine for over a month during testing with very little load. 

When the cluster was put into production use, we received an error after about 24 hours of running.

Time: Wednesday 14 August 2013 - 08:29:49
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: Dbdict.cpp
Error object: DBDICT (Line: 27149) 0x00000002
Program: ndbmtd
Pid: 27392 thr: 0
Version: mysql-5.5.30 ndb-7.2.12
Trace: /mysql/data/ndb_10_trace.log.5 [t1..t7]
***EOM***
 
2013-08-14 08:29:49 [ndbd] INFO     -- /export/home/pb2/build/sb_0-8660699-1363117723.09/rpm/BUILD/mysql-cluster-gpl-7.2.12/mysql-cluster-gpl-7.2.12/storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
2013-08-14 08:29:49 [ndbd] INFO     -- DBDICT (Line: 27149) 0x00000002
2013-08-14 08:29:49 [ndbd] INFO     -- Error handler shutting down system
2013-08-14 08:29:49 [ndbd] INFO     -- Error handler shutdown completed - exiting
2013-08-14 08:29:52 [ndbd] ALERT    -- Node 10: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

After getting the node to restart we had complete data loss.

How to repeat:
Unknown
[19 Aug 2013 9:16] Gustaf Thorslund
Hi Bill,

In your config.ini file I see:
NoOfReplicas=1 # Number of replicas should be 2

In your ndb_1_cluster.log I see:
2013-08-14 08:11:48 [MgmtSrvr] INFO     -- Node 11: reorg-copy table 242 processed 0 rows
2013-08-14 08:11:49 [MgmtSrvr] INFO     -- Node 10: reorg-copy table 242 processed 4140 rows
2013-08-14 08:11:56 [MgmtSrvr] INFO     -- Node 11: reorg-delete table 242 processed 0 rows
2013-08-14 08:11:57 [MgmtSrvr] INFO     -- Node 10: reorg-delete table 242 processed 3628 rows

Are you trying to do an online addition of your second data node here? In that case you would still end up with NoOfReplicas=1 when done. If you look at:
  http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-online-add-node.html
there is a note there:

-->
...  In addition, it is not possible to change the number of replicas (or the number of nodes per node group) online. 
-->
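
As a rough illustration of the point above, here is a minimal Python sketch that reads a config.ini and reports what NoOfReplicas means for the data nodes defined in it. The function name `summarize_cluster_config` and the sample text are hypothetical (only the NoOfReplicas line is taken from the config quoted above), and the sketch assumes `#` comments and `[ndbd default]` / `[ndbd]` section names. It relies on the documented rule that the number of node groups is the number of data nodes divided by NoOfReplicas.

```python
def summarize_cluster_config(text):
    """Return (NoOfReplicas or None, number of [ndbd] sections) for config.ini text."""
    replicas = None          # None means the parameter was not found
    data_nodes = 0
    section = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing "# ..." comments
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            if section == "ndbd":
                data_nodes += 1               # one [ndbd] section per data node
            continue
        if "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key.lower() == "noofreplicas" and section in ("ndbd default", "ndbd"):
                replicas = int(value)
    return replicas, data_nodes


# Hypothetical sample; node IDs 10 and 11 are the ones seen in the cluster log.
SAMPLE = """\
[ndbd default]
NoOfReplicas=1 # Number of replicas should be 2

[ndbd]
NodeId=10

[ndbd]
NodeId=11
"""

replicas, data_nodes = summarize_cluster_config(SAMPLE)
if replicas and data_nodes:
    node_groups = data_nodes // replicas
    print(f"NoOfReplicas={replicas}, data nodes={data_nodes}, node groups={node_groups}")
    if replicas < 2:
        print("Each node group holds the only copy of its data.")
```

With NoOfReplicas=1, adding the second data node creates a second node group rather than a second copy of the data, so the loss of either node still takes its part of the data with it.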

/Gustaf
[20 Sep 2013 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[26 Sep 2013 14:22] Shraddha Karpe
MySQL Cluster config file

Attachment: config.ini (application/octet-stream, text), 2.28 KiB.

[26 Sep 2013 14:27] Shraddha Karpe
Hello Team, we are also facing the same issue:
Time: Wednesday 25 September 2013 - 10:12:10
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbaccMain.cpp
Error object: DBACC (Line: 1428) 0x00000002
Program: ndbd
Pid: 6424
Version: mysql-5.5.30 ndb-7.2.12
Trace: /usr/local/mysql/data/ndb_3_trace.log.3 [t1..t1]
***EOM***

Time: Wednesday 25 September 2013 - 10:42:04
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbaccMain.cpp
Error object: DBACC (Line: 1428) 0x00000002
Program: ndbd
Pid: 9445
Version: mysql-5.5.30 ndb-7.2.12
Trace: /usr/local/mysql/data/ndb_3_trace.log.4 [t1..t1]
***EOM***

Time: Thursday 26 September 2013 - 00:50:28
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbaccMain.cpp
Error object: DBACC (Line: 1428) 0x00000002
Program: ndbd
Pid: 11417
Version: mysql-5.5.30 ndb-7.2.12
Trace: /usr/local/mysql/data/ndb_3_trace.log.5 [t1..t1]
***EOM***

We have 2 VM servers running RHEL 5.9 64-bit, with 1 management node, 2 data nodes, and 2 SQL nodes.
I have attached the MySQL config file and I'll attach the trace file.
Please help.

Thanks,
Shraddha
[26 Sep 2013 14:29] Shraddha Karpe
Ndb error log file

Attachment: ndb_2_error.log (application/octet-stream, text), 2.02 KiB.

[26 Sep 2013 14:30] Shraddha Karpe
tracefile

Attachment: ndb_2_trace.log.3 (text/plain), 1.34 MiB.

[26 Sep 2013 14:34] Shraddha Karpe
MySQL trace file

Attachment: ndb_3_trace.log.4 (text/plain), 1.25 MiB.

[16 Sep 2014 11:08] Gustaf Thorslund
Bill Boatman,

As already said, this appears to be a configuration/user error. I got no feedback on the bug, so I'm now closing it as not a bug.

Shraddha Karpe,

You may have the same error code (2341). This is, however, a fairly common error code (used for internal errors). If you look closer at where the error happened, it is in a different source file (DbaccMain.cpp in DBACC, rather than Dbdict.cpp in DBDICT). If it is a bug, it is still something other than this "bug". Please open a new bug if this issue still occurs for you.
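
To compare crashes by location rather than by error code, something like the following Python sketch could be run over the error log entries. It is only an illustration: `crash_sites` is a hypothetical helper, and it assumes entries are laid out as in the excerpts in this report (an "Error data:" line, an "Error object:" line, and "***EOM***" as the separator).

```python
import re


def crash_sites(error_log_text):
    """Return one (source_file, block, line) tuple per error log entry."""
    sites = []
    for entry in error_log_text.split("***EOM***"):
        data = re.search(r"^Error data:\s*(\S+)", entry, re.MULTILINE)
        obj = re.search(r"^Error object:\s*(\w+) \(Line: (\d+)\)", entry, re.MULTILINE)
        if data and obj:
            sites.append((data.group(1), obj.group(1), int(obj.group(2))))
    return sites


# Two entries shortened from the excerpts in this report.
LOG = """\
Error: 2341
Error data: Dbdict.cpp
Error object: DBDICT (Line: 27149) 0x00000002
***EOM***
Error: 2341
Error data: DbaccMain.cpp
Error object: DBACC (Line: 1428) 0x00000002
***EOM***
"""

sites = crash_sites(LOG)
for source_file, block, line in sites:
    print(f"{block}: {source_file} line {line}")
if len(set(sites)) > 1:
    print("Same error code, different crash locations.")
```

Both entries carry error 2341, but the failed ndbrequire is in a different kernel block and source file in each case.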

Regards,
Gustaf