Bug #75530: Unknown Error
Submitted: 16 Jan 2015 16:28    Modified: 4 Mar 2015 7:49
Reporter: Joel Hanger    Email Updates:
Status: No Feedback    Impact on me: None
Category: MySQL Cluster: Cluster (NDB) storage engine    Severity: S2 (Serious)
Version: 5.6.17-ndb-7.3.5    OS: Linux (CentOS release 6.5 (Final))
Assigned to:    CPU Architecture: Any
Tags: 2341, cluster shutdown, failed, halt, internal, ndbrequire, programming, Temporary error, unknown

[16 Jan 2015 16:28] Joel Hanger
Description:
Cluster size:
  2 management nodes
  3 api nodes
  4 data node groups with 2 replicas
  
The 3 API nodes are replicating data into the cluster from 3 active, sharded production servers. This setup has been running for approximately 6 months without any issues beyond what was expected.
This morning the cluster shut down with the following error message:

Management server log: 
2015-01-16 12:35:39 [MgmtSrvr] ALERT    -- Node 10: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

Node 10 error log:
Time: Friday 16 January 2015 - 12:35:38
Status: Temporary error, restart node
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtcMain.cpp
Error object: DBTC (Line: 18397) 0x00000002
Program: ndbmtd
Pid: 9017 thr: 0
Version: mysql-5.6.17 ndb-7.3.5
Trace: /mnt/storage/mysql/data/ndb_10_trace.log.2 [t1..t4]
***EOM***

Yesterday some indexing changes were made to reduce DataMemory usage.
A planned change to MaxNoOfTriggers was to be applied today, to make room for converting indexes to hash indexes and further reduce DataMemory usage.
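For reference, MaxNoOfTriggers is set in the cluster's config.ini under [ndbd default]; the value below is illustrative only, not the value used on this cluster:

```
[ndbd default]
# MaxNoOfTriggers bounds the pool of triggers shared by
# ordered indexes, unique hash indexes, and backups.
# Example value only; size it to the planned index changes.
MaxNoOfTriggers=1400
```

Changing this parameter requires a rolling restart of the data nodes for it to take effect.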

After restarting the data nodes, the cluster was able to come back online.

How to repeat:
Not sure how this could be replicated.
[16 Jan 2015 23:27] Joel Hanger
After bringing the cluster up, 2 nodes (from separate node groups) lagged in starting until the cluster was online.

Nodes 8 & 9 were the ones lagging:
[ndbd(NDB)]     8 node(s)
id=3    @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 0, *)
id=4    @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 0)
id=5    @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 1)
id=6    @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 1)
id=7    @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 2)
id=8    @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 2)
id=9    @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 3)
id=10   @10.1.5.XXX  (mysql-5.6.17 ndb-7.3.5, Nodegroup: 3)

Node 7: Data usage is 59%(401062 32K pages of total 671328)
Node 8: Data usage is 66%(444062 32K pages of total 671328)

Node 9: Data usage is 68%(459314 32K pages of total 671328)
Node 10: Data usage is 60%(403285 32K pages of total 671328)

All the other node groups have exactly the same usage, however, e.g.:

Node 3: Data usage is 58%(393240 32K pages of total 671328)
Node 4: Data usage is 58%(393240 32K pages of total 671328)
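The per-node data usage figures above match the format of the management client's memory report; as a sketch, they can be gathered from the management node like so (assuming ndb_mgm can reach the management server):

```
# Ask all data nodes to report DataMemory/IndexMemory usage
ndb_mgm -e "all report memoryusage"
```

Each data node then prints a "Data usage" and an "Index usage" line in the form shown above.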
[4 Feb 2015 7:49] MySQL Verification Team
Thank you for the report.
I could not reproduce this issue at my end (tried with 7.3.7/8).
Could you please provide a repeatable test case to trigger this issue at our end? Also, see Bug #70217.

Thanks,
Umesh
[5 Mar 2015 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".