Bug #77864 NDB ArrayPool<T>::getPtr Error 2301 Assertion 7.4.7 ArrayPool.hpp
Submitted: 28 Jul 2015 19:02
Modified: 30 Jul 2015 10:07
Reporter: Carl Krumins
Status: Duplicate
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: 7.4.7
OS: Linux
Assigned to:
CPU Architecture: Any
Tags: NDB ArrayPool<T>::getPtr Error 2301 Assertion 7.4.7 ArrayPool.hpp

[28 Jul 2015 19:02] Carl Krumins
Description:
Hi,

I recently reported bug #77701 for the same issue, but that report was against version 7.3.9.
This report is against version 7.4.7.
Each node was started with --initial and the data was then loaded in.

It is the same error as in the previous bug, but on the new version and at a slightly different line number.

Time: Tuesday 28 July 2015 - 12:18:19
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please report a bug)
Error: 2301
Error data: ArrayPool<T>::getPtr
Error object: /export/home/pb2/build/sb_0-15767947-1435843549.5/mysql-cluster-gpl-7.4.7/storage/ndb/src/kernel/vm/ArrayPool.hpp line: 515
Program: ndbmtd
Pid: 1155700 thr: 10
Version: mysql-5.6.25 ndb-7.4.7
Trace: /data/mysqlcluster//ndb_2_trace.log.7 [t1..t17]
***EOM***
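
When comparing entries like the one above across nodes (or against bug #77701), it can help to pull the fields into a dictionary. The following is only a minimal sketch: it assumes the "Key: value" layout shown above, and the names parse_error_entry and source_location are made up for illustration, not part of any MySQL tool.

```python
import re

# The error-log entry quoted in this report, used as sample input.
ENTRY = """\
Time: Tuesday 28 July 2015 - 12:18:19
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please report a bug)
Error: 2301
Error data: ArrayPool<T>::getPtr
Error object: /export/home/pb2/build/sb_0-15767947-1435843549.5/mysql-cluster-gpl-7.4.7/storage/ndb/src/kernel/vm/ArrayPool.hpp line: 515
Program: ndbmtd
Pid: 1155700 thr: 10
Version: mysql-5.6.25 ndb-7.4.7
Trace: /data/mysqlcluster//ndb_2_trace.log.7 [t1..t17]
***EOM***
"""


def parse_error_entry(text):
    """Split each 'Key: value' line at the first ': ' (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if line.strip() == "***EOM***":  # end-of-message marker
            break
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def source_location(fields):
    """Return (file name, line number) from the 'Error object' field, or None."""
    match = re.search(r"([^/\s]+) line: (\d+)$", fields.get("Error object", ""))
    return (match.group(1), int(match.group(2))) if match else None


fields = parse_error_entry(ENTRY)
print(fields["Error"], fields["Error data"], source_location(fields))
```

Comparing the source_location result between two reports makes the "slightly different line number" easy to spot.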

This crashes every node group, and with them the whole cluster, because all nodes hit the same assertion error.

How to repeat:
Not sure - happens randomly (every few hours)

Suggested fix:
not sure

Will upload ndb_error_reporter in a minute.
[28 Jul 2015 19:08] Carl Krumins
Uploaded mysql-bug-data-77864.tar.bz2 to sftp.oracle.com as instructed.
[29 Jul 2015 1:47] Carl Krumins
There are 4 nodes in 2 node groups.
This happens when all 4 nodes in the 2 node groups are up.

The cluster crashes as above in any of these cases:
All 4 of the 4 nodes are running
3 of the 4 nodes are running
2 of the 4 nodes are running, one node from each node group
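
For context, each combination listed above still leaves at least one running node in every node group, which is the condition for an NDB cluster to keep a complete copy of its data. So the crash is not explained by a whole node group being down. A small sketch of that check follows; the node IDs and their assignment to groups are assumptions for illustration, not taken from the cluster configuration.

```python
def cluster_viable(node_groups, running):
    """True if every node group still has at least one running node."""
    return all(any(node in running for node in group) for group in node_groups)


# Assumed layout: two node groups of two data nodes each (IDs are hypothetical).
NODE_GROUPS = [{2, 3}, {4, 5}]

scenarios = {
    "all four nodes running": {2, 3, 4, 5},
    "three nodes running": {2, 3, 4},
    "one node from each node group": {2, 4},
    "one whole node group down": {2, 3},
}

for name, running in scenarios.items():
    print(f"{name}: viable={cluster_viable(NODE_GROUPS, running)}")
```

Only the last scenario, which is not one of the reported cases, fails the check.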
[29 Jul 2015 14:38] Carl Krumins
Extract from a node .out file:

2015-07-18 12:01:20 [ndbd] INFO     -- Start phase 9 completed
2015-07-18 12:01:20 [ndbd] INFO     -- Phase 9 enabled APIs to start connecting
2015-07-18 12:01:20 [ndbd] INFO     -- Grant nodes to start phase: 10, nodes: 000000000000003c
2015-07-18 12:01:20 [ndbd] INFO     -- Start phase 101 completed
2015-07-18 12:01:20 [ndbd] INFO     -- Phase 101 was used by SUMA to take over responsibility for sending some of the asynchronous change events
2015-07-18 12:01:20 [ndbd] INFO     -- Grant nodes to start phase: 102, nodes: 000000000000003c
2015-07-18 12:01:20 [ndbd] INFO     -- Node started
2015-07-18 12:01:20 [ndbd] INFO     -- Started arbitrator node 1 [ticket=e52c0001e3e746f6]
2015-07-18 12:01:21 [ndbd] INFO     -- Allocate event buffering page chunk in SUMA, 16 pages, first page ref = 1028003
2015-07-18 12:01:51 [ndbd] INFO     -- findNeighbours from: 5070 old (left: 3 right: 4) new (3 5)
2015-07-18 12:01:51 [ndbd] ALERT    -- Arbitration check won - node group majority
2015-07-18 12:01:51 [ndbd] INFO     -- President restarts arbitration thread [state=6]
2015-07-18 12:01:51 [ndbd] INFO     -- NR Status: node=4,OLD=Initial state,NEW=Node failed, fail handling ongoing
2015-07-18 12:01:51 [ndbd] INFO     -- DBTC instance 1: Starting take over of node 4
2015-07-18 12:01:51 [ndbd] INFO     -- DBTC instance 3: Inserting failed node 4 into takeover queue, length now=1
execGCP_NOMORETRANS(32615672/2) c_ongoing_take_over_cnt -> seize
2015-07-18 12:01:51 [ndbd] INFO     -- execGCP_NOMORETRANS(32615672/2) c_ongoing_take_over_cnt -> seize
DBTC instance 4: Inserting failed node 4 into takeover queue, length now=1
execGCP_NOMORETRANS(32615672/2) c_ongoing_take_over_cnt -> seize
2015-07-18 12:01:51 [ndbd] INFO     -- DBTC instance 2: Inserting failed node 4 into takeover queue, length now=1
NOT returning gcpTcfinished due to nfhandling 32615672/2
2015-07-18 12:01:51 [ndbd] INFO     -- DBTC instance 1: Completed take over of DBTC instance 0 in failed node 4, continuing with the next instance
2015-07-18 12:01:52 [ndbd] INFO     -- Error handler restarting system
2015-07-18 12:01:52 [ndbd] INFO     -- Error handler shutdown completed - exiting
2015-07-18 12:02:15 [ndbd] ALERT    -- Node 2: Forced node shutdown completed. Caused by error 2301: 'Assertion(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2015-07-18 12:02:15 [ndbd] INFO     -- Ndb has terminated (pid 910636) restarting
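
The "nodes: 000000000000003c" value in the "Grant nodes to start phase" lines above can be read as a set of node IDs. This sketch assumes the field is a hexadecimal bitmask in which bit i stands for node ID i; that reading matches the node IDs 2, 3, 4 and 5 that appear elsewhere in this extract, but it is an assumption rather than something stated in the log. The function name is hypothetical.

```python
def nodes_from_bitmask(hex_mask):
    """Return the node IDs whose bits are set in a hexadecimal node bitmask.

    Assumption: bit i of the mask corresponds to node ID i.
    """
    value = int(hex_mask, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


print(nodes_from_bitmask("000000000000003c"))
```

Under that assumption, 0x3c is binary 111100, so bits 2 through 5 are set.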
[30 Jul 2015 10:07] MySQL Verification Team
Hello Carl Krumins,

Please do not submit the same bug more than once. An existing bug report Bug #77701 already describes this very problem. Even if you feel that your issue is somewhat different, the resolution is likely to be the same. Because of this, we hope you add your comments/logs to the original bug instead.

Thank you for your interest in MySQL.

Thanks,
Umesh