Bug #66495 MySQL Cluster node failure with no activity
Submitted: 22 Aug 2012 9:31 Modified: 22 Jul 2016 12:53
Reporter: Joffrey MICHAIE
Status: Duplicate
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: 7.2.6 OS: Linux (SLES12 (using SLES11 package))
Assigned to: MySQL Verification Team CPU Architecture: Any

[22 Aug 2012 9:31] Joffrey MICHAIE
Description:
Hi,

A data node failed while the cluster was running with no activity:

2012-08-21 18:35:22 [ndbd] INFO     -- findNeighbours from: 4861 old (left: 3 right: 3) new (65535 65535)
REMOVING lcp: 98 from table: 2 frag: 0 node: 3
REMOVING lcp: 98 from table: 2 frag: 1 node: 3
REMOVING lcp: 98 from table: 2 frag: 2 node: 3
REMOVING lcp: 98 from table: 2 frag: 3 node: 3
REMOVING lcp: 98 from table: 2 frag: 4 node: 3
REMOVING lcp: 98 from table: 2 frag: 5 node: 3
REMOVING lcp: 98 from table: 2 frag: 6 node: 3
start_resend(0, empty bucket (92682/4 92682/3) -> activeREMOVING lcp: 98 from table: 2 frag: 7 node: 3

Finished with handling node-failure
execGCP_NOMORETRANS(92682/4) c_ongoing_take_over_cnt -> seize
2012-08-21 18:35:22 [ndbd] INFO     -- Illegal signal received (GSN 36 not added)
2012-08-21 18:35:22 [ndbd] INFO     -- Illegal signal received (GSN 36 not added)
2012-08-21 18:35:22 [ndbd] INFO     -- Error handler shutting down system
2012-08-21 18:35:22 [ndbd] INFO     -- Error handler shutdown completed - exiting
2012-08-21 18:35:23 [ndbd] ALERT    -- Node 4: Forced node shutdown completed. Caused by error 2301: 'Assertion(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

Error message:

Time: Tuesday 21 August 2012 - 18:35:21
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please report a bug)
Error: 2301
Error data: Illegal signal received (GSN 36 not added)
Error object: Illegal signal received (GSN 36 not added)
Program: ndbmtd
Pid: 1710 thr: 0
Version: mysql-5.5.22 ndb-7.2.6
Trace: /usr/local/mysql/data/ndb_3_trace.log.2 [t1..t8]
***EOM***

Trace:

NDBFS   001546 001379 
NDBFS   001518 001520 001379 001537 
QMGR    000121 000145 002626 002668 002677 

--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 3, r.sigId: 16531317 gsn: 36 "TCRELEASEREQ" prio: 1
s.bn: 32774 "API", s.proc: 21, s.sigId: 0 length: 3 trace: 1 #sec: 0 fragInf: 0
 H'00000093 H'80060015 H'00000004
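
The second data word of the signal can be cross-checked against the header. The sketch below assumes the usual NDB convention that a block reference stores the block number in the upper 16 bits and the node id in the lower 16 bits. The names `BlockRef` and `decodeBlockRef` are hypothetical helpers for illustration, not part of the NDB source.

```cpp
#include <cstdint>

// Hypothetical helper type: the two halves of a 32-bit block reference.
struct BlockRef {
    std::uint32_t block;  // upper 16 bits (assumed layout)
    std::uint32_t node;   // lower 16 bits (assumed layout)
};

// Split a block reference into block number and node id.
constexpr BlockRef decodeBlockRef(std::uint32_t ref) {
    return BlockRef{ref >> 16, ref & 0xFFFFu};
}

// Second data word of the TCRELEASEREQ signal in the trace (H'80060015).
constexpr BlockRef sender = decodeBlockRef(0x80060015u);
static_assert(sender.block == 32774, "agrees with s.bn: 32774 (API)");
static_assert(sender.node == 21, "agrees with s.proc: 21");
```

Under that assumption the word decodes to block 32774 on node 21, which agrees with the `s.bn` and `s.proc` fields of the header, so the signal appears to carry the reference of the API node that sent it.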

Source: ./storage/ndb/src/kernel/blocks/qmgr/QmgrMain.cpp
2667   if (cactivateApiCheck != 0) {
2668     jam();
2669     if (clatestTransactionCheck == 0) {
2670       //-------------------------------------------------------------
2671       // Initialise the Transaction check timer.
2672       //-------------------------------------------------------------
2673       clatestTransactionCheck = TcurrentTime;
2674     }//if
2675     int counter = 0;
2676     while (TcurrentTime > ((NDB_TICKS)10 + clatestTransactionCheck)) {
2677       jam();
2678       clatestTransactionCheck += (NDB_TICKS)10;
2679       sendSignal(DBTC_REF, GSN_TIME_SIGNAL, signal, 1, JBB);
2680       sendSignal(DBLQH_REF, GSN_TIME_SIGNAL, signal, 1, JBB);
2681       counter++;
2682       if (counter > 1) {
2683         jam();
2684         break;
2685       } else {
2686         ;
2687       }//if
2688     }//while
2689   }//if
2690 
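
The excerpt above is the timer catch-up loop that the QMGR jam entries 002668 and 002677 point into. As a rough, standalone model of its control flow (not the NDB code itself), the sketch below replaces the two `sendSignal` calls with a counter. `TimerState` and `catchUp` are hypothetical names, and the time values are plain integers in whatever unit `NDB_TICKS` uses.

```cpp
#include <cstdint>

// Hypothetical result type for the model.
struct TimerState {
    std::uint64_t latestCheck;  // plays the role of clatestTransactionCheck
    int signalsSent;            // rounds in which TIME_SIGNAL would be sent
};

// Model of one pass through the excerpt: advance the check timer in steps
// of 10 and stop after at most two rounds, as the counter in the loop does.
constexpr TimerState catchUp(std::uint64_t now, std::uint64_t latestCheck) {
    if (latestCheck == 0) {
        latestCheck = now;  // first call: initialise the timer
    }
    int sent = 0;
    while (now > latestCheck + 10) {
        latestCheck += 10;
        ++sent;             // stands in for the two sendSignal calls
        if (sent > 1) {
            break;          // never more than two rounds per pass
        }
    }
    return TimerState{latestCheck, sent};
}
```

For example, `catchUp(100, 50)` stops after two rounds with the timer at 70, so a node that has fallen far behind catches up gradually over several passes rather than in one burst.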

I do not know how to reproduce this or how to investigate it further.

I am not using 7.2.7 because of an online backup problem.

How to repeat:
I do not know how to reproduce it. It happened 20 minutes after we stopped a sysbench mixed-oltp benchmark.
[22 Aug 2012 9:34] Joffrey MICHAIE
NDB error report uploaded to the Oracle FTP server:
mysql-bug-66495-ndb_error_report_20120822110220.tar.bz2
[22 Jul 2016 12:53] MySQL Verification Team
Duplicate of 14609774.
Fixed in 7.0.37, 7.1.26, and 7.2.10.