Bug #60382 | all data node of mysql cluster was downed | ||
---|---|---|---|
Submitted: | 8 Mar 2011 1:28 | Modified: | 21 Mar 2016 21:55 |
Reporter: | ws lee | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | mysql-5.1.32 ndb-6.3.24 | OS: | Solaris (10) |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[8 Mar 2011 1:28]
ws lee
[8 Mar 2011 1:34]
ws lee
ndb_4_error.log

Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 11 received; Segmentation Fault
Error object: main.cpp
Program: /usr/local/mysql5.1.32-ndb6.3.24/bin/ndbd
Pid: 3682
Trace: /var/lib/mysql-cluster5.1.32-ndb6.3.24/ndb_4_trace.log.1
Version: mysql-5.1.32 ndb-6.3.24-GA
***EOM***

ndb_5_error.log

Status: Temporary error, restart node
Message: Error OS signal received (Internal error, programming error or missing error message, please report a bug)
Error: 6000
Error data: Signal 11 received; Segmentation Fault
Error object: main.cpp
Program: /usr/local/mysql5.1.32-ndb6.3.24/bin/ndbd
Pid: 7548
Trace: /var/lib/mysql-cluster5.1.32-ndb6.3.24/ndb_5_trace.log.3
Version: mysql-5.1.32 ndb-6.3.24-GA
***EOM***
[21 Mar 2016 21:54]
MySQL Verification Team
The trace file shows:

....
DBLQH 002693 DBTC 004152 DBTUP 010029
--------------- Signal ----------------
r.bn: 263 "API", r.proc: 4, r.sigId: 1997609488 gsn: 41 "Unknown" prio: 1
s.bn: 32774 "API", s.proc: 10, s.sigId: 0 length: 12 trace: 1 #sec: 0 fragInf: 0
H'00000037 H'00000000 H'00000000 H'8006000a H'00000000 H'ffffffff H'ffffffff H'00000000 H'00000100 H'00000000 H'00000000 H'00000000
.....

Here you can see that the issue relates to global signal number (gsn) 41, which is reported as being of unknown type. This means either that the gsn does not have an associated function (if it was a data node) or, since it was an API node, more likely that the signal was corrupted. Both situations are fixed by upgrading to at least a 7.0 version, where gsn 41 is implemented (it does not exist before that) and checksumming can be used to see whether corruption is happening on the network.
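To check other trace files for the same symptom, the signal headers can be scanned for entries whose gsn name is "Unknown". The following is a minimal Python sketch, not an official tool: the function name `find_unknown_signals` and the regular expression are illustrative assumptions based only on the signal header layout quoted above, and other trace formats may need a different pattern.

```python
import re

# Pattern modelled on the signal header quoted in this report (an assumption,
# not a documented format): receiver block, receiver node, signal id, gsn.
SIGNAL_RE = re.compile(
    r'r\.bn:\s*(?P<r_bn>\d+)\s+"(?P<r_name>[^"]*)",\s*'
    r'r\.proc:\s*(?P<r_proc>\d+),\s*'
    r'r\.sigId:\s*(?P<sig_id>\d+)\s+'
    r'gsn:\s*(?P<gsn>\d+)\s+"(?P<gsn_name>[^"]*)"'
)


def find_unknown_signals(trace_text):
    """Return one dict per signal header whose gsn name is "Unknown"."""
    hits = []
    for m in SIGNAL_RE.finditer(trace_text):
        if m.group("gsn_name") == "Unknown":
            hits.append({
                "gsn": int(m.group("gsn")),
                "receiver_block": m.group("r_name"),
                "receiver_node": int(m.group("r_proc")),
                "sig_id": int(m.group("sig_id")),
            })
    return hits


# Signal header copied from the trace excerpt in this report.
SAMPLE = (
    'r.bn: 263 "API", r.proc: 4, r.sigId: 1997609488 gsn: 41 "Unknown" prio: 1\n'
    's.bn: 32774 "API", s.proc: 10, s.sigId: 0 length: 12 trace: 1 #sec: 0 fragInf: 0\n'
)

if __name__ == "__main__":
    for hit in find_unknown_signals(SAMPLE):
        print(hit)
```

In practice the text would be read from a trace file such as the ndb_4_trace.log.1 named in the error log; an unknown gsn found this way is only a hint and does not by itself distinguish a missing signal handler from network corruption.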