Bug #95473 data node and sql node crash (failed ndbrequire)
Submitted: 22 May 10:40
Modified: 21 Jun 8:25
Reporter: Hendrik Woltersdorf
Status: Verified
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 7.4.23
OS: CentOS (6.4)
Assigned to:
CPU Architecture: x86
Tags: ndbcluster ndbrequire

[22 May 10:40] Hendrik Woltersdorf
Description:
We saw a data node crash, followed by an SQL node crash.
data node:
Message: Internal program error (failed ndbrequire) (Internal error, programming error or missing error message, please report a bug)
Error: 2341
Error data: DbtcMain.cpp
Error object: DBTC (Line: 13503) 0x00000002
Program: ndbmtd
Pid: 8521 thr: 8

sql node:
...
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] /opt/mysql/bin/mysqld: Got temporary error 270 'Transaction aborted due to node shutdown' from NDBCLUSTER
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] /opt/mysql/bin/mysqld: Sort aborted: Got temporary error 270 'Transaction aborted due to node shutdown' from NDBCLUSTER
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './trax/EXP_EXECUTION'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:27 32414 [ERROR] Got error 4010 when reading table './hacom/SERVER_REPOSITORY'
2019-05-20 08:45:28 32414 [Note] NDB Schema dist: Data node: 11 failed, subscriber bitmask 000000000
06:45:29 UTC - mysqld got signal 11 ;
...

How to repeat:
Earlier that day we had problems with blocked connections because an app ran "alter table" during startup while it and other instances were querying the same table. I had to kill the connections.
I'm not sure whether this caused the crash, but it was the only abnormal event that day.
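
For anyone triaging a longer copy of the sql node log, the repeated "Got error N when reading table" lines above can be condensed into counts per error code and table. The following is only a minimal sketch: the function name tally_read_errors and the regular expression are illustrative assumptions based on the line format in the excerpt, not part of any MySQL tool.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in this report:
# 2019-05-20 08:45:26 32414 [ERROR] Got error 270 when reading table './hacom/SERVER_CONFIG'
ERROR_LINE = re.compile(r"\[ERROR\] Got error (\d+) when reading table '([^']+)'")


def tally_read_errors(lines):
    """Count (error code, table) pairs found in mysqld error-log lines.

    Hypothetical helper; lines that do not match the assumed format
    (for example the "Got temporary error" lines) are ignored.
    """
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts
```

Calling tally_read_errors(open(path)) on the error log, where path is wherever the mysqld error log lives on the sql node, gives a Counter whose most_common() output shows which tables were hit by error 270 and error 4010 and how often.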
[28 May 5:29] Bogdan Kecman
Hi,

I can't say anything from this log snippet. I need the full ndb_error_reporter logs in order to see what happened, what crashed, where and why.

Thanks
Bogdan
[28 May 5:53] Hendrik Woltersdorf
I have now uploaded the files from ndb_error_reporter in 3 parts, but removed old entries (e.g. before 2019-05-01) from the large log files to make uploading here possible.
Our corporate firewall blocks access to sftp.oracle.com:2021.
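
For reference, trimming old entries from a large log as described above could be done along these lines. This is a hedged sketch, not the procedure actually used for the upload: the function name keep_recent is hypothetical, and it assumes each entry starts with a YYYY-MM-DD date, as the mysqld lines in this report do. Lines without a leading date (such as stack trace lines) are treated as belonging to the preceding dated entry.

```python
from datetime import date, datetime


def keep_recent(lines, cutoff):
    """Yield log lines dated on or after `cutoff` (a datetime.date).

    Assumption: dated lines begin with YYYY-MM-DD. Undated lines are
    kept or dropped together with the last dated line seen before them;
    undated lines before the first dated line are dropped.
    """
    keep = False
    for line in lines:
        try:
            stamp = datetime.strptime(line[:10], "%Y-%m-%d").date()
        except ValueError:
            pass  # no leading date: inherit the previous decision
        else:
            keep = stamp >= cutoff
        if keep:
            yield line
```

For example, writing the output of keep_recent(src, date(2019, 5, 1)) to a new file keeps everything from 2019-05-01 onward and leaves the original log untouched.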
[28 May 6:11] Bogdan Kecman
Hi,

Thanks, this should be enough.

all best
Bogdan