Bug #75527 Forced node shutdown completed...Initiated by signal 11.
Submitted: 16 Jan 2015 11:49 Modified: 22 Jul 2016 13:12
Reporter: Marek Knappe Email Updates:
Status: Can't repeat Impact on me:
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S1 (Critical)
Version:mysql-5.6.21 ndb-7.3.7 OS:Linux (CentOS 5.8)
Assigned to: Bogdan Kecman CPU Architecture:Any
Tags: ndb;signal 11;crash;mysql;node shutdown

[16 Jan 2015 11:49] Marek Knappe
I have faced cluster shutdown after executing following queries:

mysql> select * from ec_dump2 into outfile '/data/backup/tmp/ec_dump2.sql';
Query OK, 4248932 rows affected (16 min 56.58 sec)
mysql> truncate ec_dump2;
ERROR 1296 (HY000): Got error 157 'Unknown error code' from NDBCLUSTER
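
(For context: MySQL error 157 is generally a generic wrapper around an underlying NDB error, often a temporary one such as a node failure. Assuming a live session, the underlying NDB error can usually be surfaced immediately after the failing statement, e.g.:)

```sql
-- Hypothetical session, run right after the failing statement:
TRUNCATE ec_dump2;
-- SHOW WARNINGS typically exposes the underlying NDB error code
-- hidden behind the generic "Got error 157 ... from NDBCLUSTER" message.
SHOW WARNINGS;
```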

From log I could observe this :
2015-01-16 19:20:21 [MgmtSrvr] INFO     -- Node 5: Event buffer status: used=77KB(0%) alloc=16040KB(0%) max=0B apply_epoch=34523303/4 latest_epoch=34523303/18
2015-01-16 19:27:38 [MgmtSrvr] ALERT    -- Node 3: Forced node shutdown completed. Occured during startphase 0. Initiated by signal 11.
2015-01-16 19:27:38 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
2015-01-16 19:27:38 [MgmtSrvr] ALERT    -- Node 2: Forced node shutdown completed. Occured during startphase 0. Initiated by signal 11.
2015-01-16 19:27:38 [MgmtSrvr] ALERT    -- Node 1: Node 2 Disconnected

Here are the necessary logs from both nodes:

Please note the time difference between both data nodes:
[root@yt-db001 ~]# date
Fri Jan 16 21:50:02 EST 2015
[root@vt-db002 ~]# date
Fri Jan 16 21:56:53 EST 2015

[root@vm-ytlb001 mysql-cluster]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
[ndbd(NDB)]     2 node(s)
id=2    @  (mysql-5.6.21 ndb-7.3.7, Nodegroup: 0, *)
id=3    @  (mysql-5.6.21 ndb-7.3.7, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @  (mysql-5.6.21 ndb-7.3.7)

[mysqld(API)]   2 node(s)
id=4    @  (mysql-5.6.21 ndb-7.3.7)
id=5    @  (mysql-5.6.21 ndb-7.3.7)

Please note that after the cluster shutdown I started the cluster as the mysql user, which you can probably also see in the logs. I then had to switch to root, as it had previously been started as root.

How to repeat:
Can't test it as this cluster is in "production".
[16 Jan 2015 12:08] Marek Knappe
I would also like to add that on this cluster we are unable to perform any backup, which is also critical for us:

Connected to Management Server at: localhost:1186
Waiting for completed, this may take several minutes
Node 2: Backup 11 started from 1 has been aborted. Error: 4237
Backup failed
*  3001: Could not start backup
*        Too many triggers: Permanent error: Application error
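
(For context: NDB error 4237 "Too many triggers" usually indicates the data nodes have exhausted their pool of trigger objects, which backups also consume; the pool is sized by the MaxNoOfTriggers parameter in config.ini. A sketch, with illustrative values only, not a recommendation for this particular cluster:)

```ini
# config.ini fragment (illustrative values only).
# MaxNoOfTriggers sizes the trigger-object pool used by unique
# indexes, foreign-key handling and backups; the default is 768.
[ndbd default]
MaxNoOfTriggers=1536
```

Changing this parameter requires a rolling restart of the data nodes to take effect.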
[17 Mar 2015 11:19] Marek Knappe
Hi Umesh,
Any progress on this?
[1 Jun 2015 8:51] Umesh Shastry
Sorry, I'm not able to reproduce this issue at my end.
I'll check internally whether this has been fixed recently in the 7.3.9 builds, and I will update you then.
[21 Jul 2015 12:40] Marek Knappe
Thanks for the info. Looking forward to more news.
[22 Jul 2016 13:12] Bogdan Kecman

Reviewing this bug after some time has passed.

I cannot reproduce this on 
 - 7.0.37
 - 7.1.34
 - 7.3.13
 - 7.4.11

Have you perhaps upgraded your system recently, and did you encounter this same bug on the new release? I'm unable to grab your log from Dropbox, so if you did encounter this same error on one of the releases I can't reproduce it on, please upload the full logs again.

kind regards
Bogdan Kecman