Bug #12087 Cluster crashes
Submitted: 21 Jul 2005 19:14 Modified: 22 Jul 2005 20:49
Reporter: Anton Aleksandrov
Status: Won't fix
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S1 (Critical)
Version: 5.0.9 OS: Linux (Linux Fedora 3 64-bit edition)
Assigned to: CPU Architecture: Any

[21 Jul 2005 19:14] Anton Aleksandrov
Description:
We have 8 data nodes (AMD 3000 64-bit, 2 GB RAM, 1 Gbit LAN), 3 API nodes (AMD 2800 64-bit, 100 Mbit LAN), and 1 management node (AMD 3000 64-bit, 1 Gbit LAN). After heavy testing, the following lines appear in the cluster log:

2005-07-21 17:15:36 [MgmSrvr] WARNING  -- Node 9: Transporter to node 14 reported error 0x16
2005-07-21 17:15:36 [MgmSrvr] WARNING  -- Node 7: Transporter to node 14 reported error 0x16
2005-07-21 17:15:37 [MgmSrvr] WARNING  -- Node 7: Transporter to node 14 reported error 0x16 - Repeated 8 times
2005-07-21 17:15:37 [MgmSrvr] WARNING  -- Node 7: Transporter to node 12 reported error 0x16
2005-07-21 17:15:37 [MgmSrvr] WARNING  -- Node 7: Transporter to node 12 reported error 0x16 - Repeated 3 times
2005-07-21 17:15:37 [MgmSrvr] WARNING  -- Node 7: Transporter to node 13 reported error 0x16
2005-07-21 17:15:37 [MgmSrvr] WARNING  -- Node 7: Transporter to node 13 reported error 0x16 - Repeated 3 times
2005-07-21 17:15:52 [MgmSrvr] WARNING  -- Node 7: Transporter to node 14 reported error 0x16
2005-07-21 17:16:09 [MgmSrvr] WARNING  -- Node 7: Transporter to node 12 reported error 0x16
2005-07-21 17:16:10 [MgmSrvr] WARNING  -- Node 7: Transporter to node 12 reported error 0x16

(There are many more such lines than I show here.) After that, the following appears for each node:

2005-07-21 20:45:53 [MgmSrvr] INFO     -- Node 10: Possible bug in Dbdih::execBLOCK_COMMIT_ORD c_blockCommit = 1 c_blockCommitNo = 7 sig->failNo =

After this, all data nodes disconnect and go down.
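
Since the excerpt is only part of the log, the transporter warnings can be tallied per node pair to see which links report error 0x16 most often. The following is a minimal sketch, not a MySQL tool: the regular expression is inferred only from the lines quoted above, the function name is hypothetical, and it assumes a "Repeated N times" line stands for N occurrences.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt in this report; other MgmSrvr
# message layouts are simply skipped.
LINE_RE = re.compile(
    r"Node (\d+): Transporter to node (\d+) reported error (0x[0-9a-fA-F]+)"
    r"(?: - Repeated (\d+) times)?"
)


def tally_transporter_errors(lines):
    """Return a Counter keyed by (reporting_node, remote_node, error_code)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a transporter warning
        reporter, remote, code, repeated = match.groups()
        # Assumption: a "Repeated N times" line counts as N occurrences,
        # a plain line as one.
        counts[(int(reporter), int(remote), code)] += int(repeated) if repeated else 1
    return counts


# Four lines copied from the excerpt in this report.
SAMPLE = [
    "2005-07-21 17:15:36 [MgmSrvr] WARNING  -- Node 9: Transporter to node 14 reported error 0x16",
    "2005-07-21 17:15:36 [MgmSrvr] WARNING  -- Node 7: Transporter to node 14 reported error 0x16",
    "2005-07-21 17:15:37 [MgmSrvr] WARNING  -- Node 7: Transporter to node 14 reported error 0x16 - Repeated 8 times",
    "2005-07-21 17:15:37 [MgmSrvr] WARNING  -- Node 7: Transporter to node 12 reported error 0x16",
]

if __name__ == "__main__":
    for key, count in tally_transporter_errors(SAMPLE).most_common():
        print(key, count)
```

To run it against a full log, pass an open file object in place of SAMPLE; the function only needs an iterable of lines.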

Here is my config.ini.

[NDBD DEFAULT]
NoOfReplicas=2
DataMemory=1850MB
IndexMemory=150Mb
MaxNoOfAttributes=15000
MaxNoOfOrderedIndexes=15000
MaxNoOfUniqueHashIndexes=10000
MaxNoOfConcurrentOperations=100000
RedoBuffer=64Mb
TimeBetweenLocalCheckpoints=13
##MaxNoOfConcurrentTransactions=10000
##MaxNoOfLocalOperations=2500000
NoOfFragmentLogFiles=150
[MYSQLD DEFAULT]
[NDB_MGMD DEFAULT]
DataDir=/var/lib/mysql-cluster
[TCP DEFAULT]
# Management Server
[NDB_MGMD]
Id=1
HostName=192.168.1.10           # the IP of THIS SERVER
ArbitrationRank=1
[NDB_MGMD]
Id=2
HostName=192.168.1.30
ArbitrationRank=2
[NDB_MGMD]
Id=3
HostName=192.168.1.60
ArbitrationRank=2

# Storage Engines

[NDBD]
HostName=192.168.1.10
DataDir= /var/lib/mysql-cluster
[NDBD]
HostName=192.168.1.20
DataDir=/var/lib/mysql-cluster
[NDBD]
HostName=192.168.1.30
DataDir=/var/lib/mysql-cluster
[NDBD]
HostName=192.168.1.40
DataDir=/var/lib/mysql-cluster
[NDBD]
HostName=192.168.1.50
DataDir=/var/lib/mysql-cluster
[NDBD]
HostName=192.168.1.60
DataDir=/var/lib/mysql-cluster
[NDBD]
HostName=192.168.1.70
DataDir=/var/lib/mysql-cluster
[NDBD]
HostName=192.168.1.80
DataDir=/var/lib/mysql-cluster

# 2 MySQL Clients
# I personally leave this blank to allow rapid changes of the mysql clients;
# you can enter the hostnames of the above two servers here. I suggest you don't.
[MYSQLD]
HostName=192.168.1.241
[MYSQLD]
HostName=192.168.1.242
[MYSQLD]
HostName=192.168.1.243

[MYSQLD]
HostName=192.168.1.10
ArbitrationRank=0
[MYSQLD]
HostName=192.168.1.20
ArbitrationRank=0
[MYSQLD]
HostName=192.168.1.30
ArbitrationRank=0
[MYSQLD]
HostName=192.168.1.40
ArbitrationRank=0
[MYSQLD]
HostName=192.168.1.50
ArbitrationRank=0
[MYSQLD]
HostName=192.168.1.60
ArbitrationRank=0
[MYSQLD]
HostName=192.168.1.70
ArbitrationRank=0
[MYSQLD]
HostName=192.168.1.80
ArbitrationRank=0
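
As a rough sanity check, the memory sizes in the [NDBD DEFAULT] section above can be added up and compared with the 2 GB of RAM per data node given in the description. This is a minimal sketch under stated assumptions: the helper names are hypothetical, every size suffix is read as mebibytes regardless of case, and 2 GB is taken as 2048 MiB. A data node needs memory beyond these three parameters, so the sum is only a lower bound.

```python
import re

# Values copied from the [NDBD DEFAULT] section of the config.ini above.
NDBD_DEFAULT = {
    "DataMemory": "1850MB",
    "IndexMemory": "150Mb",
    "RedoBuffer": "64Mb",
}

# Assumption: 2 GB of RAM per data node means 2048 MiB.
PHYSICAL_RAM_MIB = 2 * 1024


def parse_mib(value):
    """Parse strings such as '1850MB' or '150Mb' as a number of MiB.

    Assumption: any M/MB suffix, in any case, means mebibytes. This says
    nothing about which spellings ndbd itself accepts.
    """
    match = re.fullmatch(r"(\d+)\s*[Mm][Bb]?", value.strip())
    if match is None:
        raise ValueError(f"unsupported size string: {value!r}")
    return int(match.group(1))


def configured_total_mib(settings):
    """Sum the configured sizes, in MiB."""
    return sum(parse_mib(v) for v in settings.values())


if __name__ == "__main__":
    total = configured_total_mib(NDBD_DEFAULT)
    print(f"configured: {total} MiB, physical: {PHYSICAL_RAM_MIB} MiB")
    print(f"headroom:   {PHYSICAL_RAM_MIB - total} MiB")
```

With the values from this report the three settings add up to 2064 MiB, slightly more than 2048 MiB. Whether this is related to the crash is not established in this report.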

How to repeat:
I am sorry, I don't know... 

Suggested fix:
Have no clue why this happens.
[22 Jul 2005 1:28] Stewart Smith
Reassign to correct category.
[22 Jul 2005 7:45] Anton Aleksandrov
Error log from Node 10, which caused everything to crash.

Attachment: ndb_10_error.log (application/octet-stream, text), 1.24 KiB.

[22 Jul 2005 7:46] Anton Aleksandrov
Trace log for Node 10, which caused everything to crash

Attachment: ndb_10_trace.log.zip (application/zip, text), 57.41 KiB.

[13 Mar 2014 13:34] Omer Barnir
This bug is not scheduled to be fixed at this time.