Bug #51693 Starting a node during a partitioned start causes node failure
Submitted: 3 Mar 2010 14:34
Modified: 24 Mar 2010 20:57
Reporter: Andrew Hutchings
Status: Verified
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3
OS: Any
Assigned to:
CPU Architecture: Any

[3 Mar 2010 14:34] Andrew Hutchings
Error message observed:

Time: Tuesday 26 January 2010 - 12:46:47
Status: Temporary error, restart node
Message: System error, node killed during node restart by other node (Internal
error, programming error or missing error message, please report a bug)
Error: 2303
Error data: Node 4 killed this node because it replied StartFragRef error code: 0.
Error object: NDBCNTR (Line: 257) 0x0000000a
Program: ndbd
Pid: 18920
Trace: /local/cudb/mysql/ndbd/data/ndb_4_trace.log.4
Version: mysql-5.1.37 ndb-6.3.27a-GA

How to repeat:
[24 Mar 2010 20:57] Andrew Hutchings
How to reproduce:

1. Start a cluster with 2 data nodes
2. Insert some data (I used about 100MB)
3. Stop data node 1
4. Drop the table and re-insert the data
5. Stop data node 2
6. Start data node 1 with --nowait-nodes=2 (the other data node)
7. When data node 1 has completed start phase 4, start data node 2
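
The steps above can be sketched as a dry-run plan. This is only an illustration, not the script used for this report: the connect string `localhost:1186`, the node IDs 1 and 2, the helper names `repro_plan` and `last_completed_phase`, and the use of `ndb_mgm -e "<id> STOP"` to stop the nodes are all assumptions. The sketch also assumes that `ndb_mgm -e "<id> STATUS"` prints text containing "Last completed phase N" while a node is starting. It prints commands rather than running them, and the data loading steps are left as manual steps because the report does not give a schema.

```python
import re


def last_completed_phase(status_output):
    """Return the start phase found in STATUS output, or None if absent.

    Assumes the text contains 'Last completed phase N' while a node starts.
    """
    match = re.search(r"Last completed phase (\d+)", status_output)
    return int(match.group(1)) if match else None


def repro_plan(connect="localhost:1186", first=1, second=2):
    """Build the reproduction steps as (description, argv or None) pairs.

    argv is None for steps that have to be done by hand (data loading).
    All defaults are hypothetical; adjust them to the cluster under test.
    """
    mgm = ["ndb_mgm", f"--ndb-connectstring={connect}", "-e"]

    def ndbd(node_id, *extra):
        return ["ndbd", f"--ndb-connectstring={connect}",
                f"--ndb-nodeid={node_id}", *extra]

    return [
        ("1. start first data node", ndbd(first)),
        ("1. start second data node", ndbd(second)),
        ("2. insert about 100MB of data", None),
        # How the nodes were stopped is not stated in the report; STOP via
        # the management client is an assumption, and the client may refuse
        # to stop the last running data node unless forced.
        ("3. stop first data node", mgm + [f"{first} STOP"]),
        ("4. drop the table and re-insert the data", None),
        ("5. stop second data node", mgm + [f"{second} STOP"]),
        ("6. start first data node without waiting for the second",
         ndbd(first, f"--nowait-nodes={second}")),
        ("7. poll until the first node has completed phase 4",
         mgm + [f"{first} STATUS"]),
        ("7. start second data node", ndbd(second)),
    ]


if __name__ == "__main__":
    for description, argv in repro_plan():
        print(description, "->", " ".join(argv) if argv else "(manual step)")
```

For step 7, the STATUS command would be repeated until `last_completed_phase` returns 4 or higher, and only then would the second data node be started.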

This has to be done with a non-debug build. Debug builds seem to hit various ndbasserts too easily when running this test case.