Bug #17536 DD: Node failure running load_tpcb.pl script.
Submitted: 17 Feb 2006 23:41 Modified: 14 Mar 2006 9:12
Reporter: Nikolay Grishakin
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S1 (Critical)
Version: 5.1 OS: Linux (Linux)
Assigned to: Jonas Oreland CPU Architecture: Any

[17 Feb 2006 23:41] Nikolay Grishakin
Description:
Got the following error while running the load_tpcb.pl script: "Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER at load_tpcb.pl line 400."
The test was running on the ndb15 test system, with two nodes located on ndb13 and ndb14.

ndb_3_error.log shows the following info:
Message: System error, node killed during node restart by other node (Inter
ror or missing error message, please report a bug)
Error: 2303
Error data: Node 3 killed this node because GCP stop was detected
Error object: NDBCNTR (Line: 227) 0x0000000a
Program: /home/ndbdev/ngrishakin/builds/libexec/ndbd
Pid: 4166
Trace: /space/run/ndb_3_trace.log.4
Version: Version 5.1.8 (beta)
***EOM***

How to repeat:
see above
[17 Feb 2006 23:58] Nikolay Grishakin
Log files and trace files copied under ndb13:/space/bug17536/
[18 Feb 2006 1:33] Nikolay Grishakin
After the second run we got a crash again, this time on ndb14.

ndb_2_error.log:

Message: System error, node killed during node restart by other node (Internal
ror or missing error message, please report a bug)
Error: 2303
Error data: Node 2 killed this node because GCP stop was detected
Error object: NDBCNTR (Line: 227) 0x0000000e
Program: /home/ndbdev/ngrishakin/builds/libexec/ndbd
Pid: 785
Trace: /space/run/ndb_2_trace.log.1
Version: Version 5.1.8 (beta)
***EOM***
[18 Feb 2006 7:58] Nikolay Grishakin
This problem happened with one replica. With two replicas the failure goes away.
[23 Feb 2006 13:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/3065
[12 Mar 2006 12:47] Jonas Oreland
Pushed into 5.1.8.
The problem was that a transaction waiting in the log sync queue was never restarted;
this yielded a GCP stop.

The reason the transaction stayed in the log sync queue was an incorrect assumption
about the ordering between sync requests.
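
For readers unfamiliar with this class of bug, here is a toy illustration of how an ordering assumption can strand a waiter. It is not the NDB code and does not reflect the actual patch. The class `LogSyncQueue`, the flag `assume_in_order`, and the transaction names are all hypothetical, chosen only to model "a waiter is resumed only if completions arrive in the expected order".

```python
from collections import deque


class LogSyncQueue:
    """Toy model (hypothetical, not NDB code): transactions wait for a
    log sync request to complete before they are restarted."""

    def __init__(self, assume_in_order):
        self.assume_in_order = assume_in_order
        self.waiting = deque()  # (sync_id, txn_name) in arrival order
        self.restarted = []     # transactions that were resumed

    def wait_for_sync(self, sync_id, txn_name):
        self.waiting.append((sync_id, txn_name))

    def sync_completed(self, sync_id):
        if self.assume_in_order:
            # Flawed variant: only the head of the queue is checked, so a
            # completion that arrives "early" is silently dropped.
            if self.waiting and self.waiting[0][0] == sync_id:
                self.restarted.append(self.waiting.popleft()[1])
            return
        # Order-independent variant: resume every waiter for this sync.
        still_waiting = deque()
        for sid, txn in self.waiting:
            if sid == sync_id:
                self.restarted.append(txn)
            else:
                still_waiting.append((sid, txn))
        self.waiting = still_waiting


def run(assume_in_order):
    q = LogSyncQueue(assume_in_order)
    q.wait_for_sync(1, "txn_a")
    q.wait_for_sync(2, "txn_b")
    # Completions arrive in the opposite order to the requests.
    q.sync_completed(2)
    q.sync_completed(1)
    return q.restarted, [txn for _, txn in q.waiting]
```

In this model, `run(True)` leaves `txn_b` in the queue forever, while `run(False)` restarts both transactions. A transaction stuck in this way is the kind of stall described above as leading to a GCP stop.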
[14 Mar 2006 9:12] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented bugfix in 5.1.8 changelog. Closed.