Bug #10580 Got error 4009 'Cluster Failure' from NDB
Submitted: 12 May 2005 10:20 Modified: 2 Sep 2005 12:22
Reporter: Joerg Bruehe
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: 4.1.12 + 5.0 + 5.1 OS: Other (IRIX + Linux(x86))
Assigned to: Assigned Account CPU Architecture:Any

[12 May 2005 10:20] Joerg Bruehe
Description:
Release build of 4.1.12, based on ChangeSet
  1.2261 05/05/10 14:44:03 lenz@mysql.com +2 -0
  Applied two patches already committed to the main tree to resolve bugs found during the build:
   - removed an "#error PRAGMA" from sql/item_strfunc.h (msvensson)
   - Fixed BUG#10070 (range test failed if InnoDB is not available) (sergefp)

Test failure on "octane2" (IRIX, both 32 and 64 bit) in "max":
-------------------------------------------------------
*** r/func_concat.result        Wed May 11 00:05:08 2005
--- r/func_concat.reject        Wed May 11 02:56:01 2005
***************
*** 1,5 ****
--- 1,7 ----
  DROP TABLE IF EXISTS t1;
  CREATE TABLE t1 ( number INT NOT NULL, alpha CHAR(6) NOT NULL );
+ Warnings:
+ Error 1296    Got error 4009 'Cluster Failure' from NDB
  INSERT INTO t1 VALUES (1413006,'idlfmv'),
  (1413065,'smpsfz'),(1413127,'sljrhx'),(1413304,'qerfnd');
  SELECT number, alpha, CONCAT_WS('<---->',number,alpha) AS new
***************
*** 27,32 ****
--- 29,36 ----
  1413006       idlfmv  1413006<------------------>idlfmv
  drop table t1;
  create table t1 (a char(4), b double, c date, d tinyint(4));
+ Warnings:
+ Error 1296    Got error 4009 'Cluster Failure' from NDB
  insert into t1 values ('AAAA', 105, '2003-03-01', 1);
  select * from t1 where concat(A,C,B,D) = 'AAAA2003-03-011051';
  a     b       c       d
-------------------------------------------------------

The 64-bit binary showed the failure in the "default" test run and the 32-bit one in the "--ps-protocol" run; the other run passed in each case.

Very similar symptoms appear in today's "autobuild" test mails for 5.0 (test "rpl_session_var") and 5.1 (test "rpl_change_master"), so this seems to be a newly introduced problem that shows up only sporadically.

How to repeat:
Running the test suite may reproduce it; the failure is sporadic.
[12 May 2005 12:51] Joerg Bruehe
More failures with the same symptom:

hpita2-64bit, "cluster", test "rpl_temporary", PS protocol: 2 times, following a "create table t1(f int);" (second: table name "t2");
hpux11-64bit, "cluster", test "flush_table", PS protocol: 4 times, each following a "create table";
octane2-64bit, "cluster", test "func_concat", PS protocol (as above);
production-icc, "cluster", test "alias", PS protocol: following the very first "create table" (which specifies "ENGINE=MyISAM") and another "create table";
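To tally occurrences like these across many test runs, something along the following lines might help. It is only a minimal sketch: the function name find_cluster_failures is made up for illustration, and it assumes the warning text appears exactly as in the func_concat diff quoted in the description. It reports the last non-empty line seen before each warning, which for a multi-line statement is only that statement's final line.

```python
import re
import sys

# Matches the warning line as it appears in the .reject diff quoted above.
WARNING_RE = re.compile(
    r"Error\s+1296\s+Got error 4009 'Cluster Failure' from NDB"
)


def find_cluster_failures(text):
    """Return (line_number, preceding_line) for each 4009 warning in text.

    Hypothetical helper, not part of the MySQL test suite.
    """
    hits = []
    last_line = None
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop context-diff markers ("+", "-", "!") and surrounding blanks.
        line = raw.lstrip("+-! ").strip()
        if WARNING_RE.search(line):
            hits.append((number, last_line))
        elif line and line != "Warnings:":
            # Remember the most recent line that is not part of a warning.
            last_line = line
    return hits


if __name__ == "__main__":
    # Paths of .reject files or saved diffs are given on the command line.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for number, before in find_cluster_failures(handle.read()):
                print(f"{path}:{number}: after {before!r}")
```

Run against the func_concat diff from the description, this should point at the two "create table" statements, which matches the pattern seen on the other hosts.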
[2 Aug 2005 12:22] Martin Skold
Cluster failure can have many causes. When no transactions are being run
directly against the cluster and failures still occur, the most likely
cause is a resource shortage (out of memory) or system overload (causing
heartbeat failures in the internal communication between ndbd processes
running on the same machine).
Always attach the log files from the cluster so that we can analyse what
the cause was.
The cluster log, see:
http://dev.mysql.com/doc/mysql/en/ndb-mgmd-process.html
The error log and related trace log for individual nodes, see:
http://dev.mysql.com/doc/mysql/en/ndbd-process.html
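For attaching these files to a report, a small script can gather them into one archive. The following is a rough sketch, not an official tool: the function name collect_ndb_logs and the single data_dir argument are assumptions made for illustration, and it assumes the logs use the usual ndb_<id>_cluster.log, ndb_<id>_error.log and ndb_<id>_trace.log.<n> names and all sit in one directory. On a real setup each node has its own data directory, so it would need to be run once per node.

```python
import glob
import os
import tarfile

# Assumed file name patterns: cluster log of the management node, then the
# error log and trace logs of the data nodes (the "*" stands for the node id).
LOG_PATTERNS = ("ndb_*_cluster.log", "ndb_*_error.log", "ndb_*_trace.log.*")


def collect_ndb_logs(data_dir, archive_path):
    """Pack matching log files from data_dir into a gzip-compressed tar file.

    Hypothetical helper; returns the base names of the files that were added.
    """
    found = []
    for pattern in LOG_PATTERNS:
        found.extend(sorted(glob.glob(os.path.join(data_dir, pattern))))
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in found:
            # Store files flat, without the directory prefix.
            archive.add(path, arcname=os.path.basename(path))
    return [os.path.basename(path) for path in found]
```

An empty result means no files matched in that directory, which usually indicates that the wrong directory was given.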
[2 Sep 2005 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".