MySQL Bugs: #34207: Cluster will not restart, phase 3 takes 20 minutes then crash

Bug #34207	Cluster will not restart, phase 3 takes 20 minutes then crash
Submitted:	31 Jan 2008 20:45	Modified:	13 Mar 2009 8:45
Reporter:	Jeff Wang
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	5.1.22	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any

Description:

I'm using version 5.1.22 with on-disk data. After inserting some disk data, I performed a full shutdown and restart of the cluster. Phase 3 takes 20 minutes and the nodes crash in phase 4. Looking at the ndb logs, I see:

2008-01-31 12:29:32 [ndbd] WARNING  -- Ndb kernel is stuck in: Job Handling
2008-01-31 12:29:32 [ndbd] INFO     -- Watchdog: User time: 2129  System time: 3081
2008-01-31 12:29:55 [ndbd] INFO     -- You have found a bug! Failed op (INSERT) during REDO table: 1218 fragment: 0 err: 827
2008-01-31 12:29:55 [ndbd] INFO     -- DBLQH (Line: 15778) 0x0000000a
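
When scanning a large ndbd log for the same failure, the table, fragment and error code can be pulled out of the "Failed op" line with a small script. The following is only a sketch: the function name `parse_failed_op` and the regular expression are illustrative assumptions based on the single log line above, not part of any MySQL tool, and the line layout may differ in other versions.

```python
import re

# Pattern for the "Failed op" line shown above. The field layout is assumed
# from that one sample and may not match other NDB versions.
FAILED_OP = re.compile(
    r"Failed op \((?P<op>\w+)\) during REDO "
    r"table: (?P<table>\d+) fragment: (?P<fragment>\d+) err: (?P<err>\d+)"
)

def parse_failed_op(line):
    """Return op/table/fragment/err from a 'Failed op' log line, or None."""
    m = FAILED_OP.search(line)
    if m is None:
        return None
    return {
        "op": m.group("op"),
        "table": int(m.group("table")),
        "fragment": int(m.group("fragment")),
        "err": int(m.group("err")),
    }

sample = (
    "2008-01-31 12:29:55 [ndbd] INFO     -- You have found a bug! "
    "Failed op (INSERT) during REDO table: 1218 fragment: 0 err: 827"
)
print(parse_failed_op(sample))
```

The extracted `err` value is the number to look up when diagnosing the crash.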

My setup is:

- 2-node cluster
- 1000 tables
- 100,000 rows of 1KB on-disk data

How to repeat:
Not sure if this is generally reproducible or if it has something to do with my setup.

trace file

Attachment: trace.log (text), 330.38 KiB.

> perror --ndb 827
NDB error code 827: Out of memory in Ndb Kernel, table data (increase DataMemory): Permanent error: Insufficient space

What does your "all dump 1000" look like before restart?

/jonas

I don't have the cluster up anymore, as I completely nuked it and did a fresh setup with ndbd --initial. I have been able to do system and node restarts without any problems since.

Do you still have your cluster log file (ndb_1_cluster.log from your mgm node), by any chance? If so, please attach it.

ndb cluster mgm log

Attachment: ndb_1_cluster.log (text), 4.00 KiB.

ndb_cluster mgm log (retrying)

Attachment: ndb_1_cluster.log (text), 6.74 KiB.

Error 827 can happen during a node restart (NR).
Closing as not a bug.