MySQL Bugs: #21892: Cluster restart fails with disk tables

Bug #21892	Cluster restart fails with disk tables
Submitted:	29 Aug 2006 5:40	Modified:	30 Aug 2006 0:02
Reporter:	Jason Downing
Status:	Duplicate
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S1 (Critical)
Version:	5.1.11	OS:	Linux (Debian 2.6.17 (32bit))
Assigned to:	Assigned Account	CPU Architecture:	Any
Tags:	failed ndb require, failed restart

Description:
The cluster starts up fine with --initial, and I can load data from a backup file without problems. If the backup file has no disk-based tables and I have not created any tablespace or data files on disk, I can shut down the cluster and restart it without problems.

If I have disk data files and my backup file declares tables that use disk data storage, I can load my data into the cluster without problems, but once I shut down the cluster it will not restart. This is what ndb_mgm reports when I attempt a restart:

Node 3: Forced node shutdown completed, restarting. Occured during startphase 4. Initiated by signal 0. Caused by error 2301: 'Assertion(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
Node 2: Forced node shutdown completed, restarting. Occured during startphase 4. Initiated by signal 0. Caused by error 2301: 'Assertion(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

I can restart with --initial without problem.

Log files/tracelogs are attached.

Here is my config:

[NDBD DEFAULT]
NoOfReplicas=2
DataMemory=250M
IndexMemory=50M
MaxNoOfAttributes=3000
MaxNoOfConcurrentOperations=1000000
StartFailureTimeout=1000000
StartPartialTimeout=200000
LogLevelStartup=15
LogLevelShutdown=15
LogLevelStatistic=15
LogLevelCheckpoint=15
LogLevelNodeRestart=15
LogLevelConnection=15
LogLevelError=15
LogLevelInfo=15
StopOnError=N

[NDB_MGMD]
hostname=192.168.2.12
datadir=/var/lib/mysql-cluster
Id=1

[NDBD]
hostname=192.168.2.18
datadir=/usr/local/mysql/data
Id=2

[NDBD]
hostname=192.168.2.19
datadir=/usr/local/mysql/data
Id=3

[MYSQLD]
hostname=192.168.2.13
Id=4

These were the commands I used to create the tablespace:

CREATE LOGFILE GROUP lg_1
ADD UNDOFILE 'undo_1.dat'
INITIAL_SIZE 10M
UNDO_BUFFER_SIZE 2M
ENGINE NDB;

CREATE TABLESPACE ts_1
ADD DATAFILE 'data_1.dat'
USE LOGFILE GROUP lg_1
INITIAL_SIZE 1000M
ENGINE NDB;

ALTER TABLESPACE ts_1
ADD DATAFILE 'data_2.dat'
INITIAL_SIZE 1000M
ENGINE NDB;
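
For illustration only, a table that uses disk data storage in this tablespace might be declared roughly as in the sketch below. The table name and columns are hypothetical and are not taken from the attached backup; the syntax assumes the MySQL 5.1 Cluster Disk Data form of CREATE TABLE.

```sql
-- Hypothetical example table; names are made up for illustration.
-- With STORAGE DISK, non-indexed columns are stored in tablespace ts_1,
-- while indexed columns (here the primary key) remain in memory.
CREATE TABLE example_disk_table (
    id INT NOT NULL PRIMARY KEY,
    payload BLOB
)
TABLESPACE ts_1 STORAGE DISK
ENGINE NDB;
```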

I also filed bug 21172, which was a similar problem except that it occurred before any data was added, and only when there were disk data files present. Jonas supplied a patch for 21172, which I applied and tested. He also said a 2.6 kernel would fix the problem as well. I have verified that the 2.6 kernel fixes that problem, but it definitely does not fix the current problem. I also applied the patch for 21172 and tested it with a 2.6 kernel, and that did not fix this problem either.

The error log on a data node says this:

Error object: ../../../../../storage/ndb/src/kernel/vm/ArrayPool.hpp line: 379 (block: DBLQH)

So perhaps this is where the problem is. I have tried a few things, such as removing null values from a blob column and removing the blob column altogether. The results have been inconsistent: without a blob column, it restarted once and failed to restart once. If the problem is in my data file, I'm not sure where.

Thanks, Jason

How to repeat:
See below

Tracelogs/errorlogs

Attachment: 29-8-06 Bug.zip (application/zip, text), 67.79 KiB.

Hi,

This is a duplicate of http://bugs.mysql.com/bug.php?id=21271,
which is fixed; the fix will hopefully make it into 5.1.12.

OK, thanks for letting me know. You can consider this bug closed. I'll try the new version or maybe the patch, and file another bug if I have further problems.

Thanks, Jason