MySQL Bugs: #17614: DD: SR fails when having logfilegroups wo/ undofiles

Bug #17614	DD: SR fails when having logfilegroups wo/ undofiles
Submitted:	21 Feb 2006 14:06	Modified:	19 Feb 2009 15:58
Reporter:	Jonathan Miller
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	5.1.8	OS:	Linux (Linux 64 bit OS)
Assigned to:	Jonas Oreland	CPU Architecture:	Any

Description:
Over the weekend (Friday night CST), Nikolay and I started a stress test using TPCB scripts.

Sunday I introduced 2 new scripts that did the following:

Loop{
Log in to cluster;
Create Database;
Create Log Group;
Create Table Space;
Create Table;
Insert Data;
Delete Data;
Drop Table;
Drop Table Space;
Drop Log Group;
Drop Database;
}
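
For readers who want to approximate this loop, here is a minimal sketch of what one iteration might look like as SQL, generated from Python. The original scripts are not attached to this report, so every database, logfile group, tablespace, table, and file name below is hypothetical, and the sizes are arbitrary. The "Log in to cluster" step is left to whatever client runs the statements. The sketch assumes the data file is dropped before its tablespace.

```python
def disk_data_cycle_statements(i):
    """Return the SQL for one create/use/drop iteration of the stress loop.

    All object and file names are hypothetical; sizes are arbitrary.
    """
    db, lg, ts = f"stress_db{i}", f"stress_lg{i}", f"stress_ts{i}"
    undo, data = f"undo_{i}.dat", f"data_{i}.dat"
    return [
        f"CREATE DATABASE {db}",
        # A logfile group created from SQL always gets an undo file.
        f"CREATE LOGFILE GROUP {lg} ADD UNDOFILE '{undo}' "
        f"INITIAL_SIZE 16M ENGINE NDB",
        f"CREATE TABLESPACE {ts} ADD DATAFILE '{data}' "
        f"USE LOGFILE GROUP {lg} INITIAL_SIZE 32M ENGINE NDB",
        f"CREATE TABLE {db}.t1 (id INT PRIMARY KEY, val VARCHAR(64)) "
        f"TABLESPACE {ts} STORAGE DISK ENGINE NDB",
        f"INSERT INTO {db}.t1 VALUES (1, 'a'), (2, 'b')",
        f"DELETE FROM {db}.t1",
        f"DROP TABLE {db}.t1",
        # Assumed ordering: remove the data file before the tablespace.
        f"ALTER TABLESPACE {ts} DROP DATAFILE '{data}' ENGINE NDB",
        f"DROP TABLESPACE {ts} ENGINE NDB",
        f"DROP LOGFILE GROUP {lg} ENGINE NDB",
        f"DROP DATABASE {db}",
    ]


if __name__ == "__main__":
    # Print one iteration; feed the output to a mysql client to run it.
    for stmt in disk_data_cycle_statements(0):
        print(stmt + ";")
```

Running these statements in a loop from a client only approximates the workload described here; it is not expected to reproduce the crash on its own.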

All was working well. On Monday I found that we had run out of disk space and that one of the data nodes had failed. The other was up but was spinning on being out of undo space and aborting any and all transactions. I was still able to connect to the TPCB database and run queries.

In an effort to recover, I moved some of the disk data to a different drive and created symbolic links to them and restarted the data node.

The data node came up but never got past phase 4. The ndb_1_cluster.log showed that the data node had completed phase 4, but a "3 status" in the management console showed that it was still in phase 4.

After leaving it in phase 4 for a couple of hours yesterday, I issued a "3 restart". Checking it this morning, I found that it was still in phase 4. Since both attempts to restart had failed, I decided to "shutdown" and restart the entire cluster. On cluster restart, the other data node crashed with the following error log:

Time: Tuesday 21 February 2006 - 13:41:18
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please report a bug)
Error: 2301
Error data: ArrayPool<T>::getPtr
Error object: ../../../../../storage/ndb/src/kernel/vm/ArrayPool.hpp line: 378 (block: LGMAN)
Program: /home/ndbdev/ngrishakin/builds/libexec/ndbd
Pid: 16800
Trace: /space/run/ndb_2_trace.log.1
Version: Version 5.1.8 (beta)
***EOM***

Attached is a file from ndb_error_reporter, but due to a bug in this script, the FS is not included.

How to repeat:
Not easy to repeat

Tomas, not sure why you assigned this to me and changed it to open. I see no comments to tell me why, so I am setting it to not assigned and back to verified.

Thanks
JBM

Jeb,

It was I who set it to open; I didn't mean to reassign it to you...
I'll change it to analysing...

Changed title to reflect bug
Removed show stopper flag
Simpler testcase (from ndbapi):
1) create logfile group (note wo/ undofiles)
2) do SR

To do this from SQL you need luck :-)

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/66914

2848 Jonas Oreland	2009-02-19
      ndb - bug#17614 - handle logfile groups wo/ undofiles during restart

Pushed into 5.1.32-ndb-6.2.17 (revid:jonas@mysql.com-20090219142410-r6qourg3yo4bsgid) (version source revid:jonas@mysql.com-20090219142410-r6qourg3yo4bsgid) (merge vers: 5.1.32-ndb-6.2.17) (pib:6)

Pushed into 5.1.32-ndb-6.3.23 (revid:jonas@mysql.com-20090219142730-cn9buezdjdopcis3) (version source revid:jonas@mysql.com-20090219142730-cn9buezdjdopcis3) (merge vers: 5.1.32-ndb-6.3.23) (pib:6)

Pushed into 5.1.32-ndb-6.4.3 (revid:jonas@mysql.com-20090219142840-v5q5ros6m6q2p7i5) (version source revid:jonas@mysql.com-20090219142840-v5q5ros6m6q2p7i5) (merge vers: 5.1.32-ndb-6.4.3) (pib:6)

Documented bugfix in the NDB-6.2.17, 6.3.23, and 6.4.3 changelogs as follows:

        Attempting to perform a system restart of the cluster where
        there existed a logfile group without any undo log files caused
        the data nodes to crash.

        Note: While issuing a CREATE LOGFILE GROUP statement without 
        an ADD UNDOFILE option fails with an error in the MySQL server, 
        this situation could arise if an SQL node failed while executing 
        a valid CREATE LOGFILE GROUP statement; it is also possible to
        create a logfile group without any undo log files using the NDB
        API.