Bug #17614 DD: SR fails when having logfilegroups wo/ undofiles
Submitted: 21 Feb 2006 14:06 Modified: 19 Feb 2009 15:58
Reporter: Jonathan Miller
Status: Closed
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S2 (Serious)
Version:5.1.8 OS:Linux (Linux 64 bit OS)
Assigned to: Jonas Oreland CPU Architecture:Any

[21 Feb 2006 14:06] Jonathan Miller
Description:
Over the weekend (Friday night CST), Nikolay and I started a stress test using TPCB scripts.

Sunday I introduced 2 new scripts that did the following:

Loop{
 Log in to cluster;
 Create Database;
 Create Log Group;
 Create Table Space;
 Create Table;
 Insert Data;
 Delete Data;
 Drop Table;
 Drop Table Space;
 Drop Log Group;
 Drop Database;
}
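
The two scripts themselves are not attached to this report. As a rough illustration only, one iteration of the loop above might translate into SQL along the following lines. Every object name, file name, size and column definition here is an assumption made for the sketch, not taken from the original scripts; the statements use the MySQL Cluster 5.1 Disk Data syntax.

```python
from typing import Iterator


def disk_data_cycle(i: int) -> Iterator[str]:
    """Yield the SQL for one create/use/drop iteration.

    All names, sizes and the table layout are hypothetical.
    """
    db, lg, ts = f"stress_db{i}", f"lg{i}", f"ts{i}"
    undo, data = f"undo{i}.dat", f"data{i}.dat"

    yield f"CREATE DATABASE {db}"
    yield (f"CREATE LOGFILE GROUP {lg} ADD UNDOFILE '{undo}' "
           "INITIAL_SIZE 16M ENGINE NDB")
    yield (f"CREATE TABLESPACE {ts} ADD DATAFILE '{data}' "
           f"USE LOGFILE GROUP {lg} INITIAL_SIZE 32M ENGINE NDB")
    yield (f"CREATE TABLE {db}.t1 (a INT PRIMARY KEY, b VARCHAR(32)) "
           f"TABLESPACE {ts} STORAGE DISK ENGINE NDB")
    yield f"INSERT INTO {db}.t1 VALUES (1, 'x'), (2, 'y')"
    yield f"DELETE FROM {db}.t1"
    yield f"DROP TABLE {db}.t1"
    # A tablespace must be emptied of data files before it can be dropped.
    yield f"ALTER TABLESPACE {ts} DROP DATAFILE '{data}' ENGINE NDB"
    yield f"DROP TABLESPACE {ts} ENGINE NDB"
    yield f"DROP LOGFILE GROUP {lg} ENGINE NDB"
    yield f"DROP DATABASE {db}"


if __name__ == "__main__":
    # Print one iteration so it can be piped into the mysql client.
    for stmt in disk_data_cycle(1):
        print(stmt + ";")
```

The sketch only generates statements; connecting to an SQL node and running them repeatedly is left to whatever client the test harness uses.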

All was working well. On Monday I found that we had run out of disk space and that one of the data nodes had failed. The other was up but was spinning on being out of undo space and aborting every transaction. I was still able to connect to the TPCB database and run queries.

In an effort to recover, I moved some of the disk data files to a different drive, created symbolic links to them, and restarted the data node.

The data node came up but never got past phase 4. The ndb_1_cluster.log showed that the data node had completed phase 4, but a "3 status" in the management console showed that it was still in phase 4.

After leaving it in phase 4 for a couple of hours yesterday, I issued a "3 restart". Checking it this morning, I found that it was still in phase 4. Since both attempts to restart had failed, I decided to "shutdown" and restart the entire cluster. On cluster restart, the other data node crashed with the following error log:

Time: Tuesday 21 February 2006 - 13:41:18
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please report a bug)
Error: 2301
Error data: ArrayPool<T>::getPtr
Error object: ../../../../../storage/ndb/src/kernel/vm/ArrayPool.hpp line: 378 (block: LGMAN)
Program: /home/ndbdev/ngrishakin/builds/libexec/ndbd
Pid: 16800
Trace: /space/run/ndb_2_trace.log.1
Version: Version 5.1.8 (beta)
***EOM***

Attached is a file from ndb_error_reporter, but due to a bug in this script, the FS is not included.

How to repeat:
Not easy to repeat
[23 Feb 2006 12:44] Jonathan Miller
Tomas, not sure why you assigned to me and changed to open. I see no comments to tell me why. So setting to not assigned and back to verified.

Thanks
JBM
[23 Feb 2006 12:54] Jonas Oreland
Jeb,

It was I who set it to open; I didn't mean to reassign it to you...
I'll change it to analysing...
[5 Mar 2006 7:06] Jonas Oreland
Changed title to reflect bug
Removed show stopper flag
Simpler testcase (from ndbapi):
1) create logfile group (note wo/ undofiles)
2) do SR

To do this from SQL you need luck :-)
[19 Feb 2009 14:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/66914

2848 Jonas Oreland	2009-02-19
      ndb - bug#17614 - handle logfile groups wo/ undofiles during restart
[19 Feb 2009 14:29] Bugs System
Pushed into 5.1.32-ndb-6.2.17 (revid:jonas@mysql.com-20090219142410-r6qourg3yo4bsgid) (version source revid:jonas@mysql.com-20090219142410-r6qourg3yo4bsgid) (merge vers: 5.1.32-ndb-6.2.17) (pib:6)
[19 Feb 2009 14:30] Bugs System
Pushed into 5.1.32-ndb-6.3.23 (revid:jonas@mysql.com-20090219142730-cn9buezdjdopcis3) (version source revid:jonas@mysql.com-20090219142730-cn9buezdjdopcis3) (merge vers: 5.1.32-ndb-6.3.23) (pib:6)
[19 Feb 2009 14:30] Bugs System
Pushed into 5.1.32-ndb-6.4.3 (revid:jonas@mysql.com-20090219142840-v5q5ros6m6q2p7i5) (version source revid:jonas@mysql.com-20090219142840-v5q5ros6m6q2p7i5) (merge vers: 5.1.32-ndb-6.4.3) (pib:6)
[19 Feb 2009 15:58] Jon Stephens
Documented bugfix in the NDB-6.2.17, 6.3.23, and 6.4.3 changelogs as follows:

        Attempting to perform a system restart of the cluster where
        there existed a logfile group without any undo log files caused
        the data nodes to crash.

        Note: While issuing a CREATE LOGFILE GROUP statement without 
        an ADD UNDOFILE option fails with an error in the MySQL server, 
        this situation could arise if an SQL node failed while executing 
        a valid CREATE LOGFILE GROUP statement; it is also possible to
        create a logfile group without any undo log files using the NDB
        API.
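
To make the distinction in the note concrete, here is a small sketch of a client-side check that tells the two statement forms apart. It is not part of the fix and is not how the MySQL server validates the statement; the function name and the example statements are assumptions for illustration.

```python
import re

# Matches an ADD UNDOFILE clause followed by a quoted file name.
_UNDOFILE_CLAUSE = re.compile(r"\bADD\s+UNDOFILE\s+'[^']+'", re.IGNORECASE)


def names_undofile(statement: str) -> bool:
    """Return True if a CREATE LOGFILE GROUP statement names an undo file."""
    return bool(_UNDOFILE_CLAUSE.search(statement))


if __name__ == "__main__":
    # Hypothetical statements: the first form is valid, the second is the
    # form that the note says the MySQL server rejects with an error.
    valid = "CREATE LOGFILE GROUP lg1 ADD UNDOFILE 'undo1.dat' ENGINE NDB"
    no_undofile = "CREATE LOGFILE GROUP lg1 ENGINE NDB"
    print(names_undofile(valid), names_undofile(no_undofile))
```

A check like this cannot catch the two cases the note describes (an SQL node failing partway through a valid statement, or a logfile group created directly through the NDB API), which is why the restart code itself had to handle the empty group.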