Bug #17614 DD: SR fails when having logfilegroups wo/ undofiles
Submitted: 21 Feb 2006 15:06 Modified: 19 Feb 16:58
Reporter: Jonathan Miller
Status: Closed
Category: Server: Cluster  Severity: S2 (Serious)
Version: 5.1.8  OS: Linux (Linux 64 bit OS)
Assigned to: Jonas Oreland Target Version:

[21 Feb 2006 15:06] Jonathan Miller
Description:
Over the weekend (Friday night CST), Nikolay and I started a stress test using TPCB
scripts.

Sunday I introduced 2 new scripts that did the following:

Loop{
 Log in to cluster;
 Create Database;
 Create Log Group;
 Create Table Space;
 Create Table;
 Insert Data;
 Delete Data;
 Drop Table;
 Drop Table Space;
 Drop Log Group;
 Drop Database;
}
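
For anyone trying to approximate the loop above, here is a minimal sketch of the SQL
for one pass, assuming the MySQL 5.1 Disk Data syntax. It is not the original test
script. All object names, file names, sizes and the table definition are hypothetical,
and the data file is dropped explicitly because NDB will not drop a tablespace that
still contains data files.

```python
def disk_data_cycle(i):
    """Return the SQL statements for one pass of the stress loop.

    Every identifier, file name and size below is a made-up placeholder;
    the original scripts are not attached to this report.
    """
    db, lg, ts, tbl = f"dd_db{i}", f"lg{i}", f"ts{i}", f"t{i}"
    undo, data = f"undo{i}.dat", f"data{i}.dat"
    return [
        f"CREATE DATABASE {db}",
        # The logfile group is created WITH an undo file here; the crash in
        # this report concerns a group that ends up without one.
        f"CREATE LOGFILE GROUP {lg} ADD UNDOFILE '{undo}' "
        f"INITIAL_SIZE 16M ENGINE NDB",
        f"CREATE TABLESPACE {ts} ADD DATAFILE '{data}' "
        f"USE LOGFILE GROUP {lg} INITIAL_SIZE 32M ENGINE NDB",
        f"CREATE TABLE {db}.{tbl} (id INT PRIMARY KEY, payload VARCHAR(64)) "
        f"TABLESPACE {ts} STORAGE DISK ENGINE NDB",
        f"INSERT INTO {db}.{tbl} VALUES (1, 'x')",
        f"DELETE FROM {db}.{tbl}",
        f"DROP TABLE {db}.{tbl}",
        # Data files have to go before the tablespace itself can be dropped.
        f"ALTER TABLESPACE {ts} DROP DATAFILE '{data}' ENGINE NDB",
        f"DROP TABLESPACE {ts} ENGINE NDB",
        f"DROP LOGFILE GROUP {lg} ENGINE NDB",
        f"DROP DATABASE {db}",
    ]


if __name__ == "__main__":
    # Print one pass so it can be piped into the mysql client.
    for stmt in disk_data_cycle(1):
        print(stmt + ";")
```

The statements are only generated, not executed, so the sketch can be inspected or
piped into a client without assuming any particular connector.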

All was working well. On Monday I found that we had run out of disk space and that one of
the data nodes had failed. The other was up but was spinning on being out of undo space
and aborting every transaction. I was still able to connect to the TPCB database and run
queries.

In an effort to recover, I moved some of the disk data to a different drive and created
symbolic links to them and restarted the data node.

The data node came up and never got past phase 4. In the ndb_1_cluster.log it showed that
the data node had completed phase 4, but a "3 status" in the management console showed that
it was still in phase 4.

After leaving it in phase 4 for a couple of hours yesterday, I issued a "3 restart".
Checking it this morning I found that it was still in phase 4. Since both of the attempts
to restart had failed, I decided to "shutdown" and restart the entire cluster. On cluster
restart, the other data node crashed with the following error log:

Time: Tuesday 21 February 2006 - 13:41:18
Status: Temporary error, restart node
Message: Assertion (Internal error, programming error or missing error message, please
report a bug)
Error: 2301
Error data: ArrayPool<T>::getPtr
Error object: ../../../../../storage/ndb/src/kernel/vm/ArrayPool.hpp line: 378 (block:
LGMAN)
Program: /home/ndbdev/ngrishakin/builds/libexec/ndbd
Pid: 16800
Trace: /space/run/ndb_2_trace.log.1
Version: Version 5.1.8 (beta)
***EOM***

Attached is a file from ndb_error_reporter, but due to a bug in this script, the FS is not
included.

How to repeat:
Not easy to repeat
[23 Feb 2006 13:44] Jonathan Miller
Tomas, not sure why you assigned this to me and changed it to open. I see no comments to
tell me why, so I am setting it to not assigned and back to verified.

Thanks
JBM
[23 Feb 2006 13:54] Jonas Oreland
Jeb,

It was I who set it to open, didn't mean to reassign it to you...
I'll change it to analysing...
[5 Mar 2006 8:06] Jonas Oreland
Changed title to reflect bug
Removed show stopper flag
Simpler testcase (from ndbapi):
1) create logfile group (note wo/ undofiles)
2) do SR

To do this from SQL you need luck :-)
[19 Feb 15:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/66914

2848 Jonas Oreland	2009-02-19
      ndb - bug#17614 - handle logfile groups wo/ undofiles during restart
[19 Feb 15:29] Bugs System
Pushed into 5.1.32-ndb-6.2.17 (revid:jonas@mysql.com-20090219142410-r6qourg3yo4bsgid)
(version source revid:jonas@mysql.com-20090219142410-r6qourg3yo4bsgid) (merge vers:
5.1.32-ndb-6.2.17) (pib:6)
[19 Feb 15:30] Bugs System
Pushed into 5.1.32-ndb-6.3.23 (revid:jonas@mysql.com-20090219142730-cn9buezdjdopcis3)
(version source revid:jonas@mysql.com-20090219142730-cn9buezdjdopcis3) (merge vers:
5.1.32-ndb-6.3.23) (pib:6)
[19 Feb 15:30] Bugs System
Pushed into 5.1.32-ndb-6.4.3 (revid:jonas@mysql.com-20090219142840-v5q5ros6m6q2p7i5)
(version source revid:jonas@mysql.com-20090219142840-v5q5ros6m6q2p7i5) (merge vers:
5.1.32-ndb-6.4.3) (pib:6)
[19 Feb 16:58] Jon Stephens
Documented bugfix in the NDB-6.2.17, 6.3.23, and 6.4.3 changelogs as follows:

        Attempting to perform a system restart of the cluster where
        there existed a logfile group without any undo log files caused
        the data nodes to crash.

        Note: While issuing a CREATE LOGFILE GROUP statement without 
        an ADD UNDOFILE option fails with an error in the MySQL server, 
        this situation could arise if an SQL node failed while executing 
        a valid CREATE LOGFILE GROUP statement; it is also possible to
        create a logfile group without any undo log files using the NDB
        API.