Bug #11050 ndb_mgm "show" prints incorrectly after master data node fails
Submitted: 2 Jun 2005 18:07
Modified: 13 Jun 2005 17:25
Reporter: Jonathan Miller
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 5.1.0, 4.1, 5.0
OS: Linux
Assigned to: Tomas Ulin
CPU Architecture: Any

[2 Jun 2005 18:07] Jonathan Miller
Description:
While running a single data node failure test, I chose the master data node as the target of the kill -9 command. The cluster stayed up, with node 3 taking over as the master data node. However, when I ran the "show" command in ndb_mgm after the takeover, none of the data nodes was shown as master.

Before:
ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=2    @10.100.1.93  (Version: 5.1.0, Nodegroup: 0, Master)
id=3    @10.100.1.94  (Version: 5.1.0, Nodegroup: 0)
id=4    @10.100.1.93  (Version: 5.1.0, Nodegroup: 1)
id=5    @10.100.1.94  (Version: 5.1.0, Nodegroup: 1)

During kill:
ndbdev@ndb08:~/jmiller/builds/run> kill -9 10803
ndbdev@ndb08:~/jmiller/builds/run> kill -9 10804
ndbdev@ndb08:~/jmiller/builds/run> ERROR: 270 Transaction aborted due to node shutdown
           Status: Temporary error, Classification: Node shutdown
           File: Bank.cpp (Line: 1579)
getOldestNotPurgedGL failed
ERROR: 266 Time-out in NDB, probably caused by deadlock
           Status: Temporary error, Classification: Timeout expired
           File: Bank.cpp (Line: 2279)
performTransaction returned NDBT_FAILED
  fromAccount = 2
  toAccount = 1
  amount = 7306

After:
ndb_mgm> show
Connected to Management Server at: ndb08:14000
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=2 (not connected, accepting connect from ndb08)
id=3    @10.100.1.94  (Version: 5.1.0, Nodegroup: 0)
id=4    @10.100.1.93  (Version: 5.1.0, Nodegroup: 1)
id=5    @10.100.1.94  (Version: 5.1.0, Nodegroup: 1)

Repeated "show" commands, as well as exiting and restarting ndb_mgm, did not change the output.

How to repeat:
Start a cluster with 4 data nodes.
Log in to ndb_mgm and run show.
Find which data node process is marked as master.
Run kill -9 on that PID.
Log in to ndb_mgm and run show again.
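
To make the before/after comparison less dependent on reading the output by eye, the check can be sketched in Python. This is only a minimal sketch: it parses "show" output that has already been captured as text, in the format pasted in this report, and does not talk to ndb_mgm itself. The names find_master, NODE_LINE, BEFORE and AFTER are hypothetical and introduced here for illustration.

```python
import re

# Matches a connected data-node line in the format shown in this report, e.g.
# "id=2    @10.100.1.93  (Version: 5.1.0, Nodegroup: 0, Master)"
# Lines such as "id=2 (not connected, ...)" have no "@host" part and are skipped.
NODE_LINE = re.compile(r"^id=(\d+)\s+@\S+\s+\((.*)\)\s*$")


def find_master(show_output):
    """Return the id of the node flagged as Master, or None if no node is."""
    for line in show_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if not match:
            continue  # headers, separators, and "not connected" lines
        attributes = [a.strip() for a in match.group(2).split(",")]
        if "Master" in attributes:
            return int(match.group(1))
    return None


# Output copied from the "before" and "After" listings in this report.
BEFORE = """\
[ndbd(NDB)]     4 node(s)
id=2    @10.100.1.93  (Version: 5.1.0, Nodegroup: 0, Master)
id=3    @10.100.1.94  (Version: 5.1.0, Nodegroup: 0)
id=4    @10.100.1.93  (Version: 5.1.0, Nodegroup: 1)
id=5    @10.100.1.94  (Version: 5.1.0, Nodegroup: 1)
"""

AFTER = """\
[ndbd(NDB)]     4 node(s)
id=2 (not connected, accepting connect from ndb08)
id=3    @10.100.1.94  (Version: 5.1.0, Nodegroup: 0)
id=4    @10.100.1.93  (Version: 5.1.0, Nodegroup: 1)
id=5    @10.100.1.94  (Version: 5.1.0, Nodegroup: 1)
"""

if __name__ == "__main__":
    print("master before kill:", find_master(BEFORE))
    print("master after kill: ", find_master(AFTER))
```

With the listings from this report, the sketch finds node 2 as master before the kill and no master afterwards, which is the symptom described above. Once the bug is fixed, the second call would be expected to return the id of the node that took over.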

Suggested fix:
The "show" command should always know which data node is master.
[2 Jun 2005 19:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/25533
[2 Jun 2005 19:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/25534
[13 Jun 2005 17:25] Jon Stephens
Thank you for your bug report. A fix for this issue has been committed to our
source repository for that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented in Change History for versions 4.1.13 and 5.0.8. Closed.