Bug #43108 ndbmtd - online add node fails
Submitted: 23 Feb 2009 13:27 Modified: 14 Mar 2009 12:42
Reporter: Johan Andersson
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.4 OS: Any
Assigned to: Jonas Oreland CPU Architecture: Any
Tags: ndbmtd, online add node

[23 Feb 2009 13:27] Johan Andersson
Description:
I have two data nodes running:

Connected to Management Server at: localhost:1186
Node 3: started (mysql-5.1.32 ndb-6.4.3)
Node 4: started (mysql-5.1.32 ndb-6.4.3)
Node 5: not connected
Node 6: not connected

Now I want to add nodes 5 and 6 online:
5: ndbmtd --initial
6: ndbmtd --initial

[root@ps-ndb01 tools]# ndb_mgm -e "all status"
Connected to Management Server at: localhost:1186
Node 3: started (mysql-5.1.32 ndb-6.4.3)
Node 4: started (mysql-5.1.32 ndb-6.4.3)
Node 5: starting (Last completed phase 3) (mysql-5.1.32 ndb-6.4.3)
Node 6: starting (Last completed phase 1) (mysql-5.1.32 ndb-6.4.3)

When node 5 reports "Last completed phase 4" and
node 6 reports "Last completed phase 2":

[root@ps-ndb01 tools]# ndb_mgm -e "all status"
Connected to Management Server at: localhost:1186
Node 3: started (mysql-5.1.32 ndb-6.4.3)
Node 4: started (mysql-5.1.32 ndb-6.4.3)
Node 5: starting (Last completed phase 4) (mysql-5.1.32 ndb-6.4.3)
Node 6: starting (Last completed phase 2) (mysql-5.1.32 ndb-6.4.3)

Then the whole cluster goes down:
[root@ps-ndb01 tools]# ndb_mgm -e "all status"
Connected to Management Server at: localhost:1186
Node 3: not connected
Node 4: not connected
Node 5: not connected
Node 6: not connected

How to repeat:
* online add node with ndbmtd
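
To catch the moment the cluster goes down while repeating this, the `ndb_mgm -e "all status"` output can be parsed instead of read by eye. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it only recognises the three node states that appear in the output above ("started", "starting", "not connected"). Lines in any other form are ignored.

```python
import re

# Matches lines such as:
#   Node 5: starting (Last completed phase 3) (mysql-5.1.32 ndb-6.4.3)
#   Node 3: not connected
# Only the states seen in this report are handled (an assumption).
STATUS_LINE = re.compile(
    r"^Node (\d+): (not connected|started|starting)"
    r"(?: \(Last completed phase (\d+)\))?"
)


def parse_all_status(text):
    """Return {node_id: (state, last_completed_phase or None)}."""
    nodes = {}
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            phase = int(match.group(3)) if match.group(3) else None
            nodes[int(match.group(1))] = (match.group(2), phase)
    return nodes


def cluster_is_down(nodes):
    """True when at least one node was parsed and none is connected."""
    return bool(nodes) and all(
        state == "not connected" for state, _ in nodes.values()
    )
```

Feeding it the text captured from each `ndb_mgm -e "all status"` call gives a per-node view of the start phases, and `cluster_is_down` flags the final state shown in the description, where nodes 3 to 6 are all "not connected".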

Suggested fix:
fix the bug :)
[23 Feb 2009 13:32] Johan Andersson
trace files
ftp://ftp.mysql.com/pub/mysql/upload/ndb_error_report_20090223141128.tar.bz2
[12 Mar 2009 6:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68994

2934 Jonas Oreland	2009-03-12
      ndb - bug#43108 - part I - remove LocalProxy::Node, send LCP_*_REP using local DIH instead
[12 Mar 2009 6:53] Bugs System
Pushed into 5.1.32-ndb-7.0.4 (revid:jonas@mysql.com-20090312065239-3exz40e2dnn5y5lb) (version source revid:jonas@mysql.com-20090312065239-3exz40e2dnn5y5lb) (merge vers: 5.1.32-ndb-7.0.4) (pib:6)
[13 Mar 2009 7:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69093

2937 Jonas Oreland	2009-03-13
      ndb - bug#43108 - redo DblqhProxy handling of LCP
[13 Mar 2009 8:18] Bugs System
Pushed into 5.1.32-ndb-7.0.4 (revid:jonas@mysql.com-20090313081433-xri9mdj8hb2dd6pt) (version source revid:jonas@mysql.com-20090313074942-56mp5oyxwurwx0t9) (merge vers: 5.1.32-ndb-7.0.4) (pib:6)
[14 Mar 2009 12:42] Jon Stephens
Documented bugfix in the NDB-7.0.4 changelog as follows:

        It was not possible to add new data nodes to the cluster online 
        using multi-threaded data node processes (ndbmtd).

Also noted in "Adding MySQL Cluster Data Nodes Online".