Bug #57568 Data node showing as 'not connected' when allocating lots of memory
Submitted: 19 Oct 2010 12:32
Modified: 6 Jan 2011 5:54
Reporter: Geert Vanderkelen
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1-telco-7.1
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any
Tags: 7.1

[19 Oct 2010 12:32] Geert Vanderkelen
Description:
When a data node starts, it allocates memory and then connects to ndb_mgmd to do heartbeating and exchange messages. (Obviously, it first has to connect to ndb_mgmd to fetch its configuration.)
This is fine when data memory is relatively small and allocation is fast. However, when a huge amount of memory is being allocated, there is no way to tell what is going on: in the ndb_mgm client, the data node still shows as not connected, or as running in a certain start phase (which is not true).

How to repeat:
.
[20 Dec 2010 9:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/127275

4066 Jonas Oreland	2010-12-20
      ndb - bug#57568
        This patch makes 7.0 behave same as 6.3 wrt. really big allocations.
        The core idea of patch is to alloc all memory (still) at startup
          map memory needed for startup (job buffers + transporter buffers)
          and then let CMVMI via NDBFS map rest
      
        (map = touch each page and optionally memlock)
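
The idea in the commit message above (reserve everything up front, map only what startup needs, then map the rest later) can be illustrated with a minimal sketch. This is not the NDB code: the names staged_map, touch_pages and on_progress are hypothetical, the on_progress callback merely stands in for the point where a node could service heartbeats, and the optional memlock step is left out.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE


def touch_pages(region, start, end):
    """Write one byte to each page in [start, end) so the OS has to back it."""
    for offset in range(start, end, PAGE_SIZE):
        region[offset] = 0
    # Number of pages touched in this call.
    return (end - start + PAGE_SIZE - 1) // PAGE_SIZE


def staged_map(total_pages, startup_pages, chunk_pages, on_progress):
    """Reserve all memory at once, touch startup pages first, then the rest."""
    total_bytes = total_pages * PAGE_SIZE
    # Anonymous mapping: address space is reserved, pages are not yet touched.
    region = mmap.mmap(-1, total_bytes)

    # Phase 1: touch only what is assumed to be needed to start communicating
    # (the commit message names job buffers and transporter buffers).
    touched = touch_pages(region, 0, startup_pages * PAGE_SIZE)
    on_progress("startup", touched)

    # Phase 2: touch the remainder in chunks; after each chunk the caller
    # gets control back (hypothetical hook for heartbeats or status reports).
    offset = startup_pages * PAGE_SIZE
    chunk_bytes = chunk_pages * PAGE_SIZE
    while offset < total_bytes:
        end = min(offset + chunk_bytes, total_bytes)
        touched += touch_pages(region, offset, end)
        on_progress("rest", touched)
        offset = end
    return region, touched
```

For example, staged_map(8, 2, 3, callback) would call the callback once after the 2 startup pages and then after each chunk of up to 3 pages, so a slow mapping of a large region never blocks the caller for longer than one chunk.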
[20 Dec 2010 9:49] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:jonas@mysql.com-20101220094736-7uh9mlcqkmioufgo) (version source revid:jonas@mysql.com-20101220094736-7uh9mlcqkmioufgo) (merge vers: 5.1.51-ndb-7.0.21) (pib:24)
[20 Dec 2010 9:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/127277

4067 Jonas Oreland	2010-12-20
      ndb - bug#57568
        This patch makes 7.0 behave same as 6.3 wrt. really big allocations.
        The core idea of patch is to alloc all memory (still) at startup
          map memory needed for startup (job buffers + transporter buffers)
          and then let CMVMI via NDBFS map rest
      
        (map = touch each page and optionally memlock)
[20 Dec 2010 9:52] Jonas Oreland
pushed to 7.0.21 and 7.1.10
[6 Jan 2011 5:54] Jon Stephens
Documented bugfix as follows in the NDB-7.0.21 and 7.1.10 changelogs:

        Data nodes no longer allocated all memory prior to being ready
        to exchange heartbeat and other messages with management nodes,
        as in NDB 6.3 and earlier versions of MySQL Cluster. This caused
        problems when data nodes configured with large amounts of memory
        failed to show as connected or showed as being in the wrong
        start phase in the ndb_mgm client even after making their
        initial connections to and fetching their configuration data
        from the management server. With this fix, data nodes now
        allocate all memory as they did in earlier MySQL Cluster
        versions.

Closed.