Bug #52325 Node/cluster failure during mysqld startup can lead to abort
Submitted: 24 Mar 2010 10:27 Modified: 29 Mar 2010 8:06
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: mysql-5.1-telco-6.3 OS: Any
Assigned to: Jonas Oreland CPU Architecture: Any

[24 Mar 2010 10:27] Jonas Oreland
Description:
If a node/cluster failure happens while mysqld is scanning the ndb.ndb_schema
table (which it does when it manages to connect to the cluster), mysqld can
crash due to insufficient error handling.

How to repeat:
1) create lots of tables
2) restart mysqld
3) restart ndb nodes while mysqld is starting

NOTE: this happens sporadically in autotest

Suggested fix:
don't abort, but retry
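
The "retry instead of abort" idea can be sketched as follows. This is only an
illustration of the pattern, not the committed patch: the actual fix is in
ndbcluster_find_all_databases in the server source, and every name below
(TemporaryClusterError, scan_with_retry, the scan callable) is hypothetical.

```python
import time


class TemporaryClusterError(Exception):
    """Hypothetical error: a node or cluster failure interrupted the scan."""


def scan_with_retry(scan, max_attempts=5, delay_seconds=0.0, sleep=time.sleep):
    """Run scan() and retry on temporary failures instead of aborting.

    Returns the scan result, or None if every attempt failed, so the
    caller can handle the failure rather than crash.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return scan()
        except TemporaryClusterError:
            # Assumed policy: wait briefly, then restart the scan from scratch.
            if attempt < max_attempts:
                sleep(delay_seconds)
    return None
```

A caller would pass in whatever function performs the schema scan. A None
result means the cluster stayed unavailable for all attempts, and the caller
can try again later instead of aborting.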
[24 Mar 2010 10:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/104166

3167 Jonas Oreland	2010-03-24
      ndb - bug#52325 - don't abort but handle errors in ndbcluster_find_all_databases
[25 Mar 2010 9:03] Jonas Oreland
pushed to 6.3.33 and 7.0.14
[29 Mar 2010 8:06] Jon Stephens
Documented bugfix in the NDB-6.3.33, 7.0.14, and 7.1.3 changelogs, as follows:

        If a node or cluster failure occurred while mysqld was scanning
        the ndb.ndb_schema table (which it does when it is attempting to
        connect to the cluster), insufficient error handling could lead
        to a crash by mysqld in certain cases. This could happen in a
        MySQL Cluster with a great many tables, when trying to restart
        data nodes while one or more mysqld processes were restarting.

Closed.