MySQL Bugs: #52325: Node/cluster failure during mysqld startup can lead to abort

Bug #52325	Node/cluster failure during mysqld startup can lead to abort
Submitted:	24 Mar 2010 10:27	Modified:	29 Mar 2010 8:06
Reporter:	Jonas Oreland
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	mysql-5.1-telco-6.3	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any

Description:
If a node or cluster failure happens while mysqld is scanning the ndb.ndb_schema
table (which it does once it manages to connect to the cluster), mysqld can
crash due to insufficient error handling.

How to repeat:
1) create lots of tables
2) restart mysqld
3) restart ndb nodes while mysqld is starting

NOTE: this happens sporadically in autotest

Suggested fix:
don't abort, but retry
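
A minimal sketch of the "retry instead of abort" idea, in standalone C++. This is not the committed patch and does not use the NDB API: ScanStatus, scan_with_retry, and the max_attempts parameter are hypothetical names introduced only for illustration. It assumes the scan can tell a temporary failure (such as a node restart) apart from a permanent one.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result codes for one scan attempt (not NDB API values).
enum class ScanStatus { Ok, TemporaryError, PermanentError };

using ScanFn = std::function<ScanStatus(std::vector<std::string>&)>;

// Runs `scan` until it succeeds, reports a permanent error, or
// `max_attempts` is exhausted. Returns true only on success.
// Partial results are cleared before every attempt so that a scan
// interrupted half-way does not leave duplicate entries behind.
bool scan_with_retry(const ScanFn& scan,
                     std::vector<std::string>& out,
                     int max_attempts)
{
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    out.clear();
    switch (scan(out)) {
      case ScanStatus::Ok:
        return true;
      case ScanStatus::PermanentError:
        return false;          // give up, but report instead of aborting
      case ScanStatus::TemporaryError:
        break;                 // e.g. node failure during scan: try again
    }
  }
  return false;                // still failing after max_attempts
}
```

The caller gets a plain success/failure result in every case, so a failed scan can be reported or rescheduled rather than bringing the process down. A real implementation would probably also wait between attempts; that is left out here to keep the sketch short.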

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/104166

3167 Jonas Oreland	2010-03-24
      ndb - bug#52325 - don't abort but handle errors in ndbcluster_find_all_databases

pushed to 6.3.33 and 7.0.14

Documented bugfix in the NDB-6.3.33, 7.0.14, and 7.1.3 changelogs, as follows:

        If a node or cluster failure occurred while mysqld was scanning
        the ndb.ndb_schema table (which it does when it is attempting to
        connect to the cluster), insufficient error handling could cause
        mysqld to crash in certain cases. This could happen in a
        MySQL Cluster with a great many tables, when trying to restart
        data nodes while one or more mysqld processes were restarting.

Closed.