Bug #61929 1 of 3 sql nodes hangs during cluster startup
Submitted: 20 Jul 2011 13:55 Modified: 15 Mar 2016 16:13
Reporter: S McCarthy
Status: Not a Bug
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: mysql-5.1.51 ndb-7.1.9 OS: Linux (Ubuntu Natty)
Assigned to: Bogdan Kecman CPU Architecture:Any
Tags: datanode, hang, rolling restart, sqlnode, startup

[20 Jul 2011 13:55] S McCarthy
During a rolling restart (currently on hour 10 of stage 5), one of the three sql nodes hangs on all requests. Restarting the node made no difference. There are no schema changes.
The other sql nodes are functioning normally. (Cancelling the startup returns the third sql node to service.)

How to repeat:
Restart a data node.
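
The report does not say how the restart was issued. As a rough sketch only, a rolling restart of the data nodes through the ndb_mgm management client might be planned like this. The node IDs (2 and 3) and the management connect string (mgmhost:1186) are assumptions for illustration and are not taken from the report; the script only builds and prints the commands, it does not run them.

```python
import shlex

# Hypothetical values: the report gives neither node IDs nor the management host.
MGM_CONNECTSTRING = "mgmhost:1186"
DATA_NODE_IDS = [2, 3]


def mgm_command(statement, connectstring=MGM_CONNECTSTRING):
    """Build an ndb_mgm invocation that executes one management statement."""
    return ["ndb_mgm", f"--ndb-connectstring={connectstring}", "-e", statement]


def restart_command(node_id, connectstring=MGM_CONNECTSTRING):
    """Command that asks the management server to restart one data node."""
    return mgm_command(f"{node_id} RESTART", connectstring)


def status_command(node_id, connectstring=MGM_CONNECTSTRING):
    """Command that reports the start phase or state of one node."""
    return mgm_command(f"{node_id} STATUS", connectstring)


def rolling_restart_plan(node_ids):
    """Restart one data node at a time, with a status check after each."""
    plan = []
    for node_id in node_ids:
        plan.append(restart_command(node_id))
        plan.append(status_command(node_id))
    return plan


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in rolling_restart_plan(DATA_NODE_IDS):
        print(shlex.join(argv))
```

In practice the status check would be repeated until the restarted node reports that it has started before moving on to the next node; the hang described here was observed on an sql node while a data node was still in its start phases.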

Suggested fix:
All 3 sql nodes (identically configured) should behave the same.
[20 Jul 2011 15:26] S McCarthy
Moved some apps to one of the working sql nodes. The selects that were hanging on the bad node work there (pdns, simple indexed queries), but the rails apps open with 'show tables', and that query hangs, staying in place long after the app itself has been shut down again and the connection closed. (Sometime after 20-30 minutes those connections disappeared, not sure exactly when.)
[21 Jul 2011 19:00] S McCarthy
Restarting the other data node resulted in the same sql node hanging.

Possibly unrelated, but the rails apps doing 'show tables' and 'show fields' are hanging similarly on the 'working' sql nodes (but only those queries; the rest go through ok). I wouldn't expect those to count as altering the schema, since they are read-only requests.
[15 Mar 2016 15:09] Bogdan Kecman

Do you still experience this problem?
This should not happen any more on any of the recent releases.
If you do experience this again, please attach the full ndb_error_reporter log.

kind regards
Bogdan Kecman
[15 Mar 2016 16:04] S McCarthy
After 5 years? Really?

We migrated those apps to a (much more performant) PostgreSQL cluster. Then I changed jobs. Twice. (And in both cases, migrated apps from MySQL over to PostgreSQL. We're finishing up that process at my new position now.)
[15 Mar 2016 16:13] Bogdan Kecman

Yes, after 5 years, sorry. That issue was solved a long time ago; I'm just cleaning up some old reports now.

take care