Bug #20859 | ndbd node fails to recover | ||
---|---|---|---|
Submitted: | 5 Jul 2006 6:10 | Modified: | 7 Aug 2006 14:18 |
Reporter: | David Abbott | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
Version: | 5.0.22 | OS: | Linux (Red Hat Linux ES4) |
Assigned to: | | CPU Architecture: | Any |
[5 Jul 2006 6:10]
David Abbott
[5 Jul 2006 6:11]
David Abbott
ndbd and mysql logs
Attachment: ndblogs.tgz (application/octet-stream, text), 78.62 KiB.
[5 Jul 2006 6:13]
David Abbott
cluster config.ini
Attachment: config.ini (application/octet-stream, text), 767 bytes.
[5 Jul 2006 6:29]
David Abbott
cluster log file excerpt
Attachment: ndb_1_cluster.log (application/octet-stream, text), 22.57 KiB.
[5 Jul 2006 12:50]
MySQL Verification Team
Changing Category to Cluster.
[6 Jul 2006 13:37]
Jonas Oreland
The cluster log (and error log) indicate heartbeat failures. This often points to very high load on CPU, disk, memory, or network. Can you check whether 1) there is any swapping going on (using vmstat or similar), 2) the ndbd host machines are under very high load (using top/vmstat or similar), 3) there has been any peak in load on the machines, for example a weekly filesystem backup that consumes so much memory or disk bandwidth that it might have locked ndbd out? Otherwise, can you identify any pattern on mysqld when this occurs? (Judging from the log, though, it seems almost idle...)
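For anyone repeating these checks, here is a minimal sketch in Python of the swapping and I/O-wait test. It assumes Linux `vmstat` output whose header row names the `si`, `so` and `wa` columns. The function names and the 20% I/O-wait threshold are illustrative assumptions, not part of any NDB or MySQL tooling.

```python
import subprocess


def parse_vmstat(text):
    """Parse vmstat text output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The column-name row is the one that contains both 'si' and 'so'.
    header_idx = next(
        i for i, ln in enumerate(lines)
        if "si" in ln.split() and "so" in ln.split()
    )
    cols = lines[header_idx].split()
    rows = []
    for ln in lines[header_idx + 1:]:
        fields = ln.split()
        # Skip repeated headers or anything that is not a full numeric row.
        if len(fields) != len(cols) or not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(cols, map(int, fields))))
    return rows


def find_pressure(rows, iowait_limit=20):
    """Return (row_index, reasons) for samples that show swapping or high I/O wait.

    iowait_limit is an assumed threshold in percent; tune it for your hosts.
    """
    flagged = []
    for i, row in enumerate(rows):
        reasons = []
        if row.get("si", 0) > 0 or row.get("so", 0) > 0:
            reasons.append("swapping")
        if row.get("wa", 0) >= iowait_limit:
            reasons.append("high iowait")
        if reasons:
            flagged.append((i, reasons))
    return flagged


def sample_vmstat(interval=5, count=6):
    """Run `vmstat interval count` and return its raw text output."""
    result = subprocess.run(
        ["vmstat", str(interval), str(count)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Something like `find_pressure(parse_vmstat(sample_vmstat()))` could be run on each ndbd host around the time the heartbeat failures appear. The first sample `vmstat` prints is an average since boot, so it is usually safe to ignore.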
[6 Jul 2006 14:43]
David Abbott
Lo and behold, further investigation has revealed a disk problem. There did seem to be slow-downs on disk I/O, and now that we've tested the server in more depth, disk errors are being generated. Possibly a need for slightly better error messages?