Bug #49230 | ndbmtd forced to restart while doing a GCP | ||
---|---|---|---|
Submitted: | 30 Nov 2009 19:30 | Modified: | 1 Jan 2010 9:27 |
Reporter: | Robert Klikics | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | mysql-5.1-telco-7.0 | OS: | Linux (Debian 5.0) |
Assigned to: | Andrew Hutchings | CPU Architecture: | Any |
Tags: | GCP, ndbmtd, telco-7.0.9b |
[30 Nov 2009 19:30]
Robert Klikics
[31 Dec 2009 18:38]
Andrew Hutchings
Hello Robert, This looks like a normal GCP stop error. Please check out the information on preventing GCP stop, entitled "Disk Data and GCP Stop errors", at the bottom of: http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-ndbd-definition.html Also note that your cluster is complaining about send buffer problems, and in fact at least one crash about 10 days before the GCP stop was due to this. Please consider increasing SendBufferMemory.
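A minimal sketch of where the settings mentioned above would sit in the cluster's `config.ini`. The values are illustrative placeholders, not recommendations from this report, and `TimeBetweenEpochsTimeout` is included as an assumption drawn from the linked GCP stop documentation rather than something named in this thread. Suitable values depend on the workload and hardware.

```ini
# Hypothetical fragment of config.ini on the management node.
# All values are placeholders for illustration only.

[tcp default]
# Send buffer used for connections between nodes; consider raising it
# if the cluster log reports send buffer problems.
SendBufferMemory=2M

[ndbd default]
# This section applies to both ndbd and ndbmtd data nodes.
# Timeout in milliseconds for epoch synchronization; a node that fails
# to participate in a global checkpoint within this time is shut down.
TimeBetweenEpochsTimeout=8000
```

Changes of this kind generally require the management server to reread the configuration, followed by a rolling restart of the data nodes.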
[1 Jan 2010 8:49]
Robert Klikics
Hi Andrew, thanks for your reply and hint. I've read the documentation, and it basically talks about GCP stop errors associated with Disk Data tables, which we're NOT using. We've also spoken to the Percona guys, who said that the send buffer values are all right now. Since we switched back to ndbd instead of ndbmtd, the cluster has been running "stable" for about 1 1/2 months. I know that multithreaded apps are hard to code, but IMHO ndbmtd is not ready for production use. Sincerely, Martin P.
[1 Jan 2010 9:16]
Andrew Hutchings
Most of that is still valid without disk tables, but unfortunately tuning your cluster is beyond the scope of a bug report. We do have customers running very large clusters using ndbmtd, and it is very stable for them; we are sorry this has not been your experience.
[1 Jan 2010 9:27]
Robert Klikics
Hi Andrew, this is not meant as a personal attack against you or the programmers. But how did the customers get this running stably? We've had a 2-day intensive training with the Percona guys; a MySQL engineer from Sun was here once, looked over the configuration and said it was OK; and we've tried so many configurations. But the longest that ndbmtd ran without a failure was about 2 weeks. So please tell me, is there a hidden switch or something else :-) ? BTW Happy New Year. Sincerely, Martin P.
[1 Jan 2010 9:59]
Andrew Hutchings
Hello Martin, In most cases it is a tuning effort which depends greatly on the application and hardware used. Unfortunately this is beyond the scope of a bug report, but I would be happy to discuss this if you contact me directly or use the cluster mailing list / forum; alternatively, our Professional Services team will be able to help you out. Happy new year to you too! :)