Bug #49142 ndbmtd dies while starting phase 5 after drop_caches
Submitted: 26 Nov 2009 17:28 Modified: 30 Dec 2009 20:38
Reporter: Robert Klikics Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S1 (Critical)
Version:mysql-5.1-telco-7.0 OS:Linux (Debian Etch)
Assigned to: Gustaf Thorslund CPU Architecture:Any
Tags: drop_caches, ndbmtd, telco-7.0.9b

[26 Nov 2009 17:28] Robert Klikics
Description:
We're using the multithreaded ndbd. While we were doing a rolling restart after changing a configuration parameter (CompressedLCP from 1 to 0), one of our ndb nodes ran low on memory because of heavy filesystem and buffer cache usage (I know this is normally irrelevant, but I use this feature to clear filesystem caches, especially after InnoDB corrupt-page failures). So I initiated a drop of all filesystem caches (via echo 3 > /proc/sys/vm/drop_caches). After doing this, ndbmtd dies with the error message:

2009-11-26 17:16:47 [MgmtSrvr] WARNING  -- Node 5: Detected GCP stop(3)...sending kill to [SignalCounter: m_count=1 0000000000000010]
2009-11-26 17:16:50 [MgmtSrvr] ALERT    -- Node 4: Forced node shutdown completed. Occured during startphase 5. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, rest

A ndb_error_reporter report, which is taken after this failure, is located here: http://85.25.144.101/files/ndb_error_report_20091126180720.tar.bz2

Sincerely
Martin P.

How to repeat:
Restart an ndb node (rolling restart) and drop all filesystem caches via echo 3 > /proc/sys/vm/drop_caches while it is in start phase 5.
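The reproduction steps above can be sketched as a small shell script. The node id (4) and the ndb_mgm invocation are illustrative only; writing to /proc/sys/vm/drop_caches requires root, so the sketch checks for write access first:

```shell
#!/bin/sh
# Reproduction sketch. Step 1: restart one data node as part of a
# rolling restart (hypothetical node id 4; adjust to your cluster):
#   ndb_mgm -e "4 RESTART"
# Step 2: while that node is in start phase 5, drop all filesystem
# caches. Flush dirty pages first, then write 3 to drop page cache,
# dentries and inodes.
sync
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 3 > /proc/sys/vm/drop_caches
    echo "caches dropped"
else
    echo "need root to write /proc/sys/vm/drop_caches"
fi
```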
[30 Nov 2009 17:11] Hartmut Holzgraefe
Raising TimeBetweenEpochsTimeout might help here. In general it is not a good idea to cause extra disk I/O, especially while a restart is going on, as the extra I/O activity may lead to global checkpoint flushes not finishing in time, causing "GCP stop" node failures ...
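For reference, this parameter is set in the [ndbd default] section of the cluster's config.ini and takes effect after a rolling restart of the data nodes; the value below is only an illustrative example, not a recommendation:

```ini
[ndbd default]
# Time (in ms) the cluster waits for an epoch before declaring a
# GCP stop; raising it gives slow checkpoint flushes more headroom.
TimeBetweenEpochsTimeout = 8000
```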
[30 Nov 2009 17:37] Robert Klikics
Hi Mr. Holzgraefe,

thanks for your advice, but if I understood TimeBetweenEpochsTimeout correctly, it is used for MySQL Cluster replication, which we do not use.

Sincerely
Martin
[7 Dec 2009 12:36] Hartmut Holzgraefe
The TimeBetweenEpochsTimeout parameter was added to improve replication handling, but it is always in effect, even when no transaction log is written by any mysqld node in the cluster.

So raising this timeout parameter *may* help to work around the problem.

The main problem is still the extra I/O activity caused by the cache purge. The relaxed timeout setting would help, though, *if* it is this timeout you are running into, which the trace logs seem to indicate.
[7 Dec 2009 14:36] Robert Klikics
Dear Mr. Holzgraefe,

what can I say. We've dropped the CompressedLCP option, because we had the feeling that the cluster runs unstably with this option in conjunction with ndbmtd. The cluster's load average has also been about 5x higher since we activated CompressedLCP.

We left the TimeBetween* values untouched, because we will not drop the caches again during the startup phase. Is there any documentation on how to read the trace files?

Thanks in advance
Martin P.
[30 Dec 2009 20:38] Andrew Hutchings
Hello Robert,

CompressedLCP incurs a high CPU overhead (to perform the compression) and in tests it actually makes the LCP run slower.

There is no written documentation on how to read the trace files, and it takes a long time to teach. You also need a good understanding of the source to be able to interpret them.