Bug #74594 Restarts blocked by LCP to copy meta data
Submitted: 27 Oct 2014 19:04
Modified: 23 Dec 2014 11:16
Reporter: Mikael Ronström
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 7.4.2
OS: Any
Assigned to:
CPU Architecture: Any

[27 Oct 2014 19:04] Mikael Ronström
Description:
When performing a node restart, we get blocked waiting for LCPs to copy meta data.
This makes node restart times vary widely, which in turn makes it
very hard to tell what kind of problem we have with node restarts.

How to repeat:
Start 8 nodes, fill them with 20 GB of data. Restart 4 of them
and study restart times and where time is spent. Great variation
comes from being blocked by LCPs to copy meta data.

Suggested fix:
Ensure that we can quickly block LCP reporting for a time
by introducing signals to pause LCP execution and flush LCP reports.

[23 Dec 2014 11:16] Jon Stephens
Thank you for your bug report. This issue has already been fixed in the latest released version of that product, which you can download at

  http://www.mysql.com/downloads/

Documented fix in the NDB 7.4.3 changelog, as follows:

    Copying of metadata during local checkpoints caused node restart
    times to be highly variable which could make it difficult to
    diagnose problems with restarts. The fix for this issue
    introduces signals (including PAUSE_LCP_IDLE,
    PAUSE_LCP_REQUESTED, and PAUSE_NOT_IN_LCP_COPY_META_DATA) to
    pause LCP execution and flush LCP reports, making it possible to
    block LCP reporting at times when LCPs during restarts become
    stalled in this fashion.
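
The pause-and-flush idea described in the changelog can be sketched roughly as follows. This is an illustration only, not the NDB implementation: the class LcpReportGate and all of its methods are hypothetical, the three state names are taken from the changelog text, and the meaning attached to each state here is an assumption based on its name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// State names come from the changelog; the semantics below are assumed.
enum class PauseState {
  PAUSE_LCP_IDLE,                  // assumed: no pause active, reports flow
  PAUSE_LCP_REQUESTED,             // assumed: pause asked for, not yet confirmed
  PAUSE_NOT_IN_LCP_COPY_META_DATA  // assumed: paused while meta data is copied
};

// Hypothetical gate that holds back LCP reports while a pause is active.
class LcpReportGate {
public:
  // Returns true if the report is delivered at once, false if it is held.
  bool report(std::uint32_t fragmentId) {
    if (state_ == PauseState::PAUSE_LCP_IDLE) {
      delivered_.push_back(fragmentId);
      return true;
    }
    held_.push_back(fragmentId);
    return false;
  }

  // A restarting node asks for LCP reporting to be paused.
  void requestPause() {
    assert(state_ == PauseState::PAUSE_LCP_IDLE);
    state_ = PauseState::PAUSE_LCP_REQUESTED;
  }

  // The pause is confirmed; the meta data copy may now proceed.
  void confirmPause() {
    assert(state_ == PauseState::PAUSE_LCP_REQUESTED);
    state_ = PauseState::PAUSE_NOT_IN_LCP_COPY_META_DATA;
  }

  // Flushes held reports in arrival order and resumes normal reporting.
  // Returns the number of reports flushed.
  std::size_t unpause() {
    const std::size_t flushed = held_.size();
    delivered_.insert(delivered_.end(), held_.begin(), held_.end());
    held_.clear();
    state_ = PauseState::PAUSE_LCP_IDLE;
    return flushed;
  }

  PauseState state() const { return state_; }
  const std::vector<std::uint32_t>& delivered() const { return delivered_; }

private:
  PauseState state_ = PauseState::PAUSE_LCP_IDLE;
  std::vector<std::uint32_t> delivered_;  // reports passed on to the receiver
  std::vector<std::uint32_t> held_;       // reports held back during a pause
};
```

In this sketch a restart calls requestPause() and confirmPause() before copying meta data, then unpause() afterwards. Reports that arrive in between are held rather than lost, so the copy is bounded by the pause handshake instead of by the length of a running LCP.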
  
Closed.