Bug #78530 Fix scheduler to avoid too long runs in large data nodes without sending
Submitted: 23 Sep 2015 13:31
Modified: 5 Jan 2016 15:59
Reporter: Mikael Ronström
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 7.4.7
OS: Any
Assigned to:
CPU Architecture: Any

[23 Sep 2015 13:31] Mikael Ronström
Description:
In ndbmtd we only check whether signals need to be sent after a full turn in
run_job_buffers, which loops over all job buffer inputs. This means that we can
potentially execute 75 * (number of buffers) signals before checking, so roughly
1000-2000 signals. As a result we can execute for up to 4-5 milliseconds without
sending, which obviously has a very bad impact on response time and, more
particularly, makes the nodes behave too much like batch jobs.
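
The worst case above is simple arithmetic, sketched here in C++. This is only an illustration: it assumes the figure 75 is a per-buffer quota of signals per turn, and the function and constant names are hypothetical, not taken from the ndbmtd source.

```cpp
#include <cassert>
#include <cstdint>

// Assumed per-buffer quota per turn; the value 75 comes from the report,
// the name is hypothetical.
const std::uint32_t kSignalsPerBufferPerTurn = 75;

// Upper bound on signals executed in one full turn over all job buffers,
// i.e. between two consecutive send checks in the behaviour described above.
inline std::uint32_t max_signals_between_send_checks(std::uint32_t num_job_buffers) {
    assert(num_job_buffers > 0);
    return kSignalsPerBufferPerTurn * num_job_buffers;
}

// Smallest number of job buffers whose full turn can reach `signals` signals
// (ceiling division by the quota).
inline std::uint32_t buffers_needed_for(std::uint32_t signals) {
    assert(signals > 0);
    return (signals + kSignalsPerBufferPerTurn - 1) / kSignalsPerBufferPerTurn;
}
```

Under that assumption, max_signals_between_send_checks(20) gives 1500, and the quoted range of 1000-2000 signals corresponds to roughly 14 to 27 job buffers.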

How to repeat:
Not really repeatable, but found through code review.

Suggested fix:
Handle sending and other flushing activities as part of the run_job_buffers routine.
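
A toy C++ model of the idea, comparing a flush after the full turn with a flush inside the loop. It is a sketch only: ToyScheduler, flush_sends, run_one_buffer and worst_case_unsent are hypothetical names, and flushing after every buffer is an assumption, since the report does not say how often the real fix flushes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model; none of these names come from the ndbmtd source.
struct ToyScheduler {
    std::vector<int> pending;   // signals waiting in each job buffer
    int per_buffer_quota = 75;  // figure quoted in the description
    int unsent = 0;             // signals executed since the last flush
    int worst_unsent = 0;       // largest value `unsent` ever reached
    int flushes = 0;            // number of flushes that had work to do

    // Stand-in for sending and other flushing activities.
    void flush_sends() {
        if (unsent == 0) return;
        unsent = 0;
        ++flushes;
    }

    // Executes at most per_buffer_quota signals from buffer i.
    void run_one_buffer(std::size_t i) {
        int n = std::min(pending[i], per_buffer_quota);
        pending[i] -= n;
        unsent += n;
        worst_unsent = std::max(worst_unsent, unsent);
    }

    // One full turn over all buffers. With flush_inside == false the flush
    // happens only after the turn (behaviour described in the report); with
    // flush_inside == true it also happens after every buffer (assumed fix).
    void run_job_buffers(bool flush_inside) {
        for (std::size_t i = 0; i < pending.size(); ++i) {
            run_one_buffer(i);
            if (flush_inside) flush_sends();
        }
        flush_sends();
    }
};

// Runs a single turn over `buffers` well-filled buffers and returns the
// largest number of signals executed without an intervening flush.
inline int worst_case_unsent(std::size_t buffers, bool flush_inside) {
    assert(buffers > 0);
    ToyScheduler s;
    s.pending.assign(buffers, 1000);
    s.run_job_buffers(flush_inside);
    return s.worst_unsent;
}
```

In this model, worst_case_unsent(20, false) is 1500 signals while worst_case_unsent(20, true) is 75, which is the kind of bound on time without sending that the suggestion aims for.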

[5 Jan 2016 15:59] Jon Stephens
Documented fix as follows in the NDB 7.4.9 changelog:

    ndbmtd checked for signals being sent only after a full cycle in
    run_job_buffers, which is performed for all job buffer inputs. Now this
    is done as part of run_job_buffers itself, avoiding use of potential
    execution time when there are no signals.

Closed.