MySQL Bugs: #78530: Fix scheduler to avoid too long runs in large data nodes without sending

Bug #78530	Fix scheduler to avoid too long runs in large data nodes without sending
Submitted:	23 Sep 2015 13:31	Modified:	5 Jan 2016 15:59
Reporter:	Mikael Ronström
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	7.4.7	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
In ndbmtd, we only check whether signals need to be sent after a full turn in
run_job_buffers, which loops over all job buffer inputs. This means that we can
potentially execute up to 75 * (number of job buffers) signals per turn, roughly
1000-2000 signals. As a result, we can execute for up to 4-5 milliseconds without
sending, which obviously has a very bad impact on response time and, more
particularly, makes the nodes behave too much like batch jobs.

How to repeat:
Not really repeatable; found through code review.

Suggested fix:
Handle sending and other flushing activities as part of the run_job_buffers routine,
rather than only after a full turn over all job buffer inputs.
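
The difference between the two approaches can be illustrated with a small,
self-contained C++ model. This is only a sketch and not the actual NDB code:
the Scheduler struct, both function names, flush_send and the send_threshold
parameter are hypothetical, and only the quota of 75 signals per job buffer is
taken from the description above. The model counts how many signals are
executed between two sends.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Per-buffer quota mentioned in the report; everything else is hypothetical.
constexpr std::size_t kMaxSignalsPerBuffer = 75;

struct Scheduler {
    std::vector<std::size_t> pending;  // pending signals per job buffer input
    std::size_t since_last_send = 0;   // signals executed since the last send
    std::size_t worst_gap = 0;         // longest run of signals without sending
    std::size_t sends = 0;             // number of send/flush calls

    void execute_signal() { ++since_last_send; }

    // Stand-in for sending and other flushing activities.
    void flush_send() {
        worst_gap = std::max(worst_gap, since_last_send);
        since_last_send = 0;
        ++sends;
    }
};

// Behaviour described in the report: send only after a full turn.
void run_job_buffers_send_after_turn(Scheduler& s) {
    for (std::size_t& p : s.pending) {
        const std::size_t n = std::min(p, kMaxSignalsPerBuffer);
        for (std::size_t i = 0; i < n; ++i) s.execute_signal();
        p -= n;
    }
    s.flush_send();
}

// Sketch of the suggested fix: flush from inside the loop once an
// assumed threshold of executed signals has been reached.
void run_job_buffers_send_inline(Scheduler& s, std::size_t send_threshold) {
    for (std::size_t& p : s.pending) {
        const std::size_t n = std::min(p, kMaxSignalsPerBuffer);
        for (std::size_t i = 0; i < n; ++i) {
            s.execute_signal();
            if (s.since_last_send >= send_threshold) s.flush_send();
        }
        p -= n;
    }
    if (s.since_last_send > 0) s.flush_send();  // flush any remainder
}
```

With 20 job buffers holding 100 pending signals each, the first variant runs
1500 signals before its single send, while the second, with an assumed
threshold of 75, never runs more than 75 signals between sends. The threshold
actually used by the fix is not stated in this report.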

Documented fix as follows in the NDB 7.4.9 changelog:

    ndbmtd checked for signals being sent only after a full cycle in
    run_job_buffers, which is performed for all job buffer inputs. Now this
    is done as part of run_job_buffers itself, avoiding use of potential
    execution time when there are no signals.

Closed.