MySQL Bugs: #82627: Out of / Reenable event buffer messages flooding the cluster log too fast

Bug #82627	Out of / Reenable event buffer messages flooding the cluster log too fast
Submitted:	18 Aug 2016 9:15	Modified:	22 Aug 2016 15:03
Reporter:	Hartmut Holzgraefe
Status:	Verified
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	mysql-cluster-7.4.11	OS:	Linux
Assigned to:		CPU Architecture:	Any

Description:
When running into Bug #82394, the "Out of event buffer" and "Reenable event buffer" messages are printed about ten times per second per data node.

On a two-node cluster this produces log messages so fast that the log files not yet purged by rotation only cover the last 15-20 minutes.

IMHO reenabling the event buffer should only happen if the buffer usage has fallen below a certain low water mark, and not simply on every new epoch.

How to repeat:
Not sure how to trigger this yet; Bug #82394 has more information on the incident.

The cluster logs attached to that bug clearly show the log flood though.

The reenable logic right now is just

  if(m_out_of_buffer_gci && gci > m_out_of_buffer_gci)
  {
    jam();
    infoEvent("Reenable event buffer");
    m_out_of_buffer_gci = 0;
    m_missing_data = false;
  }
(src/kernel/blocks/suma/Suma.cpp)

so if the event buffer is still full at this point in time it will trigger a new "Out of event buffer" message almost immediately.

Suggested fix:
Do not reenable the event buffer unless enough free space is available in it again.
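
To illustrate the low water mark idea, here is a minimal standalone sketch. It is not NDB code: `EventBufferGate`, `on_epoch`, `count_log_messages`, and the `low_water` / `high_water` thresholds are all hypothetical names made up for this example, and the buffer usage is simply passed in as a number per epoch. Setting both thresholds to the same value roughly mimics the flapping described above, while a lower re-enable threshold keeps the state (and the log) quiet while the buffer stays full.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the event buffer state; not the real Suma class.
struct EventBufferGate {
  uint64_t low_water;   // assumed threshold: re-enable only at or below this usage
  uint64_t high_water;  // assumed threshold: report "out of buffer" at or above this usage
  bool out_of_buffer = false;
  unsigned log_messages = 0;  // counts messages that would go to the cluster log

  EventBufferGate(uint64_t low, uint64_t high)
      : low_water(low), high_water(high) {
    assert(low <= high);  // the re-enable threshold must not exceed the disable one
  }

  // Called once per epoch with the current buffer usage.
  void on_epoch(uint64_t used) {
    if (!out_of_buffer && used >= high_water) {
      out_of_buffer = true;
      ++log_messages;  // would log "Out of event buffer"
    } else if (out_of_buffer && used <= low_water) {
      out_of_buffer = false;
      ++log_messages;  // would log "Reenable event buffer"
    }
  }
};

// Feeds one usage sample per epoch and returns how many messages were logged.
unsigned count_log_messages(uint64_t low, uint64_t high,
                            const std::vector<uint64_t>& usage_per_epoch) {
  EventBufferGate gate(low, high);
  for (uint64_t used : usage_per_epoch) {
    gate.on_epoch(used);
  }
  return gate.log_messages;
}
```

With a buffer that stays full for six epochs, `count_log_messages(100, 100, ...)` logs a message on every epoch, whereas `count_log_messages(50, 100, ...)` logs only the initial "Out of event buffer" until usage actually drops to the low water mark.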

Added "too fast" to the synopsis

Hi Hartmut,

yes, I agree with you, filling the log like this helps no one :(

Thanks for the report
Bogdan