Bug #61081 False Positives for excessive Number of Long Running Processes
Submitted: 6 May 2011 14:34 Modified: 9 Jan 2015 10:22
Reporter: Daniël van Eeden Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Enterprise Monitor: Advisors/Rules Severity:S2 (Serious)
Version:2.3.4 OS:Any
Assigned to: CPU Architecture:Any

[6 May 2011 14:34] Daniël van Eeden
Description:
The "Excessive Number of Long Running Processes (v 1.1)" rule will give false positives if the server is a slave. 

How to repeat:
Use MySQL 5.5.8 as slave.

The query for long_running_processes in share/mysql-monitor-agent/items/items-mysql-monitor.xml is this:
SELECT COUNT(*) AS num_long_running, @@long_query_time AS long_running_time FROM information_schema.processlist WHERE time > @@long_query_time AND command != 'Sleep';

mysql> SELECT * FROM information_schema.processlist WHERE time > @@long_query_time AND command != 'Sleep'\G 
*************************** 1. row ***************************
     ID: 4
   USER: system user
   HOST: 
     DB: NULL
COMMAND: Connect
   TIME: 16105
  STATE: Waiting for master to send event
   INFO: NULL
1 row in set (0.00 sec)

Suggested fix:
Ignore system user.

SELECT COUNT(*) AS num_long_running, @@long_query_time AS long_running_time FROM information_schema.processlist WHERE time > @@long_query_time AND command != 'Sleep' AND USER<>'system user';
[7 Jun 2011 12:48] Daniël van Eeden
The query was updated from:
SELECT COUNT(*) AS num_long_running, @@long_query_time AS long_running_time FROM
information_schema.processlist WHERE time > @@long_query_time AND command != 'Sleep';

To:
SELECT COUNT(*) AS num_long_running, @@long_query_time AS long_running_time FROM information_schema.processlist WHERE time > @@long_query_time AND command NOT IN ('Sleep', 'Waiting for master to send event', 'Has read all relay log; waiting for the slave I/O thread to update it')

But the correct one is:
SELECT COUNT(*) AS num_long_running, @@long_query_time AS long_running_time FROM information_schema.processlist WHERE time > @@long_query_time AND state NOT IN ('Waiting for master to send event', 'Slave has read all relay log; waiting for the slave I/O thread t') AND command<>'Sleep';

1. 'Waiting for master to send event', 'Has read all relay log; waiting for the slave I/O thread to update it' are never in the command column. They are in the state column.

2. 'Has read all relay log; waiting for the slave I/O thread to update it' will be truncated to 'Slave has read all relay log; waiting for the slave I/O thread t' (at least in 5.5.8)