Bug #79285 Undocumented negative low estimates in p_s.memory_summary_global_by_event_name
Submitted: 15 Nov 2015 21:38    Modified: 30 Sep 2020 18:18
Reporter: Szymon Komendera
Status: Closed
Category: MySQL Server: Performance Schema    Severity: S4 (Feature request)
Version: 5.7, 5.7.9    OS: Any
Assigned to:    CPU Architecture: Any

[15 Nov 2015 21:38] Szymon Komendera
Description:
## BACKGROUND:

The Performance Schema documentation contains the following statement regarding low-estimate values in the memory summary tables:

"
For lower estimates in summary tables other than memory_summary_global_by_event_name, it is possible for values to go negative if memory ownership is transferred between threads.
"

Reference: https://dev.mysql.com/doc/refman/5.7/en/memory-summary-tables.html

The above suggests that "memory_summary_global_by_event_name" should not contain negative values for low estimates; at the very least, it does not explain why such values may be present in that table.
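For reference, the low estimates in question are the LOW_COUNT_USED and LOW_NUMBER_OF_BYTES_USED columns of that table. A query along the following lines (a sketch; the event name is only an example) shows all of the low/current/high watermarks for a single event:

-- Sketch: show the count and byte watermarks for one memory event.
-- The event name below is only an example.
SELECT event_name,
       low_count_used, current_count_used, high_count_used,
       low_number_of_bytes_used, current_number_of_bytes_used, high_number_of_bytes_used
FROM performance_schema.memory_summary_global_by_event_name
WHERE event_name = 'memory/innodb/mem0mem'\G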

## ISSUE:

Table "memory_summary_global_by_event_name" reports negative low estimates for internal memory structures. See below for the observed behavior of these counters.

How to repeat:
## GENERAL DESCRIPTION:

1. Launch MySQL 5.7.9 with performance_schema enabled.
2. Enable memory instrumentation and confirm there are no negative values in the table (yet).
3. Generate some traffic against internal/user tables to cause memory allocation/deallocation in internal MySQL structures.
4. While generating traffic, continuously check the counters and observe negative values appearing in the table.

## DEMO:

*** SESSION 1 ***

mysql> status
--------------
mysql  Ver 14.14 Distrib 5.7.9, for Linux (x86_64) using  EditLine wrapper

(cut)
Server version:		5.7.9 Source distribution
Protocol version:	10
Connection:		127.0.0.1 via TCP/IP
(cut)
Uptime:			10 sec

Threads: 1  Questions: 96  Slow queries: 0  Opens: 192  Flush tables: 1  Open tables: 185  Queries per second avg: 9.600
--------------

mysql> update performance_schema.setup_instruments set enabled = 'yes' where name like 'memory/%';
Query OK, 305 rows affected (0.00 sec)
Rows matched: 375  Changed: 305  Warnings: 0

mysql> select * from performance_schema.memory_summary_global_by_event_name where low_count_used < 0\G
Empty set (0.00 sec)

*** SESSION 2 ***

mysql@sandbox:~$ mysql -h127.0.0.1 -P5709 -uroot -ppass
(cut)

mysql> select count(*) from information_schema.tables where table_schema = 'repro';
+----------+
| count(*) |
+----------+
|    20000 |
+----------+
1 row in set (0.08 sec)

Tables in schema "repro" are empty InnoDB tables with two INT columns, named table_1 through table_20000.
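(The scripts that created this schema were not attached to the report; the following is only a rough sketch of how such a schema could be recreated, with illustrative column names.)

-- Rough sketch for recreating the "repro" schema: many empty InnoDB tables
-- with two INT columns each. Table count and column names are illustrative.
CREATE DATABASE IF NOT EXISTS repro;

DELIMITER $$
CREATE PROCEDURE repro.create_repro_tables()
BEGIN
  DECLARE i INT DEFAULT 1;
  WHILE i <= 20000 DO
    SET @ddl = CONCAT('CREATE TABLE IF NOT EXISTS repro.table_', i,
                      ' (c1 INT, c2 INT) ENGINE=InnoDB');
    PREPARE stmt FROM @ddl;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

CALL repro.create_repro_tables();
DROP PROCEDURE repro.create_repro_tables;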

mysql> quit

(start running scripts to generate dummy traffic)

mysql@sandbox:~$ for i in `seq 1 10000`; do mysql -h127.0.0.1 -P5709 -uroot -ppass -e "select count(*) from repro.table_$i;" > /dev/null 2>&1; done;
(waited for script completion)
mysql@sandbox:~$ for i in `seq 1 10000`; do mysql -h127.0.0.1 -P5709 -uroot -ppass -e "select count(*) from repro.table_$i;" > /dev/null 2>&1; done;
(waited for script completion)
mysql@sandbox:~$ for i in `seq 1 10000`; do mysql -h127.0.0.1 -P5709 -uroot -ppass -e "select count(*) from repro.table_$i;" > /dev/null 2>&1; done;
(terminated during execution)

*** SESSION 1 ***

While the scripts are running, pull data from performance_schema.memory_summary_global_by_event_name. In this case I'm pulling selected values for memory/innodb/mem0mem so that the result is more readable; however, similar behavior can be seen for other memory/innodb/% events.

mysql@sandbox:~$ while true; do mysql -B --skip-column-names -h127.0.0.1 -P5709 -uroot -ppass -e "select low_count_used, current_count_used, high_count_used from performance_schema.memory_summary_global_by_event_name where event_name = 'memory/innodb/mem0mem';" 2>/dev/null && sleep 1; done;
0	0	0
0	0	0
0	0	0
0	1050	1055 <== this is where the 1st bash loop started
0	4935	4940
0	8790	8795
0	12645	12650
0	16515	16520
0	20385	20390
0	24365	24365
0	28306	28310
0	32053	32058
0	35734	35740
0	39401	39408
0	43083	43089
0	46779	46786
0	50475	50482
0	54171	54178
0	57853	57860
0	61521	61528
0	65217	65223
0	68915	68920
0	72597	72603
0	76280	76286
0	79990	79997
0	83742	83748
0	87424	87430
0	91119	91126
0	94800	94807
0	98498	98502
0	102166	102170
0	105846	105852
0	109515	109521
0	113197	113204
0	116851	116858
0	120518	120525
0	124186	124193
0	127858	127860
0	131591	131597
0	135302	135308
0	139026	139033
0	141993	142000
0	141992	142000
0	141991	142000
0	141991	142000
-112126	29864	142000 <== this is where the 1st bash loop completed
-112126	29864	142000
-112140	29849	142000
-112140	29848	142000
-112140	33362	145508
-112140	37086	149233
-112140	40795	152942
-112140	44520	156666
-112140	48229	160376
-132384	31612	164002
-132384	35337	167727
-132384	39080	171470
-132384	42832	175223
-132384	46528	178919
-132384	50238	182628
-132384	53948	186339
-132384	57671	190062
-132384	61423	193814
-132384	65161	197552
-132384	68870	201261
-132384	72594	204985
-132384	76389	208779
-132384	80141	212532
-132384	83893	216284
-132384	87646	220036
-132384	91439	223830
-132384	95206	227596
-132384	98961	231348
-132384	102681	235071
-132384	106419	238810
-132384	110171	242561
-132384	113966	246356
-132384	117717	250108
-132384	121455	253846
-132384	125207	257598
-132384	128946	261336
-132384	132684	265075
-132384	136408	268798
-132384	140187	272578
-132384	141994	274384
-132384	141996	274385
-132384	141993	274385
-132384	141992	274385
-132384	141992	274385
-244496	29880	274385 <== this is where the 2nd bash loop completed
-244496	29879	274385
-244496	29878	274385
-244496	29878	274385
-244524	30801	275332
-244524	34553	279084
-244524	38291	282821
-244524	42057	286587
-244524	45852	290382
-244524	49618	294148
-244524	53314	297845
-269696	31810	301512
-269696	35565	305268
-269696	35984	305687 <== this is where the 3rd bash loop was interrupted
^C

## OBSERVATIONS

1. We have negative values in the table, which does not appear to be documented behavior.
2. The value of "high_count_used" (3rd column) grows continuously and is roughly equal to "current_count_used" minus "low_count_used" (i.e., the sum of "current_count_used" and the magnitude of the negative "low_count_used"). This should not happen: per the documentation, "high_count_used" should correspond to the maximum value reached by "current_count_used", which the output above shows is not the case. Here, "current_count_used" drops while "high_count_used" keeps increasing; see the query sketch below.

Suggested fix:
Document the behavior and meaning of the low/current/high counters in the aforementioned performance_schema table.
[18 Sep 2019 17:39] Vinodh Krish
Hi Team,
We still observe this behaviour. Does this need a documentation fix, or is it a bug? Please check and update.
[25 Sep 2019 19:58] Trey Raymond
The values provided are definitely not what is intended, so a root-cause fix would be the right answer, but any information about it that you can provide in this ticket will help people with temporary workarounds.
I can confirm this is still an issue in 5.7.26.
[30 Sep 2020 18:18] Paul DuBois
Posted by developer:
 
Fixed in 8.0.23.

For the Performance Schema memory_summary_global_by_event_name table,
the low watermark columns could have negative values, and the high
watermark columns had ever-increasing values even when the server
memory usage did not increase.
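
For anyone verifying the fix after upgrading, a quick sanity check along these lines (assuming the same column names) should return no rows once the server has been running under load:

-- Sketch: after the fix, no event should report negative low watermarks.
SELECT event_name, low_count_used, low_number_of_bytes_used
FROM performance_schema.memory_summary_global_by_event_name
WHERE low_count_used < 0
   OR low_number_of_bytes_used < 0;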