Bug #108288 HIGH_NUMBER_OF_BYTES_USED is incorrect
Submitted: 25 Aug 2022 22:22 Modified: 26 Aug 2022 13:08
Reporter: Philip Gladstone Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.29 OS:Any
Assigned to: CPU Architecture:Any

[25 Aug 2022 22:22] Philip Gladstone
Description:
The HIGH_NUMBER_OF_BYTES_USED column in performance_schema.memory_summary_global_by_event_name does not appear to be correct. For example, this is the output from two identical queries fairly close together in time (in the same session):

```
mysql> SELECT *        FROM performance_schema.memory_summary_global_by_event_name where HIGH_NUMBER_OF_BYTES_USED > 1000000000 and EVENT_NAME like '%openssl%';
+--------------------------------------+-------------+------------+---------------------------+--------------------------+----------------+--------------------+-----------------+--------------------------+------------------------------+---------------------------+
| EVENT_NAME                           | COUNT_ALLOC | COUNT_FREE | SUM_NUMBER_OF_BYTES_ALLOC | SUM_NUMBER_OF_BYTES_FREE | LOW_COUNT_USED | CURRENT_COUNT_USED | HIGH_COUNT_USED | LOW_NUMBER_OF_BYTES_USED | CURRENT_NUMBER_OF_BYTES_USED | HIGH_NUMBER_OF_BYTES_USED |
+--------------------------------------+-------------+------------+---------------------------+--------------------------+----------------+--------------------+-----------------+--------------------------+------------------------------+---------------------------+
| memory/mysqld_openssl/openssl_malloc |   435092455 |  435069816 |               46684243417 |              46665167493 |              0 |              22639 |         3466074 |                        0 |                     19075924 |                4088561393 |
+--------------------------------------+-------------+------------+---------------------------+--------------------------+----------------+--------------------+-----------------+--------------------------+------------------------------+---------------------------+
1 row in set (0.06 sec)

mysql> SELECT *        FROM performance_schema.memory_summary_global_by_event_name where HIGH_NUMBER_OF_BYTES_USED > 1000000000 and EVENT_NAME like '%openssl%';
+--------------------------------------+-------------+------------+---------------------------+--------------------------+----------------+--------------------+-----------------+--------------------------+------------------------------+---------------------------+
| EVENT_NAME                           | COUNT_ALLOC | COUNT_FREE | SUM_NUMBER_OF_BYTES_ALLOC | SUM_NUMBER_OF_BYTES_FREE | LOW_COUNT_USED | CURRENT_COUNT_USED | HIGH_COUNT_USED | LOW_NUMBER_OF_BYTES_USED | CURRENT_NUMBER_OF_BYTES_USED | HIGH_NUMBER_OF_BYTES_USED |
+--------------------------------------+-------------+------------+---------------------------+--------------------------+----------------+--------------------+-----------------+--------------------------+------------------------------+---------------------------+
| memory/mysqld_openssl/openssl_malloc |   435148837 |  435125910 |               46690592744 |              46671143556 |              0 |              22927 |         3466780 |                        0 |                     19449188 |                4089464053 |
+--------------------------------------+-------------+------------+---------------------------+--------------------------+----------------+--------------------+-----------------+--------------------------+------------------------------+---------------------------+
1 row in set (0.05 sec)

```

The second query shows that the HIGH_NUMBER_OF_BYTES_USED has increased since the first query. This implies that the memory for this event increased from  19075924 (the original USED) to 4089464053 (the new HIGH) and then back down to 19449188 (the new USED).

This implies that, at least, 4089464053 - 19075924 (= 4,070,388,129) bytes must have been allocated in this interval (assuming no frees until the new HIGH was reached).

However, the number of bytes allocated during this period is exactly (new SUM_ALLOC - old SUM_ALLOC) 46690592744 - 46684243417 (= 6,349,327)

Thus, since ~6MB is much less than ~4TB, I conclude that at least one of the numbers is incorrect. I suspect that the incorrect number is the HIGH_NUMBER_OF_BYTES_USED

This is making it difficult to debug an out-of-memory problem...

How to repeat:
Run the following SQL on a database. If the result shows nothing, then runs some workload against the database, and then `drop table t2` and recreate it and then run the final query. Any output from the final query demonstrates the problem.

create temporary table t1 select * FROM performance_schema.memory_summary_global_by_event_name;

show status like 'foo';
show full processlist;
select sleep(10);
select * from t1;

create temporary table t2 select * FROM performance_schema.memory_summary_global_by_event_name;

select t1.EVENT_NAME, t2.HIGH_NUMBER_OF_BYTES_USED - t1.CURRENT_NUMBER_OF_BYTES_USED as RequiredAlloc, t2.SUM_NUMBER_OF_BYTES_ALLOC - t1.SUM_NUMBER_OF_BYTES_ALLOC as ActualAlloc from t1 join t2 on t1.EVENT_NAME = t2.EVENT_NAME and t1.HIGH_NUMBER_OF_BYTES_USED <> t2.HIGH_NUMBER_OF_BYTES_USED where t2.HIGH_NUMBER_OF_BYTES_USED - t1.CURRENT_NUMBER_OF_BYTES_USED > t2.SUM_NUMBER_OF_BYTES_ALLOC - t1.SUM_NUMBER_OF_BYTES_ALLOC;

Sample output in my case:
+--------------------------------------+---------------+-------------+
| EVENT_NAME                           | RequiredAlloc | ActualAlloc |
+--------------------------------------+---------------+-------------+
| memory/sql/MPVIO_EXT::auth_info      |         40104 |          41 |
| memory/mysqld_openssl/openssl_malloc |      14270061 |       53024 |
| memory/temptable/physical_ram        |       3145824 |     1048608 |
+--------------------------------------+---------------+-------------+
3 rows in set (0.03 sec)
[26 Aug 2022 12:08] MySQL Verification Team
Hi Mr. Gladstone,

Thank you for your bug report and test case.

We have run your test case dozen of times, with our official 8.0.30 binary, and we were totally unable to repeat the results that you have got.

We need a total test case or some specific server settings that you have changed in the configuration file from the default ones, and we might try to repeat it again.

Can't repeat.
[26 Aug 2022 12:13] MySQL Verification Team
Hi,

We have analysed further and discovered that this fix in 8.0.30 might be the reason on why we can not repeat the results that you get:

Events recorded in the Performance Schema tables for thread creation and deletion were retained until server shutdown, instead of being removed when the client connection ended. Thread creation and deletion now takes place after the Performance Schema instrumentation is created for the user session, so it is cleaned up when the session ends. (Bug #XXXXXX) - an internal bug, not visible to the public.
[26 Aug 2022 13:08] Philip Gladstone
Once AWS supports 8.0.30, I will retest.

Thank you.
[29 Aug 2022 11:51] MySQL Verification Team
Mr. Gladstone,

You are welcome .....