Bug #108043 | Long Message Buffer usage is constantly increasing | | |
---|---|---|---|
Submitted: | 1 Aug 2022 10:21 | Modified: | 5 Oct 2022 17:01 |
Reporter: | Hendrik Woltersdorf | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S3 (Non-critical) |
Version: | 8.0.30 | OS: | Any |
Assigned to: | | CPU Architecture: | Any
[1 Aug 2022 10:21]
Hendrik Woltersdorf
[1 Aug 2022 10:22]
Hendrik Woltersdorf
logs and configuration from ndb_error_reporter
Attachment: ndb_error_report_20220801120742.zip (application/x-zip-compressed, text), 986.92 KiB.
[1 Aug 2022 19:05]
MySQL Verification Team
Hi,
I was running a setup the whole day today, routing all testing through that cluster, and I could not reproduce this. Can you share more data?
- config (did you configure this manually or using mcm? please share your config.ini)
- ndb_error_reporter output
- what kind of tables do you have? are you using blobs? disk data?
- what kind of traffic do you have on that system?
Thanks
[2 Aug 2022 5:35]
Hendrik Woltersdorf
You can find the config.ini in the attached file ndb_error_report_20220801120742.zip. The content of this zip was collected by ndb_error_reporter; I just deleted old logs etc. to get a zip file < 3MB. So far I have not found anything unusual or suspicious in the logs. We configured this cluster with a config generator website (which no longer exists) when we started with MySQL Cluster 6.0. Since then, we have adapted the config.ini manually. We use disk tables, especially to store BLOBs. These tables see a lot of insert/update/delete activity over the day, with peaks in the "rush hours". It takes days (at least) to see the constant increase in Long Message Buffer usage. This system has been running for one month now. On normal week days the sum of the binlogs for replication is about 12 GB - 15 GB.
[2 Aug 2022 7:27]
Hendrik Woltersdorf
full cluster config of 2 systems
Attachment: full_configs.zip (application/x-zip-compressed, text), 8.88 KiB.
[2 Aug 2022 7:30]
Hendrik Woltersdorf
In full_configs.zip I added the output of "ndb_mgmd ... -P":
- dit_l1_cfg.txt: test system where we see the problem
- qit_l1_cfg.txt: test system where we do not see the problem
- prod_l1_cfg.txt: production system with the problem
[4 Aug 2022 10:56]
MySQL Verification Team
Hi, The file was not yet visible when I sent the first answer; thanks, I now have what I need to set up the test environment. Unfortunately, running tests with your config (or close to it) I cannot reproduce this problem. Is there anything else you can share that would make reproducing the issue possible? Thanks
[4 Aug 2022 13:18]
Hendrik Woltersdorf
I have no idea what exactly is causing this issue. But what do we do that (almost) no one else does? We use triggers on ndbcluster tables. I have added a file with an example to illustrate how we do that.
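(The attached example file is not reproduced here; the following is only a minimal hypothetical sketch of the kind of trigger-on-ndbcluster setup being described, with invented table and trigger names.)

-- Hypothetical illustration: an AFTER INSERT trigger on an NDB table
-- writing into an NDB audit table.
CREATE TABLE orders (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  payload BLOB
) ENGINE=NDBCLUSTER;

CREATE TABLE orders_audit (
  audit_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  order_id INT NOT NULL,
  changed_at DATETIME NOT NULL
) ENGINE=NDBCLUSTER;

DELIMITER //
CREATE TRIGGER orders_ai AFTER INSERT ON orders
FOR EACH ROW
BEGIN
  INSERT INTO orders_audit (order_id, changed_at) VALUES (NEW.id, NOW());
END //
DELIMITER ;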
[8 Aug 2022 5:58]
Hendrik Woltersdorf
Last weekend we restarted the data nodes of two of our busy clusters, because one data node reached the watermark of (over) 70%. Here are the before and after values:

L1:
mysql> SELECT node_id, memory_type, total, used/total *100 as memory_used FROM ndbinfo.memoryusage;
+---------+---------------------+--------------+-------------+
| node_id | memory_type         | total        | memory_used |
+---------+---------------------+--------------+-------------+
|      10 | Data memory         | 154618822656 |      7.6680 |
|      10 | Long message buffer |     67108864 |     49.4141 |
|      11 | Data memory         | 154618822656 |      7.6682 |
|      11 | Long message buffer |     67108864 |     74.0234 |
+---------+---------------------+--------------+-------------+

+---------+---------------------+--------------+-------------+
| node_id | memory_type         | total        | memory_used |
+---------+---------------------+--------------+-------------+
|      10 | Data memory         | 154618822656 |      7.1538 |
|      10 | Long message buffer |     67108864 |      2.7344 |
|      11 | Data memory         | 154618822656 |      7.1603 |
|      11 | Long message buffer |     67108864 |      2.9297 |
+---------+---------------------+--------------+-------------+

L2:
mysql> SELECT node_id, memory_type, total, used/total *100 as memory_used FROM ndbinfo.memoryusage;
+---------+---------------------+--------------+-------------+
| node_id | memory_type         | total        | memory_used |
+---------+---------------------+--------------+-------------+
|      10 | Data memory         | 154618822656 |      7.0820 |
|      10 | Long message buffer |     67108864 |     36.7188 |
|      11 | Data memory         | 154618822656 |      7.0820 |
|      11 | Long message buffer |     67108864 |     57.4219 |
+---------+---------------------+--------------+-------------+

+---------+---------------------+--------------+-------------+
| node_id | memory_type         | total        | memory_used |
+---------+---------------------+--------------+-------------+
|      10 | Data memory         | 154618822656 |      6.8489 |
|      10 | Long message buffer |     67108864 |      3.1250 |
|      11 | Data memory         | 154618822656 |      6.8483 |
|      11 | Long message buffer |     67108864 |      2.5391 |
+---------+---------------------+--------------+-------------+

One observation: in all of our affected systems we see an asymmetry in the usage values, always a ratio of about 2:3.
[19 Aug 2022 20:12]
MySQL Verification Team
Hi Hendrik, We found some bugs in 8.0.29 and released 8.0.30. I doubt this has to do with your issue, but it is worth a look. I tried reproducing your issue, with triggers, but I cannot reproduce it (I am using the SQL you sent and 8.0.30). I'm still working on this, but if you can retest on 8.0.30 it would help. Thanks
[22 Aug 2022 7:31]
Hendrik Woltersdorf
I'm going to test 8.0.30, but it will take some time until it hits our busy clusters. The triggers were just a wild guess. So far the issue has not occurred again in the two restarted systems from my last comment. So I guess it needs some starter event or - more likely - a certain load to start it. But the load on all our systems is rather low at the moment because of the holiday season in Europe.
[22 Aug 2022 16:27]
MySQL Verification Team
Hi, I tested under pretty high load (it is easy to generate high load if you run on small machines) and could not reproduce, so load might not be the trigger but rather something else. Without a way to reproduce it, the only thing I can do is code analysis, and I have been through the code many times already... Let us know if you manage to reproduce with 8.0.30, and if you get any other clues on how to reproduce it so I can retry. Thanks
[29 Aug 2022 6:29]
Hendrik Woltersdorf
After running one test cluster with version 8.0.30, the problem starts to show up again (after 6 days):

(ndb_mgm show output)
[ndbd(NDB)]     2 node(s)
id=3    @62.96.229.52  (mysql-8.0.30 ndb-8.0.30, Nodegroup: 0, *)
id=4    @62.96.229.53  (mysql-8.0.30 ndb-8.0.30, Nodegroup: 0)

node_id  memory_type          total     memory_used
3        Long message buffer  67108864  5.0781%
4        Long message buffer  67108864  7.6172%

But I made another observation; I'm not sure if it is really a pattern. It seems to be less likely (at least) that the problem occurs when the data node with the higher id has the "leader" role (running 20 days):

[ndbd(NDB)]     2 node(s)
id=3    @62.96.229.145  (mysql-8.0.29 ndb-8.0.29, Nodegroup: 0)
id=4    @62.96.229.147  (mysql-8.0.29 ndb-8.0.29, Nodegroup: 0, *)

node_id  memory_type          total     memory_used
3        Long message buffer  67108864  1.1719%
4        Long message buffer  67108864  0.9766%

Also, the "busy" cluster we restarted lately (see my comment from 2022-08-08) has not shown the problem again yet, and since the restart the "leader" role is on the data node with the higher id.
[29 Aug 2022 6:36]
Hendrik Woltersdorf
Here are the ndb-relevant values from my.cnf of the SQL nodes of our clusters (one example):

[MYSQLD]
ndb-cluster-connection-pool=8
ndbcluster
ndb-connectstring="62.96.229.141:1186;62.96.229.143:1186"
ndb-force-send=1
ndb-use-exact-count=0
ndb-extra-logging=1
ndb-autoincrement-prefetch-sz=1

Note "ndb-autoincrement-prefetch-sz=1". We need this for one use case (FIFO) and it causes lots of traffic between the cluster nodes.
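(For illustration only, a hypothetical sketch of the FIFO-style pattern such a setting typically serves; the table and column names are invented. With ndb-autoincrement-prefetch-sz=1 a SQL node fetches auto-increment values one at a time from the data nodes instead of caching a block of them, which minimizes id gaps and keeps ids close to arrival order even with several SQL nodes inserting, at the cost of extra cluster traffic per insert.)

-- Hypothetical FIFO-style queue table.
CREATE TABLE work_queue (
  id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  payload BLOB
) ENGINE=NDBCLUSTER;

-- Producers on any SQL node append entries:
INSERT INTO work_queue (payload) VALUES ('...');

-- Consumers take the oldest entry first:
SELECT id, payload FROM work_queue ORDER BY id LIMIT 1;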
[31 Aug 2022 11:10]
MySQL Verification Team
Hi,

> It seems to be less likely (at least) that the problem occurs when the data node with the higher id has the "leader" role (running 20 days)

I think this is impossible (looking at the way it works); you probably just do not see the issue because you restarted the system, and the "leader has the higher id" by chance in this case. I'll discuss this further with the NDB devs, but I'm still having trouble reproducing it. Thanks
[6 Sep 2022 4:37]
MySQL Verification Team
Hi, I managed to reproduce this with 8.0.30 and ndb-autoincrement-prefetch-sz=1 ... it looks like a lot of single-record inserts from a number of SQL nodes did the trick. With a high value for ndb-autoincrement-prefetch-sz I cannot reproduce it, but that setting might just make reproduction easier due to the higher traffic. Thanks for the report
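(For reference, a minimal sketch of the kind of load that matches this description: many single-row inserts into an auto-increment NDB table, run concurrently from several SQL nodes that all have ndb-autoincrement-prefetch-sz=1. The procedure name and row count are invented, and work_queue is the hypothetical table sketched earlier.)

-- Define once per SQL node, then run the CALL concurrently from all of them.
DELIMITER //
CREATE PROCEDURE insert_load(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < n DO
    INSERT INTO work_queue (payload) VALUES (REPEAT('x', 1024));
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

CALL insert_load(1000000);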
[20 Sep 2022 9:16]
Mauritz Sundell
Posted by developer: If you turn off the load when long message buffer usage is high, will it go back down to the same value as before the load was put on?
[21 Sep 2022 10:27]
Mauritz Sundell
Posted by developer: The long message buffer uses a per-thread cache, and free entries in the caches are also counted as in use in the ndbinfo.memoryusage table. An improvement would be to also make free entries in the caches visible. The cache grows and shrinks in units of 512 entries (128 KB), and it only shrinks if there are 513 or more free entries in the cache. When the system is idle one could expect each thread to hold one or two sets of cached entries, that is up to 1024 entries (or 256 KB) per thread. With a total of 67108864 bytes, each thread is therefore expected to keep less than 0.4%. With 32 threads, up to 12.5% (32*1024*256/67108864) of long message buffer usage seems reasonable for an idle system.
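(For watching the raw numbers behind these percentages, a query against the same ndbinfo.memoryusage table used earlier, restricted to the long message buffer, is enough; a sketch:)

SELECT node_id,
       used,
       total,
       ROUND(used / total * 100, 2) AS pct_used
FROM ndbinfo.memoryusage
WHERE memory_type = 'Long message buffer';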
[26 Sep 2022 7:01]
Hendrik Woltersdorf
When the load goes down, the usage decreases by at most 1%. The average usage values grow constantly in the affected systems.
[29 Sep 2022 8:58]
Mauritz Sundell
Posted by developer: We have identified a leak of Long message buffer memory related to index statistics. Every time an index is scanned for updating index statistics, one unit (256 bytes) of Long message buffer memory is leaked. One can look for log lines like the following in the cluster log (produced by the management server):

[MgmtSrvr] INFO -- Node 2: index 16 stats version 3: scan frag: created 10 samples

The above example says that node 2 scanned one fragment for statistics and thereby leaked 256 bytes of Long message buffer.
[30 Sep 2022 5:15]
Hendrik Woltersdorf
We see such messages (created ... samples) in the logs, but rather rarely. I'm not sure this accounts for all of it.
[30 Sep 2022 11:18]
Mauritz Sundell
Posted by developer: In the uploaded ndb_l1_2_cluster.log there were roughly 300 'scan frag: created' log lines per hour and data node, which would correspond to a leak of almost 3% of the Long message buffer per day. Of course, this log frequency may not be representative at all times; index statistics are triggered more often when an index changes a lot.
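(A quick sanity check of that estimate, using the 256 bytes leaked per scan and the 67108864-byte buffer size quoted above:)

-- 300 scans/hour * 24 hours * 256 bytes ≈ 1.8 MB leaked per day,
-- roughly 2.75% of a 67108864-byte long message buffer.
SELECT 300 * 24 * 256 AS bytes_leaked_per_day,
       ROUND(300 * 24 * 256 / 67108864 * 100, 2) AS pct_of_lmb_per_day;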
[5 Oct 2022 17:01]
Jon Stephens
Documented fix as follows in the NDB 7.5.29, 7.6.25, and 8.0.32 changelogs: Removed a leak of long message buffer memory that occurred each time an index was scanned for updating index statistics. Closed.