Bug #91710: after upgrading from 5.6.32 to 5.7.22 (jemalloc-5.1), memory usage keeps growing
Submitted: 19 Jul 2018 6:21 Modified: 8 Oct 2018 14:09
Reporter: Suantai Rousi Email Updates:
Status: Not a Bug Impact on me: None
Category: MySQL Server Severity: S1 (Critical)
Version: 5.7.22, 5.7.23 OS: Red Hat (6.5)
Assigned to: MySQL Verification Team CPU Architecture:x86 (64bit)
Tags: jemalloc-5.1.0, M-M replication, Memory Growing, MTS, swap

[19 Jul 2018 6:21] Suantai Rousi
Description:
A production MySQL database, after upgrading from 5.6.32 to 5.7.22, shows memory usage that keeps growing; it is already using 30% of swap.
The buffer pool size and other configuration values are quite small.
With 5.6.32, memory usage was normal.
I can't find a reason for it to use so much memory.
Is this a memory leak or a bug in 5.7.22?

my-5.7.22 configuration: 
##Global 
innodb_buffer_pool_size = 20G 
key_buffer_size = 2M 
tmp_table_size = 64M 
max_heap_table_size = 64M 
innodb_log_buffer_size = 4M 
## Per connection 
read_buffer_size = 2M 
read_rnd_buffer_size = 2M 
sort_buffer_size = 8M 
join_buffer_size = 8M 
binlog_cache_size = 4M 
## Others 
myisam_sort_buffer_size = 4M 
myisam_max_sort_file_size = 10G 
bulk_insert_buffer_size = 16M 
## Query cache OFF 
query_cache_size = 0 
query_cache_type = 0 
## Threads 
thread_stack = 512K 
thread_cache_size = 1000 

my-5.6.32 configuration: 
##Global 
innodb_buffer_pool_size = 30G 
key_buffer_size = 8M 
tmp_table_size = 64M 
max_heap_table_size = 64M 
innodb_log_buffer_size = 8M 
## Per connection 
read_buffer_size = 16M 
read_rnd_buffer_size = 16M 
sort_buffer_size = 16M 
join_buffer_size = 16M 
binlog_cache_size = 4M 
## Others 
myisam_sort_buffer_size = 8M 
myisam_max_sort_file_size = 10G 
bulk_insert_buffer_size = 32M 
## Query cache OFF 
query_cache_size = 0 
query_cache_type = 0 
## Threads 
thread_stack = 512K 
thread_cache_size = 1000
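For scale, the per-connection buffers in the two configs above imply the following rough worst-case footprint per session. This is a back-of-envelope sketch only: MySQL allocates these buffers lazily, and some (e.g. the join buffer) can be allocated more than once per query, so actual usage varies.

```python
# Back-of-envelope: worst-case per-connection buffer allocation under the two
# configs above (values in MiB; only the buffers MySQL allocates per session).
PER_CONN_5_7 = dict(read_buffer=2, read_rnd_buffer=2, sort_buffer=8,
                    join_buffer=8, binlog_cache=4, thread_stack=0.5)
PER_CONN_5_6 = dict(read_buffer=16, read_rnd_buffer=16, sort_buffer=16,
                    join_buffer=16, binlog_cache=4, thread_stack=0.5)

print(sum(PER_CONN_5_7.values()))  # 24.5 MiB per connection (5.7.22 config)
print(sum(PER_CONN_5_6.values()))  # 68.5 MiB per connection (5.6.32 config)
```

Note that the 5.7.22 config is actually the leaner of the two per connection, which argues against session buffers being the source of the growth.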

[mysql@ugc215 ~]$ free -mt
             total       used       free     shared    buffers     cached
Mem:         64375      61730       2644          0        387        909
-/+ buffers/cache:      60433       3941
Swap:        30719       4428      26291
Total:       95095      66158      28936

mysql    12159 88.9 92.0 89798836 60648124 ?   Sl   Jul13 8099:27 /usr/local/mysql5722/bin/mysqld --basedir=/usr/local/mysql5722 --datadir=/data/mysql --plugin-dir=/usr/local/mysql5722/lib/plugin --user=mysql --log-error=/log/mysql/my3306.err --open-files-limit=65535 --pid-file=/data/mysql/mysql3306.pid --socket=/tmp/mysql.sock --port=3306

How to repeat:
unknown

Suggested fix:
unknown
[19 Jul 2018 6:23] Suantai Rousi
my.cnf

Attachment: my.cnf (application/octet-stream, text), 4.81 KiB.

[19 Jul 2018 6:23] Suantai Rousi
os memory

Attachment: OS_memory.txt (text/plain), 3.33 KiB.

[19 Jul 2018 6:23] Suantai Rousi
sys memory results

Attachment: mysql_sys_memory.txt (text/plain), 39.09 KiB.

[19 Jul 2018 6:24] Suantai Rousi
global variables

Attachment: global_variables.txt (text/plain), 14.83 KiB.

[19 Jul 2018 6:24] Suantai Rousi
global status

Attachment: global_status.txt (text/plain), 36.19 KiB.

[19 Jul 2018 6:28] Suantai Rousi
The secondary node (ugc216), which consumes more memory than the primary node, has used 50% of its swap.
[23 Jul 2018 11:35] MySQL Verification Team
Hi,
This does look like a bug, but there's not enough info to determine where the memory is lost. If you restart the server, memory usage should return to normal. Can you see what actions are performed on the server when the RAM starts to be used and not released?

all best
Bogdan
[24 Jul 2018 6:39] Suantai Rousi
Thank you for your reply.
Last Thursday, the swap usage of the secondary node suddenly increased rapidly to 70%. To prevent OOM, I restarted the secondary node.
Interestingly, after that, the memory usage of the primary node also slowly declined, and it is still continuing to decline.
In normal times, the number of connections and active sessions in this database is below 50.
After upgrading to 5.7.22, the clients switched over their connections: the front-end application briefly connected to the secondary node, and since then it has always been connected to the primary node for external service.
[27 Jul 2018 19:33] MySQL Verification Team
Hi,

Is there a chance you could, for a short while, remove all connections from the affected node and monitor:
 - connections (see if all connections are closed)
 - memory usage (does the memory start getting released back to the OS)

thanks
Bogdan
[28 Aug 2018 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[4 Sep 2018 2:09] Suantai Rousi
Last Friday, we restarted both mysqld instances, and the new configuration files have been uploaded.
As of today, the master node is working fine, but the secondary node has begun to reproduce the previous behavior: memory availability keeps declining (cached keeps dropping, anonymous pages keep rising).
[4 Sep 2018 2:10] Suantai Rousi
master cnf

Attachment: master.cnf (application/octet-stream, text), 4.85 KiB.

[4 Sep 2018 2:10] Suantai Rousi
slave cnf

Attachment: slave.cnf (application/octet-stream, text), 4.89 KiB.

[4 Sep 2018 2:10] Suantai Rousi
slave mysql overview

Attachment: overview.png (image/png, text), 77.54 KiB.

[4 Sep 2018 2:11] Suantai Rousi
slave memory

Attachment: memory-slave.png (image/png, text), 30.29 KiB.

[4 Sep 2018 2:12] Suantai Rousi
slave memory detail

Attachment: memorydetail-slave.png (image/png, text), 122.00 KiB.

[4 Sep 2018 2:22] Suantai Rousi
master global status

Attachment: status-master.txt (text/plain), 35.41 KiB.

[4 Sep 2018 2:22] Suantai Rousi
slave global status

Attachment: status-slave.txt (text/plain), 35.41 KiB.

[4 Sep 2018 2:34] Suantai Rousi
master node:
ps aux
USER       PID  %CPU %MEM      VSZ      RSS TTY STAT START    TIME COMMAND
root     20063   0.0  0.0   106228     1508  ?  S    Aug31    0:00 /bin/sh /usr/local/mysql5722/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/mysql3306.pid
mysql    23315   114 62.9 58501960 41523640  ?  Sl   Aug31 7056:33 /usr/local/mysql5722/bin/mysqld --basedir=/usr/local/mysql5722 --datadir=/data/mysql --plugin-dir=/usr/local/mysql5722/lib/plugin --user=mysql --log-error=/log/mysql/my3306.err --open-files-limit=65535 --pid-file=/data/mysql/mysql3306.pid --socket=/tmp/mysql.sock --port=3306

free -mt
             total       used       free     shared    buffers     cached
Mem:         64375      61330       3044          0        666      18382
-/+ buffers/cache:      42280      22094
Swap:        30719          6      30713
Total:       95095      61336      33758

second node:
USER       PID  %CPU %MEM      VSZ      RSS TTY STAT START    TIME COMMAND
root     27323   0.0  0.0   106228     1508  ?  S    Aug31    0:00 /bin/sh /usr/local/mysql5722/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/mysql3306.pid
mysql    30631   5.3 66.2 81011056 43668620  ?  Sl   Aug31  328:57 /usr/local/mysql5722/bin/mysqld --basedir=/usr/local/mysql5722 --datadir=/data/mysql --plugin-dir=/usr/local/mysql5722/lib/plugin --user=mysql --log-error=/log/mysql/my3306.err --open-files-limit=65535 --pid-file=/data/mysql/mysql3306.pid --socket=/tmp/mysql.sock --port=3306

free -mt
             total       used       free     shared    buffers     cached
Mem:         64375      61329       3045          0        574      16551
-/+ buffers/cache:      44204      20171
Swap:        30719          0      30719
Total:       95095      61329      33765
[4 Sep 2018 7:29] Suantai Rousi
performance schema memory summary by events

Attachment: P_S-summary.txt (text/plain), 7.63 KiB.

[5 Sep 2018 3:01] Suantai Rousi
In addition, the 5.7 instances were running with jemalloc-5.1.0.
[15 Sep 2018 5:16] Suantai Rousi
The event memory/sql/Log_event keeps using more and more memory, rising indefinitely; is that normal?
This issue occurs in the most recent versions, 5.7.22 and 5.7.23.

[HTML]
select THREAD_ID, EVENT_NAME, COUNT_ALLOC, SUM_NUMBER_OF_BYTES_ALLOC/1024/1024, CURRENT_NUMBER_OF_BYTES_USED/1024/1024 from performance_schema.memory_summary_by_thread_by_event_name order by CURRENT_NUMBER_OF_BYTES_USED desc limit 10;
+-----------+----------------------------+-------------+-------------------------------------+----------------------------------------+
| THREAD_ID | EVENT_NAME                 | COUNT_ALLOC | SUM_NUMBER_OF_BYTES_ALLOC/1024/1024 | CURRENT_NUMBER_OF_BYTES_USED/1024/1024 |
+-----------+----------------------------+-------------+-------------------------------------+----------------------------------------+
|         1 | memory/innodb/buf_buf_pool |         240 |                      31455.00000000 |                         31455.00000000 |
|       507 | memory/sql/Log_event       |    18400264 |                       7564.47873688 |                          7564.39571095 |
|         1 | memory/innodb/hash0hash    |          92 |                       2356.76754761 |                           897.66287231 |
|         1 | memory/innodb/os0event     |     5899209 |                        765.12567902 |                           765.12567902 |
|       506 | memory/innodb/std          |    38099468 |                       1599.35388184 |                           193.64311218 |
|         1 | memory/innodb/log0log      |          11 |                         64.01335907 |                            64.01335907 |
|       508 | memory/innodb/std          |     8876486 |                        396.79184723 |                            38.66751099 |
|       507 | memory/innodb/std          |     1776900 |                         74.58917236 |                             9.03167725 |
|       509 | memory/mysys/IO_CACHE      |           6 |                          8.03128242 |                             8.03128242 |
|       508 | memory/mysys/IO_CACHE      |           6 |                          8.03128242 |                             8.03128242 |
+-----------+----------------------------+-------------+-------------------------------------+----------------------------------------+
10 rows in set (0.03 sec)

select thread_id, user, current_allocated, total_allocated from sys.memory_by_thread_by_current_bytes;
+-----------+---------------------------------+-------------------+-----------------+
| thread_id | user                            | current_allocated | total_allocated |
+-----------+---------------------------------+-------------------+-----------------+
|         1 | sql/main                        | 32.42 GiB         | 34.08 GiB       |
|       507 | sql/slave_sql                   | 7.42 GiB          | 12.10 GiB       |
|       506 | sql/slave_io                    | 198.30 MiB        | 100.58 GiB      |
|     38914 | root@localhost                  | 8.53 MiB          | 21.64 MiB       |
|       510 | sql/slave_worker                | 8.06 MiB          | 8.21 MiB        |
|       511 | sql/slave_worker                | 42.74 KiB         | 42.90 KiB       |
|     39514 | root@10.25.150.203              | 16.84 KiB         | 28.59 MiB       |
|     39515 | root@10.25.150.203              | 16.84 KiB         | 40.66 MiB       |
|     39516 | root@10.25.150.203              | 16.84 KiB         | 24.54 MiB       |
|      3225 | repl@10.25.150.202              | 13.05 KiB         | 63.97 KiB       |
|        31 | innodb/buf_dump_thread          | 192 bytes         | 1.74 MiB        |
|        30 | innodb/srv_worker_thread        | 0 bytes           | 0 bytes         |
|        39 | sql/signal_handler              | 0 bytes           | 0 bytes         |
|        40 | sql/compress_gtid_table         | 0 bytes           | 0 bytes         |
|         2 | sql/thread_timer_notifier       | 0 bytes           | 0 bytes         |
|         3 | innodb/io_read_thread           | 0 bytes           | 33.24 KiB       |
|         4 | innodb/io_ibuf_thread           | 0 bytes           | 0 bytes         |
|         5 | innodb/io_read_thread           | 0 bytes           | 27.16 KiB       |
|         6 | innodb/io_log_thread            | 0 bytes           | 0 bytes         |
|         7 | innodb/io_read_thread           | 0 bytes           | 5.67 KiB        |
|         8 | innodb/io_read_thread           | 0 bytes           | 50.03 KiB       |
|         9 | innodb/io_read_thread           | 0 bytes           | 44.77 KiB       |
|        10 | innodb/io_read_thread           | 0 bytes           | 11.35 KiB       |
|        11 | innodb/io_read_thread           | 0 bytes           | 22.70 KiB       |
|        12 | innodb/io_read_thread           | 0 bytes           | 22.70 KiB       |
|        13 | innodb/io_write_thread          | 0 bytes           | 1.68 MiB        |
|        14 | innodb/io_write_thread          | 0 bytes           | 384.44 KiB      |
|        15 | innodb/io_write_thread          | 0 bytes           | 650.33 KiB      |
|        16 | innodb/io_write_thread          | 0 bytes           | 223.01 KiB      |
|        17 | innodb/io_write_thread          | 0 bytes           | 294.46 KiB      |
|        18 | innodb/io_write_thread          | 0 bytes           | 218.54 KiB      |
|        19 | innodb/io_write_thread          | 0 bytes           | 274.54 KiB      |
|        20 | innodb/io_write_thread          | 0 bytes           | 242.21 KiB      |
|        21 | innodb/page_cleaner_thread      | 0 bytes           | 0 bytes         |
|        23 | innodb/srv_error_monitor_thread | 0 bytes           | 0 bytes         |
|        24 | innodb/srv_lock_timeout_thread  | 0 bytes           | 0 bytes         |
|        25 | innodb/srv_monitor_thread       | 0 bytes           | 18.92 MiB       |
|        26 | innodb/srv_master_thread        | 0 bytes           | 14.24 MiB       |
|        27 | innodb/srv_worker_thread        | 0 bytes           | 0 bytes         |
|        28 | innodb/srv_purge_thread         | 0 bytes           | 0 bytes         |
|        29 | innodb/srv_worker_thread        | 0 bytes           | 0 bytes         |
|        32 | innodb/dict_stats_thread        | -160 bytes        | 218.36 KiB      |
|       509 | sql/slave_worker                | -5122994 bytes    | 58.06 MiB       |
|       508 | sql/slave_worker                | -7894888734 bytes | 28.29 GiB       |
+-----------+---------------------------------+-------------------+-----------------+
44 rows in set (0.28 sec)
[/HTML]
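One way to read the first result set above: for memory/sql/Log_event, CURRENT_NUMBER_OF_BYTES_USED tracks SUM_NUMBER_OF_BYTES_ALLOC almost exactly, meaning nearly nothing ever allocated by that event is freed (the classic leak signature), whereas memory/innodb/std on the same threads shows normal allocate-and-free churn. A small sketch of that reading; the thresholds are illustrative assumptions, not MySQL semantics:

```python
# Classify a memory_summary_by_thread_by_event_name row as "leak-like"
# (allocations retained, almost never freed) vs. "churn" (allocated and freed).
def classify(count_alloc: int, sum_alloc_mib: float, current_mib: float) -> str:
    if sum_alloc_mib == 0:
        return "idle"
    retained = current_mib / sum_alloc_mib
    # nearly everything ever allocated is still held, across many allocations
    if retained > 0.99 and count_alloc > 1_000_000:
        return "leak-like"
    return "churn"

# Rows from the report above:
print(classify(18400264, 7564.47873688, 7564.39571095))  # memory/sql/Log_event -> leak-like
print(classify(38099468, 1599.35388184, 193.64311218))   # memory/innodb/std (thread 506) -> churn
```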
[17 Sep 2018 15:31] MySQL Verification Team
Hi,

I need you to answer a few questions, if possible.

How much time does it take for sql/Log_event to jump to 7 GB?

Do you, on the master, have queries that explicitly create temporary tables, and how often?

Do you, on the master, use LOAD DATA INFILE into tables, and how often?

What client do you use on the master?

How many connections do you have on the slave?

thanks
Bogdan
[20 Sep 2018 5:59] Suantai Rousi
Thank you for your reply.
All problematic instances use jemalloc-5.1.0 and MTS (multi-threaded slave).

Question 1: How much time does it take for sql/Log_event to jump to 7 GB?
Answer 1: 3 to 5 days.

Question 2: Do you, on the master, have queries that explicitly create temporary tables, and how often?
Answer 2: No explicitly created temporary tables.

Question 3: Do you, on the master, use LOAD DATA INFILE into tables, and how often?
Answer 3: No LOAD DATA INFILE.

Question 4: What client do you use on the master?
Answer 4: JDBC.

Question 5: How many connections do you have on the slave?
Answer 5: The secondary node is only used for emergency backup; there are no service connections at all.
[3 Oct 2018 9:36] MySQL Verification Team
Hi,
I still can't reproduce this.
Can you reproduce it without jemalloc?

Can you reproduce it with an Oracle-built binary?

I can't verify this with any of the Oracle binaries I tested.

thanks
Bogdan
[3 Oct 2018 17:32] MySQL Verification Team
By the way, since you have no user connections on the slave machine that goes haywire, can you drop thread_cache_size to zero to see if that changes anything?

thanks
b.
[8 Oct 2018 13:52] Suantai Rousi
Thanks for your reply.
The service has been stable since removing jemalloc-5.1.0.
Does this mean that it is an unknown bug in jemalloc-5.1.0?
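For reference, jemalloc is commonly wired into mysqld through mysqld_safe's malloc-lib option (or an LD_PRELOAD wrapper). A hedged sketch of what disabling it may have looked like, assuming it was enabled via my.cnf; the library path is an assumption:

```ini
# my.cnf fragment (hypothetical): comment out to fall back to glibc malloc
[mysqld_safe]
# malloc-lib = /usr/local/lib/libjemalloc.so.2
```

Which allocator a running mysqld actually uses can be confirmed by inspecting its loaded libraries, e.g. in /proc/&lt;pid&gt;/maps.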
[8 Oct 2018 14:09] MySQL Verification Team
I can't really say about jemalloc; maybe you should report a bug there?
https://github.com/jemalloc/jemalloc/issues

all best
Bogdan
[29 Aug 2020 9:05] Christos Chatzaras
I am not sure if it was exactly the same issue, but on most servers the RAM usage was higher than the limits implied by my.cnf. I use MySQL 5.7 and FreeBSD 12.1, and FreeBSD 12.1 uses jemalloc 5.1. On one specific server the RAM usage was so high (>40 GB) that even all swap was used and I had to restart MySQL to avoid OOM. I changed the allocator to tcmalloc and the RAM usage is now ~5 GB.

More information here:

https://forums.freebsd.org/threads/freebsd-12-x-and-mysql-5-7-and-importing-file-with-lots...
[29 Aug 2020 9:36] MySQL Verification Team
Christos, for your case I didn't try it, but I suspect that if you set a limit with:

  --parser-max-mem-size=#
                      Maximum amount of memory available to the parser

the long query with many values will error out and memory will stay stable.
max_allowed_packet being 64M means the query text can be that large, but parsing such a query requires many times more memory; hence this new variable was introduced.
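A minimal fragment showing how such a cap might be configured; the value is illustrative, not a recommendation:

```ini
[mysqld]
# upper bound, in bytes, on memory the parser may use for a single statement
parser_max_mem_size = 1073741824
```

The variable is also dynamic, so it can be changed at runtime with SET GLOBAL parser_max_mem_size = ... for a quick test before editing my.cnf.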
[29 Aug 2020 10:50] Christos Chatzaras
I switched back to jemalloc and added to my.cnf:

parser-max-mem-size=8000000000

But usage goes above 8GB:

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
87281 mysql        39  20    0    25G    22G select   7  58:25 178.17% mysqld

FreeBSD 12.1 uses jemalloc 5.1.0, and FreeBSD CURRENT (the development branch) uses 5.2.1. In a few days I will set up a staging server and test whether jemalloc 5.2.1 solves the issue.