Bug #91710 | after upgrading from 5.6.32 to 5.7.22(jemalloc-5.1), memory usage keeps growing | ||
---|---|---|---|
Submitted: | 19 Jul 2018 6:21 | Modified: | 8 Oct 2018 14:09 |
Reporter: | Suantai Rousi | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S1 (Critical) |
Version: | 5.7.22,5.7.23 | OS: | Red Hat (6.5) |
Assigned to: | MySQL Verification Team | CPU Architecture: | x86 (64bit) |
Tags: | jemalloc-5.1.0, M-M replication, Memory Growing, MTS, swap |
[19 Jul 2018 6:21]
Suantai Rousi
[19 Jul 2018 6:23]
Suantai Rousi
my.cnf
Attachment: my.cnf (application/octet-stream, text), 4.81 KiB.
[19 Jul 2018 6:23]
Suantai Rousi
os memory
Attachment: OS_memory.txt (text/plain), 3.33 KiB.
[19 Jul 2018 6:23]
Suantai Rousi
sys memory results
Attachment: mysql_sys_memory.txt (text/plain), 39.09 KiB.
[19 Jul 2018 6:24]
Suantai Rousi
global variables
Attachment: global_variables.txt (text/plain), 14.83 KiB.
[19 Jul 2018 6:24]
Suantai Rousi
global status
Attachment: global_status.txt (text/plain), 36.19 KiB.
[19 Jul 2018 6:28]
Suantai Rousi
The secondary node (ugc216), which consumes more memory than the primary node, has used 50% of its swap.
[23 Jul 2018 11:35]
MySQL Verification Team
Hi, This does look like a bug, but there's not enough info to determine where the memory is lost. If you restart the server, memory usage should return to normal. Can you see what actions are performed on the server when the RAM starts to be used and not released? all best Bogdan
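To narrow down where the memory is lost, the Performance Schema memory instrumentation can be enabled at runtime and then sampled; a minimal sketch, assuming a 5.7 server with performance_schema enabled (the default) and the bundled sys schema installed:

UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'memory/%';   -- only allocations made after this point are counted

SELECT event_name, current_alloc
  FROM sys.memory_global_by_current_bytes
 LIMIT 10;                     -- top consumers by currently allocated bytes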
[24 Jul 2018 6:39]
Suantai Rousi
Thank you for your reply. Last Thursday, the swap usage of the secondary node suddenly increased rapidly to 70%. To prevent OOM, I restarted the secondary node. Interestingly, after that, the memory usage of the primary node also slowly declined, and it is still continuing to decline. In normal times, the number of connections and active sessions in this database is below 50. After upgrading to 5.7.22, the client switched over its connections: the front-end application briefly connected to the secondary node, and since then it has always been connected to the primary node for external services.
[27 Jul 2018 19:33]
MySQL Verification Team
Hi, Is there a chance you could, for a short while, remove all connections from the affected node and monitor:
- connections (see if all connections are closed)
- memory usage (does the memory start getting released back to the OS)
thanks Bogdan
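A minimal monitoring sketch for the above, run from a single admin session (assuming the 5.7 sys schema is installed; the x$ view returns raw bytes so successive readings can be compared):

SELECT processlist_id, processlist_user, processlist_host, processlist_command
  FROM performance_schema.threads
 WHERE type = 'FOREGROUND';         -- any client connections still open?

SELECT total_allocated
  FROM sys.x$memory_global_total;   -- instrumented memory in bytes; sample repeatedly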
[28 Aug 2018 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[4 Sep 2018 2:09]
Suantai Rousi
Last Friday, we restarted both mysqld instances, and the new configuration files have been uploaded. Until today the master node has been working fine, but the secondary node has begun to reproduce the previous phenomenon: available memory keeps declining (cached keeps dropping, anonymous pages keep rising).
[4 Sep 2018 2:10]
Suantai Rousi
master cnf
Attachment: master.cnf (application/octet-stream, text), 4.85 KiB.
[4 Sep 2018 2:10]
Suantai Rousi
slave cnf
Attachment: slave.cnf (application/octet-stream, text), 4.89 KiB.
[4 Sep 2018 2:10]
Suantai Rousi
slave mysql overview
Attachment: overview.png (image/png, text), 77.54 KiB.
[4 Sep 2018 2:11]
Suantai Rousi
slave memory
Attachment: memory-slave.png (image/png, text), 30.29 KiB.
[4 Sep 2018 2:12]
Suantai Rousi
slave memory detail
Attachment: memorydetail-slave.png (image/png, text), 122.00 KiB.
[4 Sep 2018 2:22]
Suantai Rousi
master global status
Attachment: status-master.txt (text/plain), 35.41 KiB.
[4 Sep 2018 2:22]
Suantai Rousi
slave global status
Attachment: status-slave.txt (text/plain), 35.41 KiB.
[4 Sep 2018 2:34]
Suantai Rousi
master node:

ps aux
USER   PID   %CPU %MEM  VSZ      RSS      TTY STAT START  TIME    COMMAND
root   20063 0.0  0.0   106228   1508     ?   S    Aug31  0:00    /bin/sh /usr/local/mysql5722/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/mysql3306.pid
mysql  23315 114  62.9  58501960 41523640 ?   Sl   Aug31  7056:33 /usr/local/mysql5722/bin/mysqld --basedir=/usr/local/mysql5722 --datadir=/data/mysql --plugin-dir=/usr/local/mysql5722/lib/plugin --user=mysql --log-error=/log/mysql/my3306.err --open-files-limit=65535 --pid-file=/data/mysql/mysql3306.pid --socket=/tmp/mysql.sock --port=3306

free -mt
                    total   used    free    shared   buffers   cached
Mem:                64375   61330   3044    0        666       18382
-/+ buffers/cache:          42280   22094
Swap:               30719   6       30713
Total:              95095   61336   33758

second node:

ps aux
USER   PID   %CPU %MEM  VSZ      RSS      TTY STAT START  TIME   COMMAND
root   27323 0.0  0.0   106228   1508     ?   S    Aug31  0:00   /bin/sh /usr/local/mysql5722/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/mysql3306.pid
mysql  30631 5.3  66.2  81011056 43668620 ?   Sl   Aug31  328:57 /usr/local/mysql5722/bin/mysqld --basedir=/usr/local/mysql5722 --datadir=/data/mysql --plugin-dir=/usr/local/mysql5722/lib/plugin --user=mysql --log-error=/log/mysql/my3306.err --open-files-limit=65535 --pid-file=/data/mysql/mysql3306.pid --socket=/tmp/mysql.sock --port=3306

free -mt
                    total   used    free    shared   buffers   cached
Mem:                64375   61329   3045    0        574       16551
-/+ buffers/cache:          44204   20171
Swap:               30719   0       30719
Total:              95095   61329   33765
[4 Sep 2018 7:29]
Suantai Rousi
performance schema memory summary by events
Attachment: P_S-summary.txt (text/plain), 7.63 KiB.
[5 Sep 2018 3:01]
Suantai Rousi
In addition, in version 5.7, jemalloc-5.1.0 was used.
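For reference, which allocator the running server actually picked up can usually be confirmed from a client session; a small sketch, assuming the read-only version_malloc_library variable is available in this 5.7 build (it reports the library preloaded via mysqld_safe's malloc-lib option, or "system" otherwise):

SHOW GLOBAL VARIABLES LIKE 'version_malloc_library';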
[15 Sep 2018 5:16]
Suantai Rousi
The memory used by the event sql/Log_event keeps rising indefinitely; is this normal? This issue occurs in the most recent versions, 5.7.22~23.

select THREAD_ID, EVENT_NAME, COUNT_ALLOC, SUM_NUMBER_OF_BYTES_ALLOC/1024/1024, CURRENT_NUMBER_OF_BYTES_USED/1024/1024
from performance_schema.memory_summary_by_thread_by_event_name
order by CURRENT_NUMBER_OF_BYTES_USED desc limit 10;

| THREAD_ID | EVENT_NAME                 | COUNT_ALLOC | SUM_NUMBER_OF_BYTES_ALLOC/1024/1024 | CURRENT_NUMBER_OF_BYTES_USED/1024/1024 |
|         1 | memory/innodb/buf_buf_pool |         240 |                      31455.00000000 |                         31455.00000000 |
|       507 | memory/sql/Log_event       |    18400264 |                       7564.47873688 |                          7564.39571095 |
|         1 | memory/innodb/hash0hash    |          92 |                       2356.76754761 |                           897.66287231 |
|         1 | memory/innodb/os0event     |     5899209 |                        765.12567902 |                           765.12567902 |
|       506 | memory/innodb/std          |    38099468 |                       1599.35388184 |                           193.64311218 |
|         1 | memory/innodb/log0log      |          11 |                         64.01335907 |                            64.01335907 |
|       508 | memory/innodb/std          |     8876486 |                        396.79184723 |                            38.66751099 |
|       507 | memory/innodb/std          |     1776900 |                         74.58917236 |                             9.03167725 |
|       509 | memory/mysys/IO_CACHE      |           6 |                          8.03128242 |                             8.03128242 |
|       508 | memory/mysys/IO_CACHE      |           6 |                          8.03128242 |                             8.03128242 |
10 rows in set (0.03 sec)

select thread_id, user, current_allocated, total_allocated from sys.memory_by_thread_by_current_bytes;

| thread_id | user                            | current_allocated | total_allocated |
|         1 | sql/main                        | 32.42 GiB         | 34.08 GiB       |
|       507 | sql/slave_sql                   | 7.42 GiB          | 12.10 GiB       |
|       506 | sql/slave_io                    | 198.30 MiB        | 100.58 GiB      |
|     38914 | root@localhost                  | 8.53 MiB          | 21.64 MiB       |
|       510 | sql/slave_worker                | 8.06 MiB          | 8.21 MiB        |
|       511 | sql/slave_worker                | 42.74 KiB         | 42.90 KiB       |
|     39514 | root@10.25.150.203              | 16.84 KiB         | 28.59 MiB       |
|     39515 | root@10.25.150.203              | 16.84 KiB         | 40.66 MiB       |
|     39516 | root@10.25.150.203              | 16.84 KiB         | 24.54 MiB       |
|      3225 | repl@10.25.150.202              | 13.05 KiB         | 63.97 KiB       |
|        31 | innodb/buf_dump_thread          | 192 bytes         | 1.74 MiB        |
|        30 | innodb/srv_worker_thread        | 0 bytes           | 0 bytes         |
|        39 | sql/signal_handler              | 0 bytes           | 0 bytes         |
|        40 | sql/compress_gtid_table         | 0 bytes           | 0 bytes         |
|         2 | sql/thread_timer_notifier       | 0 bytes           | 0 bytes         |
|         3 | innodb/io_read_thread           | 0 bytes           | 33.24 KiB       |
|         4 | innodb/io_ibuf_thread           | 0 bytes           | 0 bytes         |
|         5 | innodb/io_read_thread           | 0 bytes           | 27.16 KiB       |
|         6 | innodb/io_log_thread            | 0 bytes           | 0 bytes         |
|         7 | innodb/io_read_thread           | 0 bytes           | 5.67 KiB        |
|         8 | innodb/io_read_thread           | 0 bytes           | 50.03 KiB       |
|         9 | innodb/io_read_thread           | 0 bytes           | 44.77 KiB       |
|        10 | innodb/io_read_thread           | 0 bytes           | 11.35 KiB       |
|        11 | innodb/io_read_thread           | 0 bytes           | 22.70 KiB       |
|        12 | innodb/io_read_thread           | 0 bytes           | 22.70 KiB       |
|        13 | innodb/io_write_thread          | 0 bytes           | 1.68 MiB        |
|        14 | innodb/io_write_thread          | 0 bytes           | 384.44 KiB      |
|        15 | innodb/io_write_thread          | 0 bytes           | 650.33 KiB      |
|        16 | innodb/io_write_thread          | 0 bytes           | 223.01 KiB      |
|        17 | innodb/io_write_thread          | 0 bytes           | 294.46 KiB      |
|        18 | innodb/io_write_thread          | 0 bytes           | 218.54 KiB      |
|        19 | innodb/io_write_thread          | 0 bytes           | 274.54 KiB      |
|        20 | innodb/io_write_thread          | 0 bytes           | 242.21 KiB      |
|        21 | innodb/page_cleaner_thread      | 0 bytes           | 0 bytes         |
|        23 | innodb/srv_error_monitor_thread | 0 bytes           | 0 bytes         |
|        24 | innodb/srv_lock_timeout_thread  | 0 bytes           | 0 bytes         |
|        25 | innodb/srv_monitor_thread       | 0 bytes           | 18.92 MiB       |
|        26 | innodb/srv_master_thread        | 0 bytes           | 14.24 MiB       |
|        27 | innodb/srv_worker_thread        | 0 bytes           | 0 bytes         |
|        28 | innodb/srv_purge_thread         | 0 bytes           | 0 bytes         |
|        29 | innodb/srv_worker_thread        | 0 bytes           | 0 bytes         |
|        32 | innodb/dict_stats_thread        | -160 bytes        | 218.36 KiB      |
|       509 | sql/slave_worker                | -5122994 bytes    | 58.06 MiB       |
|       508 | sql/slave_worker                | -7894888734 bytes | 28.29 GiB       |
44 rows in set (0.28 sec)
[17 Sep 2018 15:31]
MySQL Verification Team
Hi, I need you to answer a few questions if possible.
1. How much time does it take for the sql/Log_event memory to jump to 7G?
2. Do you, on the master, have queries that explicitly create temporary tables, and how often?
3. Do you, on the master, load data infile into tables, and how often?
4. What client do you use on the master?
5. How many connections do you have on the slave?
thanks Bogdan
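Some of these figures can be read straight from status counters; an illustrative sketch (not part of the original exchange), run on the master and the slave respectively:

SHOW GLOBAL STATUS LIKE 'Created_tmp%';          -- master: temporary-table activity
SHOW GLOBAL STATUS LIKE 'Threads_connected';     -- slave: current connections
SHOW GLOBAL STATUS LIKE 'Max_used_connections';  -- slave: peak connections since startup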
[20 Sep 2018 5:59]
Suantai Rousi
Thank you for your reply. All problematic instances use jemalloc-5.1.0 and MTS.
question 1: How much time does it take for the sql/Log_event memory to jump to 7G?
answer 1: 3~5 days
question 2: Do you, on the master, have queries that explicitly create temporary tables, and how often?
answer 2: no explicitly created temporary tables
question 3: Do you, on the master, load data infile into tables, and how often?
answer 3: no load data infile
question 4: What client do you use on the master?
answer 4: JDBC
question 5: How many connections do you have on the slave?
answer 5: The secondary node is only used for emergency backup; there are no service connections.
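For completeness, the MTS configuration on an affected replica can be confirmed with a quick check; a small sketch (slave_parallel_workers > 0 indicates multi-threaded replication is in use):

SHOW GLOBAL VARIABLES LIKE 'slave_parallel%';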
[3 Oct 2018 9:36]
MySQL Verification Team
Hi, I still can't reproduce this. Can you reproduce this without jemalloc? Can you reproduce this with an Oracle-made binary? I can't verify this with any of the Oracle binaries I tested. thanks Bogdan
[3 Oct 2018 17:32]
MySQL Verification Team
btw, since you have no user connections on the slave machine that goes haywire, can you drop thread_cache_size to zero to see if that changes anything? thanks b.
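A minimal sketch of that change (thread_cache_size is dynamic in 5.7, so no restart is needed; the status counter shows whether any cached threads remain):

SET GLOBAL thread_cache_size = 0;
SHOW GLOBAL STATUS LIKE 'Threads_cached';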
[8 Oct 2018 13:52]
Suantai Rousi
Thanks for your reply. The service is stable after removing jemalloc-5.1.0. Does this mean that it is an unknown bug in jemalloc-5.1.0?
[8 Oct 2018 14:09]
MySQL Verification Team
I can't really say about jemalloc, maybe you should report a bug there? https://github.com/jemalloc/jemalloc/issues all best Bogdan
[29 Aug 2020 9:05]
Christos Chatzaras
I am not sure if it was exactly the same issue, but on most servers the RAM usage was higher than the limits set in my.cnf. I use MySQL 5.7 and FreeBSD 12.1; FreeBSD 12.1 uses jemalloc 5.1. On one specific server the RAM usage was so high (>40GB) that even all swap was used and I had to restart MySQL to avoid OOM. I changed it to tcmalloc and the RAM usage is now ~5GB. More information here: https://forums.freebsd.org/threads/freebsd-12-x-and-mysql-5-7-and-importing-file-with-lots...
[29 Aug 2020 9:36]
MySQL Verification Team
Christos, for your case I didn't try it, but I suspect that if you set a limit with --parser-max-mem-size=# (maximum amount of memory available to the parser), the long query with many values will error out and memory will stay stable. Just because max_allowed_packet is 64M, meaning the query itself can be that large, parsing that query can require many times more memory. Hence this variable was introduced.
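A sketch of applying such a cap (parser_max_mem_size is a dynamic global variable in 5.7; the 1 GiB value below is only an example, and a statement whose parsing exceeds the cap fails instead of consuming the memory):

SET GLOBAL parser_max_mem_size = 1073741824;        -- example cap: 1 GiB
SHOW GLOBAL VARIABLES LIKE 'parser_max_mem_size';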
[29 Aug 2020 10:50]
Christos Chatzaras
I switched back to jemalloc and added to my.cnf: parser-max-mem-size=8000000000. But usage still goes above 8GB:

PID   USERNAME THR PRI NICE SIZE RES STATE  C TIME  WCPU    COMMAND
87281 mysql    39  20  0    25G  22G select 7 58:25 178.17% mysqld

FreeBSD 12.1 uses jemalloc 5.1.0, and FreeBSD CURRENT (the development branch) uses 5.2.1. In a few days I will set up a staging server and see whether jemalloc 5.2.1 solves the issue.