Bug #83047 Memory usage gradually increases and brings server to halt
Submitted: 20 Sep 2016 7:40 Modified: 27 Sep 2017 11:37
Reporter: Machiel Richards Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.7 OS:Ubuntu
Assigned to: CPU Architecture:Any
Tags: Memory, swapping

[20 Sep 2016 7:40] Machiel Richards
Description:
MySQL server 5.7 installed with a database size of around 800Gb

MySQL gradually utilizes system memory to the point where there are no system memory or cache available bringing the system to a halt. 
The server currently have 128Gb of ram and SSD drives as storage and even with almost no traffic, memory usage keep on increasing. 

Swap space was added later on to try and mitigate the restart of database servers and was set to the same amount as memory, however mysqld process uses all system memory, then starts to use swap memory up to the point where the database is killed by OOM errors. 

From database restart up to a total shutdown takes about a week , thus database needs to be restarted about once a week. 

memory instruments have been enabled to try and find the cause, however nothing was found to be out of the ordinary. 

using some monitoring commands, it simply states that the mysqld process itself is taking up all the memory. 

Innodb buffer pool size is currently set to about 70% of system memory. 

How to repeat:
Install mysql 5.7 and run for about a week or so with medium amount of traffic (mostly write traffic). 

Suggested fix:
Investigate MySQL background memory allocations
[20 Sep 2016 7:47] Machiel Richards
PS... Swappiness on OS level have been set to 1.
[31 Oct 2016 21:13] Miguel Solorzano
Please provide the my.cnf file (private if you wish). Thanks.
[1 Dec 2016 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[10 Aug 2017 8:34] Peter Bukaowski
Hello

I am observing the same situation on multiple mysql installations running mysql 5.7.19 on ubuntu server 16.04.
[10 Aug 2017 8:45] Machiel Richards
Hi Peter

    I managed to solve this issue for us by enabling a different memory allocation library called jemmalloc. 

    When using the default libraries, the issue comes up, however when enabling jemmalloc libraries,everything became stable again. 

   Steps to follow for this on ubuntu was as follows : 

       - apt-get install libjemalloc1
       - restart mysql service

    You can then use percona toolkit (latest version) to verify that the new memory allocation library is being used. 

    This resolved the issue for me, even though the original bug does not get fixed, but at least a fix to resolve the issues. 

     I have done a whole writeup on the issue for myself, and you are welcome to pm me and I will forward it to you. 

Regards
[10 Aug 2017 11:00] Peter Bukaowski
Hello Machiel

Thanks for quick reply , I will try this . Could you send me doc? I will PM you soon.
[10 Aug 2017 12:31] Peter Bukaowski
HI
Michel I cannot find option to send PM so here is my email :  pzasada@outlook.com

regards
P
[15 Aug 2017 0:01] Marcelo Goncalves
Hi,

I would like to you to test one problem I was able to reproduce and it is completely compatible with the scenario described here.
Dear Mr Machiel Richards, do you use imports of big inserts into database like the ones mysqldump creates?
  (insert into <table> (values),(values)...(values);
Please let me know if so.

How I reproduced a memory issue that can be related to this one.

Startup a new MySQL Instance (5.6 or 5.7, I think both are affected) I used 5.6.36.
Setup only 1 variable in my.cnf: max_allowed_packet=96M

Create a dummy table like this one:

create database db;
  use db;
  create table db.dummy (acol bigint, anything varchar(256)) engine=innodb;

Create a SQL file (load_test.sql) like this one:
INSERT INTO db.dummy VALUES
(1,'This is a dummy value'),
(1,'This is a dummy value'),
...
(1,'This is a dummy value');

Repeat the values until you reach a file size bigger than 64MB. It needs to be bigger than 64MB.

Take note of memory allocation and how it is.

Run the following:
mysql db < load_test.sql

This might cause mysqld memory to explode! Run with bigger file size, bigger max_allowed_packet if needed.
[15 Aug 2017 0:13] Marcelo Goncalves
Hi,

In my case using the LIB did not help. I will open a new one.

# Memory management library ##################################
jemalloc enabled in MySQL config for process with ID 6314
Using jemalloc from /usr/lib64/libjemalloc.so.1
# The End ####################################################
[15 Aug 2017 8:20] Peter Bukaowski
I followed Machiel instructions and so far everything looks good . I deployed fix on two instances and  memory usage seems to be under control after 4 days . Fingers crossed.
[25 Sep 2017 11:01] Athi Nivas
Hi,

We're facing a similar issue like this while using MySQL 5.7.18 Community edition in our CentOS 7 systems(CentOS Linux release 7.3.1611 (Core), Linux 172.20.19.194 3.10.0-514.26.2.el7.x86_64).

After many concurrent connections to the server, there seems a substantial increase in the amount of memory being used by mysql process & eventually results in OS crash due to memory overflow. Think, the memory allocated for each connection(s) are not set to free properly.

So, we tried using the TCMalloc(http://goog-perftools.sourceforge.net/doc/tcmalloc.html) for memory allocation. We found the memory is under control & everything seems normal after that.

As both innodb_use_sys_malloc and innodb_additional_mem_pool_size were deprecated in MySQL 5.6 and removed in MySQL 5.7. I'm not sure which allocator went awful.

Does this a bug in MySQL innodb memory allocator (or) the system allocation went wrong? Shall we proceed with TCMalloc? I believe MySQL itself recommends it as an optional memory-allocator in its documentation.

https://dev.mysql.com/doc/refman/5.7/en/mysqld-safe.html#option_mysqld_safe_malloc-lib

Thanks,
Athi
[26 Sep 2017 11:54] Bogdan Kecman
Hi,
first, I'm getting some confused information here, what binary are we talking about, one provided by dev.mysql.com or one provided by repo or one provided by percona?

With dev.mysql.com binary I'm unable to reproduce this issue. 

all best
Bogdan
[26 Sep 2017 19:59] Athi Nivas
Hi Bogdan,

With this binary,( https://dev.mysql.com/get/Downloads/MySQL-5.7/mysql-5.7.18-linux-glibc2.5-x86_64.tar ) we're getting the similar memory issue. Could you just check with that?? Because, its completely reproducible at our end. 

OS: CentOS Linux release 7.3.1611 (Core)

Thanks,
Athi
[27 Sep 2017 11:37] Bogdan Kecman
Hi,

Thanks for the report, I was trying to reproduce it the wrong way that's why I failed. It is verified now as described. The cue was to do a huge bulk insert, not a bunch of small ones :)

Jump in memory usage is obvious, and is not coming back :(

122738 ?        Sl     2:58    675 23127 1518412 625512 15.5 /home/arhimed/mysql/5.7.19/bin/mysqld
122738 ?        Sl     3:49  52713 23127 4598604 3592692 89.1 /home/arhimed/mysql/5.7.19/bin/mysqld

Thanks
Bogdan
[28 Sep 2017 7:53] Athi Nivas
Hi Bogdan,

Yes. And changing the memory allocator used for memory operations solved the issue. We've tried with TCMalloc as mentioned here ( https://dev.mysql.com/doc/refman/5.7/en/mysqld-safe.html#option_mysqld_safe_malloc-lib ).

Kindly share your thoughts on whether we can proceed with this memory allocator change in our production environment?? is it a safe bet on relying external memory allocator? (or) we should downgrade to a MySQL 5.7 minor version, which doesn't reproduce this issue. 

Much Thanks, 
Athi
[28 Sep 2017 8:03] Bogdan Kecman
Hi,
TCMalloc should be safe for production but this really needs to be your decision or something you comm to agreement with consulting/support team. From the bugs perspective I can only say that it's really a bug, a weird one imo, and that I assume it will be fixed quickly.

all best
Bogdan
[3 Oct 2017 9:23] Athi Nivas
Hi Bogdan, 

Thanks a lot. Please update this thread once the issue is fixed. 

Regards,
Athi
[20 Nov 2017 13:27] Van Stokes
Ubuntu 16.04.03
MySQL: 5.7.20-log (CE; MySQL apt repo)
Replication: 4xMasters Circular, 6xSlaves
Server type: Dedicated; Non-VM
Server physical RAM: 32GB

We hve the same problem. Excessive memory usage over-time (according to TOP, PS, etc) causing SWAP to be utilized. However, I notice that it happens on my backup/slave servers the most. We backup using "mysqldump" and we dump large databases with a large packet size. We do not have performance schema enabled on the slaves.

And, according to vmstat, we are swapping in/out large number of pages.

--- mysqld.cnf ---

[mysqldump]
quick
quote-names
max-allowed-packet = 16M

[mysqld]
innodb-buffer-pool-instances = 8 # one per physical core
innodb-buffer-pool-size = 16G
innodb-log-file-size = 1G
innodb-log-buffer-size = 8M
key-buffer-size = 128M

--- /etc/sysctl.conf ---

vm.swappiness=0
vm.dirty_ratio=5

--- Command line diagnosis ----

root@mysql01:~# ps -aux| grep mysql
mysql     2097 26.3 57.0 23458648 18808512 ?   Sl   Nov12 2996:38 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid

root@india-mysql01:~# free -h
              total        used        free      shared  buff/cache   available
Mem:            31G         18G        3.6G        5.4M        9.3G         12G
Swap:           15G        1.5G         14G

root@india-mysql01:~# service mysql stop
root@india-mysql01:~# free -h
              total        used        free      shared  buff/cache   available
Mem:            31G        304M         21G        5.4M        9.4G         30G
Swap:           15G        328M         15G

root@india-mysql01:~# service mysql start
root@india-mysql01:~# free -h
              total        used        free      shared  buff/cache   available
Mem:            31G        4.8G         17G        5.4M        9.2G         26G
Swap:           15G        328M         15G

Why was MySQL causing ~1.5G of SWAP to utilized?
[7 Dec 2017 15:15] Max Xu
Hi Micheal,

My name is Max, and my company seem to have the same problem here, would you please send me a copy of your investigation report? I would really appreciate your help, Thanks!

max.xu@live.ca
[21 May 2018 7:46] Bin Su
Posted by developer:
 
Hi Sachin,

I pinged the ACS who reported the memory leak by running sysbench. Some points:
1. He ran sysbench on customer's server - SUSE.
2. He ran with 10 tables.
3. Let's say the buffer pool is about 10G, they prepare the sysbench(so only inserts) to make the tables' size become a bit more than BP size and stop sysbench.
4. It looks like so far so good. However, after about half an hour, he observed that the RSS for mysql server increased - this is why they thought the problem was reproduced.

So could you please try the same with sysbench(not only 1 table as you did last time) and observe the RSS? I guess supports have provided enough information(software version, etc.) about the repro?
[13 Dec 2018 3:09] 镇熙 林
Hi Micheal,

We seem to have the same problem here, would you please send me a copy of your investigation report? I would really appreciate your help, Thanks!

linzx@sdc.icbc.com.cn
[26 Mar 13:59] DB Support
Same problem here:
- Redhat 7.3 ( 3.10.0-514.el7.x86_64)
- MySQL 5.6.31
- Innodb_buffer_pool set to 60Gb (128Gb of Total RAM).

Hugepages set, THP disable and Swappiness to 1:
[root@RJBAR-ASTHPP-SDB01 astro_platform]# cat /etc/sysctl.conf
kernel.shmmax=96636764160
kernel.shmall=23592960
vm.nr_hugepages=32000
vm.hugetlb_shm_group=492
vm.dirty_background_ratio=3
vm.dirty_ratio=40
vm.dirty_expire_centisecs=500
vm.dirty_writeback_centisecs=100
vm.swappiness=1
vm.zone_reclaim_mode=1 

TCMalloc library in use:
[root@RJBAR-ASTHPP-SDB01 lib64]# lsof /usr/lib64/libtcmalloc.so.4
COMMAND     PID  USER  FD   TYPE DEVICE SIZE/OFF    NODE NAME
mysqld.bi 50154 mysql mem    REG  253,0   301080 1062225 /usr/lib64/libtcmalloc.so.4.4.5

Numa policy set to interleave to mysqld PID:
[root@RJBAR-ASTHPP-SDB01 lib64]#  numastat mysqld -v 

Per-node process memory usage (in MBs) for PID 50154 (mysqld.bin)
                           Node 0          Node 1           Total
                  --------------- --------------- ---------------
Huge                     31460.00        31458.00        62918.00
Heap                     26728.84          904.89        27633.73
Stack                       14.33            0.17           14.50
Private                      6.79            5.35           12.14
----------------  --------------- --------------- ---------------
Total                    58209.96        32368.41        90578.37

Tasks: 416 total,   1 running, 415 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  0.1 sy,  0.0 ni, 99.1 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13182334+total,   368512 free, 95328440 used, 36126392 buff/cache
KiB Swap: 16777212 total, 15903748 free,   873464 used. 34686992 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                                                   
50154 mysql     20   0 89.737g 0.026t   8624 S  23.3 21.5  60234:38 mysqld.bin  

[root@RJBAR-ASTHPP-SDB01 lib64]# free -m
              total        used        free      shared  buff/cache   available
Mem:         128733       93101         356        1088       35276       33867
Swap:         16383         852       15531

Regards,
Hugo