Bug #97935 | Memory leak in client connection using information_schema | |
---|---|---|---|
Submitted: | 10 Dec 2019 2:01 | Modified: | 21 Jan 2021 13:09 |
Reporter: | Daniel Nichter (OCA) | Email Updates: | |
Status: | Won't fix | Impact on me: | |
Category: | MySQL Server: Information schema | Severity: | S2 (Serious) |
Version: | 5.7.28 | OS: | Any |
Assigned to: | MySQL Verification Team | CPU Architecture: | x86 |
Tags: | 5.7, client, connect, disconnect, information_schema, massif, memory-leak, valgrind, vio |
[10 Dec 2019 2:01]
Daniel Nichter
[10 Dec 2019 13:01]
MySQL Verification Team
Hi, I was testing something similar recently but could not reliably reproduce it. Can you please give me your config file? Also, just to confirm: you stop the test, disconnect all the clients, wait a few minutes, and the RAM is not freed, right? thanks
[10 Dec 2019 13:32]
Daniel Nichter
my.cnf
Attachment: my.cnf (application/octet-stream, text), 6.44 KiB.
[10 Dec 2019 13:33]
Daniel Nichter
SHOW GLOBAL VARIABLES
Attachment: global_vars.txt (text/plain), 15.19 KiB.
[10 Dec 2019 13:43]
Daniel Nichter
Hi Bogdan, I've added the config file and the SHOW GLOBAL VARIABLES output. Yes, when I stop my test program, all the test connections disconnect, but the memory is never freed. I've tried many things to get MySQL to free some memory (without restarting MySQL), but nothing works. We have many real databases exhibiting this issue and, so far, none of them free the memory, even when we fail over away from a node to make it read-only and inactive (no connections apart from replication and periodic monitoring). The only workaround is restarting MySQL. Thanks!
[10 Dec 2019 13:45]
MySQL Verification Team
Thanks, all clear. Let me try to reproduce. You don't have any events running? all best
[10 Dec 2019 13:54]
Daniel Nichter
No events. My test instance was newly provisioned, so it has no real data or usage. Thanks for looking into this! Let me know if you need more info or want me to run other experiments on my side.
[10 Dec 2019 16:24]
MySQL Verification Team
Hi Daniel, I can get mysqld to increase RAM usage and not return it, but I cannot get it to allocate more RAM than expected; I reach a "cap" and it does not allocate beyond that cap. Did you manage to get it to OOM, or...?
[10 Dec 2019 17:06]
Daniel Nichter
Hi Bogdan, We don't OOM-kill MySQL. Instead, on certain hosts running many MySQL instances, MySQL will eventually use all RAM, which causes swapping, then high system load, then (sometimes) an unresponsive system (unless we catch it ahead of time, fail over, and restart all the MySQL instances on the host). What cap are you reaching? How much do you expect it to allocate at most? (What are your setup and results in general?)
[10 Dec 2019 17:41]
MySQL Verification Team
Not reproducing :( I'm hitting the slave with 100, 200, 250, 500, 1000, 200, 200, 500 threads, each thread doing approximately 10000 selects (simple PK lookups, ranges, long ranges, etc.). RAM does increase while the slave is being hit, but only up to a point, and when I stop, it drops back down (not completely, of course, but to a point):
PID %MEM RSZ COMMAND
38213 9.6 387600 mysqld
38213 14.1 566580 mysqld
38213 14.2 571596 mysqld
38213 14.2 571596 mysqld
38213 14.3 573720 mysqld
38213 14.3 573720 mysqld
38213 14.3 573720 mysqld
38213 14.3 573720 mysqld
38213 14.3 573720 mysqld
38213 14.3 573936 mysqld
38213 14.3 573808 mysqld
38213 14.3 574656 mysqld
38213 14.3 574528 mysqld
38213 14.3 573976 mysqld
38213 14.3 574504 mysqld
38213 14.3 573800 mysqld
38213 14.3 573800 mysqld
38213 14.3 573800 mysqld
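For anyone who wants to generate a comparable load, a rough sketch using mysqlslap is shown below; the host, credentials, schema name, and query are placeholders, not the exact ones used in this test:

#!/bin/sh
# Rough load generator: ramp concurrency up and down, with roughly 10000 simple
# SELECTs per client, similar in spirit to the test described above.
# Placeholder host/credentials/schema/query -- adjust to your environment.
for c in 100 200 250 500 1000 200 200 500; do
    mysqlslap --host=127.0.0.1 --user=test --password=test \
              --create-schema=sbtest --no-drop \
              --concurrency="$c" \
              --number-of-queries=$((c * 10000)) \
              --query="SELECT * FROM t1 WHERE id = 42"
done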
[10 Dec 2019 17:43]
MySQL Verification Team
This is all connecting through the socket; I'm running a test now connecting to the network port to see if that makes a difference.
[10 Dec 2019 20:16]
Daniel Nichter
I think the issue might be Docker. I managed to un-containerize my test MySQL instance and run it outside Docker and I could not reproduce the issue. Outside Docker, MySQL behaves as expected: some memory is allocated, but when the test stops, RSS stops increasing. There's about 35% memory increase, which I'd call normal. Put it back in Docker and the issue is easily reproducible. Given that Valgrind attributes the memory allocation to mysqld, it suggests that Docker is interfering with low-level memory allocation. There's >150% memory increase. We can put this bug report on hold while I continue to test.
[10 Dec 2019 20:36]
MySQL Verification Team
Hi, I understood you can reproduce this outside of Docker too. I've been running this for a few hours and... I don't see any invalid RAM allocations. RAM usage is stable and drops down after the connections are released. I tried both with and without prepared statements, both through the socket and the port, with 200 to 1500 connections at the same time running all types of read-only queries. Last run:
38213 14.3 573800 mysqld
38213 14.3 573920 mysqld
38213 14.3 574504 mysqld
38213 14.3 574608 mysqld
38213 14.3 574464 mysqld
38213 14.3 574568 mysqld
38213 14.3 574480 mysqld
38213 14.3 574616 mysqld
38213 14.3 574504 mysqld
38213 14.3 574664 mysqld
38213 14.3 574504 mysqld
38213 14.3 574576 mysqld
38213 14.3 574448 mysqld
38213 14.3 574704 mysqld
38213 14.3 574528 mysqld
38213 14.3 574600 mysqld
38213 14.3 574512 mysqld
38213 14.3 574600 mysqld
38213 14.3 574488 mysqld
38213 14.3 574632 mysqld
38213 14.3 574520 mysqld
38213 14.3 574608 mysqld
38213 14.3 574568 mysqld
38213 14.3 574704 mysqld
38213 14.3 574528 mysqld
38213 14.3 574504 mysqld
38213 14.3 574608 mysqld
38213 14.3 574696 mysqld
38213 14.3 574640 mysqld
38213 14.3 574432 mysqld
38213 14.3 574568 mysqld
38213 14.3 574480 mysqld
38213 14.3 574600 mysqld
38213 14.3 574472 mysqld
38213 14.3 573856 mysqld
The Docker issue, on the other hand, is probably some Docker bug. all best
[10 Dec 2019 20:50]
Daniel G.
We definitely saw a memory leak with this release and needed to pin back to a previous package version. This was on Ubuntu 16.04, using a localhost MySQL server and client libs to support Kamailio SIP Server instances.
[11 Dec 2019 14:41]
Daniel G.
In our case it was the client library that exhibited the memory leak; restarting the client process (kamailio) resolved the issue. Left unchecked, it would crash with an OOM; core dumps pointed to libmysqlclient.so.20. Increased load (queries) accelerated the memory leak. Downgrading to 5.7.11 (the only available downgrade) was the final resolution. Also, I believe it was not present in 5.7.26. This was on 4 different AWS instances running Ubuntu 16.04 LTS, accessing independent localhost MySQL DBs.
[13 Dec 2019 20:50]
Daniel Nichter
Hi again Bogdan, I've isolated this to the information_schema. Previously, I mentioned that memory usage continued to grow even after I stopped my test. This seemed really weird, so I thought more about it and realized: my test instance was not fully isolated. We have monitoring running every 30s. When I stopped the monitor, the memory usage growth stopped, too. So I looked into what the monitoring was doing and narrowed it down to information_schema queries. There are two ways to reproduce using a standard 5.7.28 Community Server build (running on bare metal--no Docker):

Method 1)
1. Start mysqld
2. Start monitor.sh (which I'll upload)
3. Start logging RSS (which I'll also upload)
4. Warm up MySQL with a sequence of clients and QPS like originally described

I did a 2h test and RSS grew from 175 MB to 1.6 GB (I'll upload 2h-rss-mysqld-direct.csv). This was a completely isolated mysqld process, so nothing was running but the monitoring queries against information_schema.

Method 2)
1. Start mysqld
2. Start logging RSS (which I'll also upload)
3. Warm up MySQL with a sequence of clients and QPS like originally described
4. When step 3 stops, wait 15-20 minutes for RSS to stabilize, then...
5. Exec "SHOW GLOBAL STATUS" every 30s
6. Watch RSS grow.
7. Stop step 5 and wait about 15 min for RSS to stabilize again.

For method 2, see method-2-rss-graph.png that I'll upload. As far as I can tell from the source code, SHOW GLOBAL STATUS is implemented as an information_schema.global_status query under the hood, which explains why it causes memory usage to increase, too. I started step 5 again about 1.5 hours later and it caused RSS to increase even more until I stopped it. (Not shown in the uploaded graph.)

I definitely think there's a memory leak here. Especially in my 2-hour method 1 run: there's no reason MySQL should have allocated 1.6 GB. And it's an isolated mysqld instance on bare metal, so it's definitely not a Docker issue. Hope this helps reproduce it on your end!
[13 Dec 2019 20:54]
Daniel Nichter
monitor.sh
Attachment: monitor.sh (text/x-sh), 333 bytes.
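The attached script itself is not reproduced in this report. Purely as an illustration of the shape of such a monitor, a 30-second loop could look roughly like the following (the credentials and exact query mix are assumptions, not the contents of the real monitor.sh):

#!/bin/sh
# Hypothetical 30-second monitoring loop (illustrative only, not the attached monitor.sh).
# Repeatedly runs a status query and an information_schema query against the test instance.
while true; do
    mysql -h 127.0.0.1 -u monitor -pmonitor -e 'SHOW GLOBAL STATUS' > /dev/null
    mysql -h 127.0.0.1 -u monitor -pmonitor -e \
        'SELECT * FROM information_schema.tables' > /dev/null
    sleep 30
done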
[13 Dec 2019 20:55]
Daniel Nichter
Log changes to RSS of given PID
Attachment: record-rss.sh (text/x-sh), 361 bytes.
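Again, the attachment is not reproduced here; a minimal sketch of an RSS logger for a given PID (the interval and CSV format are assumptions and may differ from the real record-rss.sh):

#!/bin/sh
# Hypothetical RSS logger (illustrative only, not the attached record-rss.sh).
# Prints "unix_timestamp,rss_kb" for the given PID every 10 seconds until it exits.
PID="$1"
while kill -0 "$PID" 2>/dev/null; do
    echo "$(date +%s),$(ps -o rss= -p "$PID" | tr -d ' ')"
    sleep 10
done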
[13 Dec 2019 20:56]
Daniel Nichter
2h output of record-rss.sh from running monitor.sh
Attachment: 2h-rss-mysqld-direct.csv (text/csv), 14.83 KiB.
[13 Dec 2019 20:58]
Daniel Nichter
Graph of mysqld RSS from running method 2
Attachment: method-2-rss-graph.png (image/png, text), 101.26 KiB.
[13 Dec 2019 23:49]
MySQL Verification Team
Hi, Thanks for the test case, I'll retest first thing Monday morning. No replication in this one? Is a single server all that's required for reproduction?
[15 Dec 2019 16:29]
Daniel Nichter
Great, thanks! -- Yes, just a standalone server, no replication.
[2 Jan 2020 19:29]
Daniel Nichter
Any success reproducing this?
[16 Jan 2020 0:54]
MySQL Verification Team
Hi, Thanks for the test case, verified!
[16 Jan 2020 15:25]
Daniel Nichter
Excellent, thank you Bogdan! I think fixing this bug will be a big win for MySQL because information_schema queries and SHOW GLOBAL STATUS are ubiquitous.
[17 Jun 2020 6:41]
Demon Chen
I might be hitting the same situation. We deployed 4 instances on one server, each with a 12G InnoDB buffer pool (total RAM is 128G). We also run SHOW GLOBAL STATUS and some other queries that access information_schema. The Valgrind Massif heap profiler attributes the allocation to the call stack around mysql_socket_vio_new:

100.00% (40,373,030,912B) (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
->99.82% (40,301,461,504B) 0x61122DA: mmap (in /lib64/libc-2.12.so)
| ->38.06% (15,367,929,856B) 0x60A42E0: new_heap (in /lib64/libc-2.12.so)
| | ->21.61% (8,724,152,320B) 0x60A4ADD: arena_get2 (in /lib64/libc-2.12.so)
| | | ->21.61% (8,724,152,320B) 0x60A7793: malloc (in /lib64/libc-2.12.so)
| | | ->20.11% (8,120,172,544B) 0xF36597: my_raw_malloc (my_malloc.c:191)
| | | | ->20.11% (8,120,172,544B) 0xF36597: my_malloc (my_malloc.c:54)
| | | | ->20.11% (8,120,172,544B) 0x12BA2F6: mysql_socket_vio_new (vio.c:274)
| | | | ->20.11% (8,120,172,544B) 0xE9E53A: Channel_info::create_thd() (channel_info.cc:28)
| | | | ->20.11% (8,120,172,544B) 0xDDED3D: Channel_info_tcpip_socket::create_thd() (socket_connection.cc:222)
| | | | ->20.11% (8,120,172,544B) 0xDDBD70: init_new_thd (connection_handler_per_thread.cc:176)
| | | | ->20.11% (8,120,172,544B) 0xDDBD70: handle_connection (connection_handler_per_thread.cc:257)
| | | | ->20.11% (8,120,172,544B) 0xFA04B0: pfs_spawn_thread (pfs.cc:2190)
| | | | ->20.11% (8,120,172,544B) 0x4C2A9D0: start_thread (in /lib64/libpthread-2.12.so)
| | | | ->20.11% (8,120,172,544B) 0x61158FC: clone (in /lib64/libc-2.12.so)
| ->24.27% (9,797,894,144B) 0x60A4223: new_heap (in /lib64/libc-2.12.so)
| | ->20.78% (8,388,608,000B) 0x60A4ADD: arena_get2 (in /lib64/libc-2.12.so)
| | | ->20.61% (8,321,499,136B) 0x60A7793: malloc (in /lib64/libc-2.12.so)
| | | | ->19.95% (8,053,063,680B) 0xF36597: my_raw_malloc (my_malloc.c:191)
| | | | | ->19.95% (8,053,063,680B) 0xF36597: my_malloc (my_malloc.c:54)
| | | | | ->19.95% (8,053,063,680B) 0x12BA2F6: mysql_socket_vio_new (vio.c:274)
| | | | | ->19.95% (8,053,063,680B) 0xE9E53A: Channel_info::create_thd() (channel_info.cc:28)
| | | | | ->19.95% (8,053,063,680B) 0xDDED3D: Channel_info_tcpip_socket::create_thd() (socket_connection.cc:222)
| | | | | ->19.95% (8,053,063,680B) 0xDDBD70: init_new_thd (connection_handler_per_thread.cc:176)
| | | | | ->19.95% (8,053,063,680B) 0xDDBD70: handle_connection (connection_handler_per_thread.cc:257)
| | | | | ->19.95% (8,053,063,680B) 0xFA04B0: pfs_spawn_thread (pfs.cc:2190)
| | | | | ->19.95% (8,053,063,680B) 0x4C2A9D0: start_thread (in /lib64/libpthread-2.12.so)
| | | | | ->19.95% (8,053,063,680B) 0x61158FC: clone (in /lib64/libc-2.12.so)
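For reference, the "(page allocation syscalls) mmap/mremap/brk" header above suggests Massif was run with page-level tracking; a profile of this shape can be collected roughly as follows (paths and options are illustrative, not the exact invocation used):

# Start the server under Massif, tracking page allocations (mmap/brk) rather than only malloc.
valgrind --tool=massif --pages-as-heap=yes --massif-out-file=/tmp/massif.out.%p \
    ./bin/mysqld --defaults-file=/etc/my.cnf --user=mysql

# After reproducing the growth, render the allocation tree from the output file.
ms_print /tmp/massif.out.<pid>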
[12 Oct 2020 16:26]
MySQL Verification Team
Hi Daniel, Demon, We are trying to fix this bug but are having issues with reproduction. Using the provided procedure I do see an increase in RAM usage, but I'm not reproducing the valgrind output:
1. master-slave 5.7.28, slave running with valgrind --tool=massif
2. on the master I'm running read-write load on 40 threads
3. on the slave I'm running reads on 8 threads
4. on the slave I'm running the monitor script and collecting RAM usage
[13 Oct 2020 13:25]
MySQL Verification Team
Hi Daniel, Demon, We believe this bug report covers 2 different issues:
1. connecting to and disconnecting from the slave increases RAM usage.
2. information_schema / SHOW GLOBAL ... increases RAM usage.
Now, #1 I reproduced, but it is capped; the increase in RAM stops after a while and it does not look like a big problem. #2, on the other hand, I'm not reproducing, and my colleague from the sustaining team can't reproduce it either. So, for the second (SHOW GLOBAL / I_S) issue we need a tad more data:
1. Does this happen only on the slave, or on a standalone mysqld as well? Or on the master, or...?
2. Can you give us a complete config of the setup so we can try to duplicate it fully?
thanks Bogdan
[13 Nov 2020 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[16 Nov 2020 10:23]
Demon Chen
Hi, I used valgrind to monitor the master node. We found the master node's memory grows faster than the slave node's. Another way to test: create 10000 tables in one instance, most of them with no data or very little data. Then SELECT * FROM information_schema.tables and monitor the RAM usage.
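A rough sketch of that reproduction attempt (schema name, credentials, table definition, and polling interval are placeholders):

#!/bin/sh
# Hypothetical reproduction attempt: create many mostly-empty tables, then repeatedly
# scan information_schema.tables while the RSS of mysqld is recorded separately.
MYSQL="mysql -h 127.0.0.1 -u root -psecret"
$MYSQL -e 'CREATE DATABASE IF NOT EXISTS leak_test'
for i in $(seq 1 10000); do
    $MYSQL leak_test -e "CREATE TABLE IF NOT EXISTS t$i (id INT PRIMARY KEY, v VARCHAR(64))"
done
while true; do
    $MYSQL -e 'SELECT * FROM information_schema.tables' > /dev/null
    sleep 30
done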
[23 Nov 2020 13:59]
MySQL Verification Team
Hi Demon, Can you give us the complete config file for the system where you reproduced this? Three of us tried to reproduce it using the exact procedure you described and we do not see any growth in RAM usage. thanks Bogdan
[2 Dec 2020 20:55]
Kenny Gryp
Hello Daniel &|| Demon, While we were able to reproduce this at times in the past, we are not able to reproduce it any longer. I tried reproducing with Daniel's instructions and the provided configuration & monitoring scripts, but I fail to reproduce the problem. There's a part in the instructions that says "create some workload and run it with N threads for M time". This could be a lot of different things. What type of workload? Which tables? How many tables? Quite a lot of people have marked "impact on me" and we want to solve this memory leak, but we're kind of stuck right now. Can the full steps to create this workload be uploaded as well? Or, alternatively, could Demon provide the test that he ran with 10,000 tables to get into this issue? Regards, Kenny
[3 Dec 2020 14:03]
MySQL Verification Team
Hi, One thing I never asked that may be relevant - can you reproduce this on 8.0.22? As for 5.7, I have an instance in production that has over 10k tables, where I select from I_S often (around 1000x per day, every day), and I have no RAM issues there. Can you help us with the creation of a test case?
[3 Dec 2020 23:04]
MySQL Verification Team
Hi, Summarized - the important questions we need an answer to:
1. Have you tried, and have you reproduced, this on 8.0?
2. Are you doing other I_S queries besides "select * from information_schema.tables"?
3. The "exact" number of tables you have on the system where you are reproducing this.
4. You said "just do...", but can you give us a procedure that reproduces it "every time"?
* install 5.7
* set config to xyz
* execute script x
Thanks
[24 Dec 2020 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[20 Jan 2021 5:28]
MySQL Verification Team
Hi, Setting this as "not a bug" as: I_S in 5.7 materializes the data into memory and then serves it to the client from there. If the materialized result is large, it spills onto disk, but if the tables are just the right size, the memory consumption on the server side will indeed grow and grow. This is not going to be changed; if you are affected, you should upgrade to 8.0. I_S was rewritten for 8.0, so please verify on 8.0, and if it repeats, please re-open the bug with the detailed information that we requested. thanks Bogdan
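As a side note, one way to observe whether a given I_S query's materialization stayed in memory or spilled to disk (assuming the usual meaning of the Created_tmp_* status counters and that tmp_table_size/max_heap_table_size set the spill threshold) is to compare the session counters around the query in a single client session:

# Run the counter checks and the I_S query in one session so the session-level
# counters are comparable; a jump in Created_tmp_disk_tables after the query
# suggests the materialized result spilled to disk.
mysql -h 127.0.0.1 -u root -psecret -e "
SHOW SESSION STATUS LIKE 'Created_tmp%';
SELECT COUNT(*) FROM information_schema.tables;
SHOW SESSION STATUS LIKE 'Created_tmp%';"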
[21 Jan 2021 9:27]
Roel Van de Paar
Has anyone tried with ASAN or UBSAN yet? Set these options before doing so:
export ASAN_OPTIONS=quarantine_size_mb=512:atexit=1:detect_invalid_pointer_pairs=3:dump_instruction_bytes=1:abort_on_error=1
export UBSAN_OPTIONS=print_stacktrace=1
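For anyone willing to try, a rough outline of a sanitizer-enabled source build (cmake option names as documented for MySQL source builds; all paths are placeholders):

# Configure and build mysqld with AddressSanitizer and UndefinedBehaviorSanitizer.
cmake /path/to/mysql-5.7-source -DWITH_DEBUG=1 -DWITH_ASAN=ON -DWITH_UBSAN=ON \
      -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/tmp/boost
make -j"$(nproc)"

# Export the suggested sanitizer options, then start the freshly built mysqld
# (the binary location depends on the build/install layout).
export ASAN_OPTIONS=quarantine_size_mb=512:atexit=1:detect_invalid_pointer_pairs=3:dump_instruction_bytes=1:abort_on_error=1
export UBSAN_OPTIONS=print_stacktrace=1
./bin/mysqld --defaults-file=/etc/my.cnf --user=mysql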
[28 Jan 2021 7:35]
MySQL Verification Team
I can get 5.7.33 mysqld to grow to 1.8G, but it flattens out there. Then, same test, but with mysqld started like this:
LD_PRELOAD=/lib64/libtcmalloc.so ./bin/mysqld --performance-schema=off
and the growth of RES memory stops at about 269M, which is expected. Can somebody who experiences the symptoms please try jemalloc or tcmalloc?
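For anyone trying the allocator swap, two common ways to load an alternative allocator (library paths vary by distribution; these are examples, not a tested recipe for this exact setup):

# Option 1: preload the allocator directly, as in the tcmalloc test above.
LD_PRELOAD=/usr/lib64/libjemalloc.so.1 ./bin/mysqld --defaults-file=/etc/my.cnf &

# Option 2: let mysqld_safe load it via --malloc-lib
# (or put malloc-lib under [mysqld_safe] in my.cnf).
mysqld_safe --malloc-lib=/usr/lib64/libjemalloc.so.1 &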
[28 Jan 2021 10:07]
MySQL Verification Team
Enabling the thread_pool plugin with thread_pool_size=8 (I have 8 cores) seemed to help keep memory lower for longer. Maybe reproducibility of this "bug" depends on hardware, i.e. the number of CPU cores and glibc's behavior with regard to arenas, heaps, etc.? When using glibc malloc, if I first ran:
MALLOC_ARENA_MAX=8 ./bin/mysqld ...
then the memory was also kept lower.
https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html
The docs say: "The glibc default number of memory pools on 64bit systems is 8 times the number of CPU cores"
Possible workarounds to try:
-----------------------------
o) lower MALLOC_ARENA_MAX
o) switch to jemalloc/tcmalloc
o) use the enterprise thread_pool plugin.
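On a systemd-managed install, the arena cap from the first workaround can be applied with a drop-in instead of setting it on the command line (the unit name and path may differ per distribution); for example:

# Cap the number of glibc malloc arenas for mysqld via a systemd drop-in, then restart.
sudo mkdir -p /etc/systemd/system/mysqld.service.d
printf '[Service]\nEnvironment="MALLOC_ARENA_MAX=8"\n' | \
    sudo tee /etc/systemd/system/mysqld.service.d/malloc.conf
sudo systemctl daemon-reload
sudo systemctl restart mysqld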
[29 Jan 2021 13:12]
MySQL Verification Team
A comment to trigger notification: could any of the commenters/reporters let us know whether the workarounds mentioned in the previous two comments helped?