Bug #94288 Memory leak Innodb Cluster
Submitted: 12 Feb 2019 9:54
Modified: 21 Feb 2019 0:36
Reporter: setra user
Status: Can't repeat
Category: MySQL Server
Severity: S2 (Serious)
Version: 8.0.13
OS: Debian (9)
Assigned to: MySQL Verification Team
CPU Architecture: x86
Tags: allocate, allocation, buffer, cluster, innodb, Leak, Memory, MySQL

[12 Feb 2019 9:54] setra user
Description:
Hello,

We have a problem with an InnoDB cluster: the primary node consumes all of the server's memory and swap, even though the global memory allocations are not that large.
The server has 12 GB of RAM, and all the buffers together do not exceed 5 GB (4 GB for the InnoDB buffer pool), but the server consumes almost 15 GB of virtual memory all the time.
There is not much activity on the server.
We checked the memory summary table and saw something very strange.
We checked the high alloc and current alloc values.
For high alloc we saw this:
| memory/temptable/physical_ram                                                  |            25 |      26214400 |      1048576.0000 |         80 |   890241024 |  11128012.8000 | 849.00 MiB                   |
| memory/sql/String::value                                                       |           754 |        575784 |          763.6393 |       3155 |   971005928 |    307767.3306 | 926.02 MiB                   |
| memory/innodb/row0sel                                                          |           456 |      10334456 |        22663.2807 |      53241 |  3464088640 |     65064.3046 | 3.23 GiB                     |
| memory/innodb/memory                                                           |         31987 |      50885344 |         1590.8133 |    1156445 |  4980149144 |      4306.4297 | 4.64 GiB                     |
| memory/innodb/buf_buf_pool                                                     |            32 |    4397727744 |    137428992.0000 |         56 |  7696023552 | 137428992.0000 | 7.17 GiB                     |
| memory/mysqld_openssl/openssl_malloc                                           |          6398 |       1197041 |          187.0961 |  173973498 | 94221025539 |       541.5826 | 87.75 GiB                    |

The openssl instrument shows 87 GB of memory used, but the server does not have that much.
For current_alloc, only about 5 GB is in use at all times.
The view does not report the correct memory usage, and after some time the server logs these errors:
2019-02-11T16:36:19.233088Z 32846065 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Cannot allocate memory to store payload of size 98380800.'
2019-02-11T16:36:19.286522Z 32846065 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error preparing the message for sending.'
2019-02-11T16:36:19.303365Z 32846065 [ERROR] [MY-011614] [Repl] Plugin group_replication reported: 'Error while broadcasting the transaction to the group on session 32846065'
2019-02-11T16:36:19.449019Z 32846065 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed
We think this is a bug, and we do not know how to resolve it because the allocation figures look strange.
We have a chart of memory usage, and it shows memory growing after a restart to almost the maximum memory of the server.
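
The comparison described above, current allocation versus high allocation per instrument, can be checked with a few lines of Python. This is only a sketch: the rows are copied from the output above, the mapping of the second and fifth numeric columns to current_alloc and high_alloc is an assumption based on the description, and the names ROWS, to_gib, total_current and largest_high are hypothetical helpers, not part of MySQL.

```python
# Rows copied from the report: (event_name, current_alloc_bytes, high_alloc_bytes).
# ASSUMPTION: the 2nd numeric column is current_alloc and the 5th is high_alloc.
ROWS = [
    ("memory/temptable/physical_ram", 26214400, 890241024),
    ("memory/sql/String::value", 575784, 971005928),
    ("memory/innodb/row0sel", 10334456, 3464088640),
    ("memory/innodb/memory", 50885344, 4980149144),
    ("memory/innodb/buf_buf_pool", 4397727744, 7696023552),
    ("memory/mysqld_openssl/openssl_malloc", 1197041, 94221025539),
]

GIB = 1024 ** 3  # bytes per GiB


def to_gib(n_bytes):
    """Convert a byte count to GiB as a float."""
    return n_bytes / GIB


def total_current(rows):
    """Sum the current allocation (bytes) over all listed instruments."""
    return sum(current for _, current, _ in rows)


def largest_high(rows):
    """Return the row with the largest high allocation."""
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    name, _, high = largest_high(ROWS)
    print("sum of current_alloc: %.2f GiB" % to_gib(total_current(ROWS)))
    print("largest high_alloc:   %s (%.2f GiB)" % (name, to_gib(high)))
```

On these six rows the current allocations sum to roughly 4.18 GiB, which is consistent with the figure of about 5 GB, while the largest high value belongs to the openssl instrument at about 87.75 GiB.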

How to repeat:
We do not know; we have only one cluster with this problem.

Suggested fix:
We do not have one.
[12 Feb 2019 21:22] MySQL Verification Team
Hi,
I do not have enough data to reproduce the problem.
When you close all connections (without restarting MySQL), does the memory get released?

thanks
Bogdan
[13 Feb 2019 9:16] setra user
Hello,

Even when no one is connected any more, RAM usage remains very high (just a few percent lower).

Now I have a new error on the master node:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
17:00:30 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=140
max_threads=151
thread_count=22
connection_count=18
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 67846 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f9778a540e0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f9784dfedc0 thread_stack 0x46000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char*, unsigned long)+0x2e) [0x5614967f1ade]
/usr/sbin/mysqld(handle_fatal_signal+0x4c1) [0x561495a63171]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x110c0) [0x7f98d6dbc0c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcf) [0x7f98d5051fff]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7f98d505342a]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7f98d596a0ad]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8f066) [0x7f98d5968066]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8f0b1) [0x7f98d59680b1]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8f2c9) [0x7f98d59682c9]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8f7ec) [0x7f98d59687ec]
/usr/lib/mysql/plugin/group_replication.so(void std::vector<unsigned char, std::allocator<unsigned char> >::_M_range_insert<unsigned char const*>(__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, unsigned char const*, unsigned char const*, std::forward_iterator_tag)+0x216) [0x7f96a99c5006]
/usr/lib/mysql/plugin/group_replication.so(Transaction_message::write(unsigned char const*, unsigned long long)+0x21) [0x7f96a99e8e31]
/usr/lib/mysql/plugin/group_replication.so(group_replication_trans_before_commit(Trans_param*)+0xb91) [0x7f96a99e99e1]
/usr/sbin/mysqld(Trans_delegate::before_commit(THD*, bool, Binlog_cache_storage*, Binlog_cache_storage*, unsigned long long, bool)+0x169) [0x5614958c4059]
/usr/sbin/mysqld(MYSQL_BIN_LOG::commit(THD*, bool)+0x69c) [0x561496529b0c]
/usr/sbin/mysqld(ha_commit_trans(THD*, bool, bool)+0x5ed) [0x561495b56f5d]
/usr/sbin/mysqld(trans_commit_stmt(THD*, bool)+0x2d) [0x561495a32c2d]
/usr/sbin/mysqld(mysql_execute_command(THD*, bool)+0x2893) [0x5614959522a3]
/usr/sbin/mysqld(mysql_parse(THD*, Parser_state*, bool)+0x3e0) [0x561495955b80]
/usr/sbin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x2e08) [0x561495958db8]
/usr/sbin/mysqld(do_command(THD*)+0x180) [0x5614959597d0]
/usr/sbin/mysqld(+0xe113e8) [0x561495a563e8]
/usr/sbin/mysqld(+0x204da2f) [0x561496c92a2f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f98d6db2494]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f98d5107acf]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f97780aeeb0): UPDATE user SET updateCapgieDone = 0
Connection ID (thread ID): 33976292
Status: NOT_KILLED
[21 Feb 2019 0:36] MySQL Verification Team
Hi,
With regard to the memory leak, I cannot reproduce it.

As for the crash you mention, please open another bug with details on how you got there.

kind regards
Bogdan
[3 Mar 2020 7:26] Joshua Thompson
I am getting something very similar:

2020-03-03T06:47:29.560872Z 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address QYVI16PRD04:3306 has become unreachable.'
2020-03-03T06:47:35.152789Z 0 [ERROR] [MY-011505] [Repl] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-03-03T06:47:35.153951Z 0 [Warning] [MY-011630] [Repl] Plugin group_replication reported: 'Due to a plugin error, some transactions were unable to be certified and will now rollback.'
2020-03-03T06:47:35.154825Z 0 [ERROR] [MY-011712] [Repl] Plugin group_replication reported: 'The server was automatically set into read only mode after an error was detected.'
2020-03-03T06:47:35.154846Z 1258 [ERROR] [MY-011615] [Repl] Plugin group_replication reported: 'Error while waiting for conflict detection procedure to finish on session 1258'
2020-03-03T06:47:35.156054Z 1257 [ERROR] [MY-011615] [Repl] Plugin group_replication reported: 'Error while waiting for conflict detection procedure to finish on session 1257'
2020-03-03T06:47:35.156972Z 1258 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed
2020-03-03T06:47:35.168340Z 1257 [ERROR] [MY-010207] [Repl] Run function 'before_commit' in plugin 'group_replication' failed
[3 Mar 2020 19:10] MySQL Verification Team
Hi Joshua, 

I don't think this is related. 

all best
Bogdan