Bug #87748 terminate called after throwing an instance of 'std::bad_alloc'
Submitted: 13 Sep 2017 9:49 Modified: 21 Feb 2:47
Reporter: Simon Mudd (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S1 (Critical)
Version:8.0.1-dmr OS:CentOS (CentOS 7)
Assigned to: CPU Architecture:Any
Tags: 8.0.1, crash, dmr

[13 Sep 2017 9:49] Simon Mudd
Description:
I found a server running 8.0.1-dmr with this errors in the which broke.

The server was running:

[myuser@myhost ~]$ rpm -q mysql-community-server
mysql-community-server-8.0.1-0.1.dmr.el7.x86_64

This may be something which has been seen as this server has been running I think for some time. The rpm was installed on this server since May.

I'm uploading the log file in case it's of any use. One thing I do note is the server ran out of disk space. This is a slave with no binlogs. Maybe that's an important bit of information?

How to repeat:
not sure I can. this box had been replicating for some time.

Suggested fix:
Don't crash. (I understand this is a DMR version so crashes are more likely)
[13 Sep 2017 9:51] Simon Mudd
error-log showing the crashed recovery attempt which leads to another crash

Attachment: mysqld.log-20170912-sanitised (application/octet-stream, text), 85.97 KiB.

[13 Sep 2017 15:05] Sinisa Milivojevic
Hi Simon!

It is quite clear to me that the query that produces the exception is:

 SELECT * FROM performance_schema.data_locks;

First of all, let me tell you that I am quite eager to verify this bug as it looks that exception does not have to arise only from lack of memory, but also of the memory fragmentation. At least, we can catch() that exception and quit gracefully .....

So, please let me know how much RAM do you have allocated for MySQL server on that machine. Next, can you periodically run:

 SELECT count(*) FROM performance_schema.data_locks;

so that we get the idea of the magnitude of of the vector's size .....

Thank you very much in advance ......
[18 Sep 2017 7:53] Simon Mudd
Hi.

The included log file shows the size of the buffer pool:

2017-09-11T16:06:54.027706Z 1 [Note] InnoDB: Initializing buffer pool, total size = 99.875G, instances = 47, chunk size = 128M

The server has 128 GB of RAM.

You also asked for this:

== snip ===
root@myserver [(none)]> SELECT count(*) FROM performance_schema.data_locks;
ERROR 1146 (42S02): Table 'performance_schema.data_locks' doesn't exist
root@myserver [(none)]> select @@version;
+--------------+
| @@version    |
+--------------+
| 8.0.3-rc-log |
+--------------+
1 row in set (0.00 sec)
== snip ===

So I'm missing something or 8.0.3 is not running properly.
[18 Sep 2017 7:55] Simon Mudd
Hmm. Sorry for pasting information from the wrong server.

The 8.0.1 box I had has been thrown away as I'm now testing 8.0.3 so I can't provide you with the output from that server any more. I was simply reporting the issue I saw in case it was of use.

That said the server this bug was reported for 8.0.1 also had 128GB of RAM.
[18 Sep 2017 8:30] Simon Mudd
If relevant I do have the server's configuration recorded if that's of interest.
[2 Oct 2017 11:23] Sinisa Milivojevic
Hi,

First of all, sorry for the late reply as I was away on vacation ....

It would be nice if you could repeat the bug on 8.0.3, but 8.0.1 is fine too.

We need the approximate amount of RAM available prior to the exception being thrown and the approximate number of rows in that P_S table. Then, we shall be happy to verify it.

Thank you very much in advance ....
[3 Nov 2017 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[27 Nov 2018 8:42] Daniël van Eeden
On 8.0.13:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
08:38:48 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=268435456
read_buffer_size=131072
max_used_connections=27
max_threads=10000
thread_count=26
connection_count=24
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 11892769 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f9bfc0130c0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fb86c0e6c70 thread_stack 0x46000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char*, unsigned long)+0x3d) [0x1aa273d]
/usr/sbin/mysqld(handle_fatal_signal+0x423) [0xcd7b83]
/lib64/libpthread.so.0(+0xf6d0) [0x7fb8d2a4d6d0]
/lib64/libc.so.6(gsignal+0x37) [0x7fb8d0d64277]
/lib64/libc.so.6(abort+0x148) [0x7fb8d0d65968]
/lib64/libstdc++.so.6(__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fb8d16737d5]
/lib64/libstdc++.so.6(+0x5e746) [0x7fb8d1671746]
/lib64/libstdc++.so.6(+0x5e773) [0x7fb8d1671773]
/lib64/libstdc++.so.6(+0x5e993) [0x7fb8d1671993]
/lib64/libstdc++.so.6(operator new(unsigned long)+0x7d) [0x7fb8d1671f2d]
/usr/sbin/mysqld(void std::vector<row_data_lock, std::allocator<row_data_lock> >::_M_realloc_insert<row_data_lock const&>(__gnu_cxx::__normal_iterator<row_data_lock*, std::vector<row_data_lock, std::allocator<row_data_lock> > >, row_data_lock const&)+0x64) [0x1fbffb4]
/usr/sbin/mysqld(PFS_data_lock_container::add_lock_row(char const*, unsigned long, char const*, unsigned long, unsigned long long, unsigned long long, unsigned long long, char const*, unsigned long, char const*, unsigned long, char const*, unsigned long, char const*, unsigned long, char const*, unsigned long, void const*, char const*, char const*, char const*, char const*)+0x298) [0x1fc03f8]
/usr/sbin/mysqld(Innodb_data_lock_iterator::scan_trx(PSI_server_data_lock_container*, bool, trx_t const*, bool, int, unsigned long, unsigned int, unsigned int, unsigned long)+0x422) [0x1bf6fd2]
/usr/sbin/mysqld(Innodb_data_lock_iterator::scan_trx_list(PSI_server_data_lock_container*, bool, bool, ut_list_base<trx_t, ut_list_node<trx_t> trx_t::*>*)+0xa1) [0x1bf7261]
/usr/sbin/mysqld(Innodb_data_lock_iterator::scan(PSI_server_data_lock_container*, bool)+0x17f) [0x1bf849f]
/usr/sbin/mysqld(table_data_locks::rnd_next()+0x63) [0x1f93bc3]
/usr/sbin/mysqld(ha_perfschema::rnd_next(unsigned char*)+0x4e) [0x1f3e24e]
/usr/sbin/mysqld(handler::ha_rnd_next(unsigned char*)+0x1ac) [0xdccfbc]
/usr/sbin/mysqld(rr_sequential(READ_RECORD*)+0x35) [0xb44715]
/usr/sbin/mysqld(sub_select(JOIN*, QEP_TAB*, bool)+0x276) [0xf75e06]
/usr/sbin/mysqld(JOIN::exec()+0x410) [0xf74c00]
/usr/sbin/mysqld(Sql_cmd_dml::execute_inner(THD*)+0x104) [0xc117b4]
/usr/sbin/mysqld(Sql_cmd_dml::execute(THD*)+0x147) [0xc19257]
/usr/sbin/mysqld(mysql_execute_command(THD*, bool)+0x3c68) [0xbce248]
/usr/sbin/mysqld(mysql_parse(THD*, Parser_state*)+0x33b) [0xbd047b]
/usr/sbin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x2cf7) [0xbd33d7]
/usr/sbin/mysqld(do_command(THD*)+0x1a1) [0xbd3ed1]
/usr/sbin/mysqld() [0xcca8b0]
/usr/sbin/mysqld() [0x1f42a3f]
/lib64/libpthread.so.0(+0x7e25) [0x7fb8d2a45e25]
/lib64/libc.so.6(clone+0x6d) [0x7fb8d0e2cbad]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f9bfc019988): is an invalid pointer
Connection ID (thread ID): 16952
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[27 Nov 2018 9:28] Daniël van Eeden
My previous comment had the wrong version nr.
This was on 8.0.12. I haven't seen it on 8.0.13 yet.
[27 Nov 2018 13:37] Sinisa Milivojevic
Hi Daniel,

Getting this error on 8.0.12 is quite OK. There is no need to repeat the crash on 8.0.13.

Simply, you are running out of the available memory, not because there is no sufficient virtual memory available. It is much more likely that the memory on the OS has become very fragmented. That is quite possible occurrence with C, but with C++ and STL that is much more probable outcome.

I have examined and analysed entire code that is related to the method that is using in collecting InnoDB row locks for the performance_schema. I also examined the method itself. The conclusion is that the STL exception std::bad_alloc is not checked for anywhere in the relevant code. I did discover that the exception is handled very well in many other modules of the 8.0 server code, like in some plugins, Temporary Table storage, in some places used by optimiser and executioner, but not in this particular part of code.

To make myself clear, there is not much that we can do when this exception occurs. Server simply has to be stopped. However, since InnoDB is a default SE in 8.0, we should be consistent and exit the program in the same manner in which InnoDB SE does, so that user / customer knows what happened and what to do.

That is, at least, my personal opinion and it has yet to be discussed upon. Once again, this problem can not be prevented, but the handling of out-of-the-memory error has to be consistent throughout our code.

So far, this is verified as reported.

.
[28 Nov 2018 12:14] Simon Mudd
Note: approximate amount of RAM on the server was probably 128 GB. The "free RAM", not used by mysqld and other system processes varies but we try to keep around 10-20 GB of free memory thought that often depends on exact workloads.

I don't have the specifics of this particular server now.
[29 Nov 2018 8:36] Daniël van Eeden
Handling the exception and printing out a clear error message would be a good fix this bug
[3 Dec 2018 16:30] Sinisa Milivojevic
Simon, Daniel,

Thank you very much for the additional info.
[12 Dec 2018 12:11] Shane Bester
On a generic DML testcase,  this does indeed crash.  I do not have a reliable testcase but I've seen it happen a couple of times...

mysqld.exe: Version: '8.0.15-tr'  Built on 2018/12/12.

11:51:48 UTC - mysqld got exception 0xc0000005 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x3706f3c0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
140afc5b5    mysqld.exe!my_convert()[ctype.cc:939]
1406d889c    mysqld.exe!innobase_raw_format()[ha_innodb.cc:2334]
1408f6825    mysqld.exe!row_raw_format()[row0row.cc:1185]
1408d65e5    mysqld.exe!put_nth_field()[trx0i_s.cc:609]
1408d6378    mysqld.exe!p_s_fill_lock_data()[trx0i_s.cc:684]
140898e89    mysqld.exe!Innodb_data_lock_iterator::scan_trx()[p_s.cc:826]
1408994ee    mysqld.exe!Innodb_data_lock_iterator::scan_trx_list()[p_s.cc:698]
140898754    mysqld.exe!Innodb_data_lock_iterator::scan()[p_s.cc:609]
1409caa99    mysqld.exe!table_data_locks::rnd_next()[table_data_locks.cc:172]
14099fee6    mysqld.exe!ha_perfschema::rnd_next()[ha_perfschema.cc:1614]
13f8c164a    mysqld.exe!handler::ha_rnd_next()[handler.cc:2826]
13fb84481    mysqld.exe!TableScanIterator::Read()[records.cc:362]
13fc7b386    mysqld.exe!sub_select()[sql_executor.cc:1531]
13fc74737    mysqld.exe!do_select()[sql_executor.cc:1162]
13fc76a5d    mysqld.exe!JOIN::exec()[sql_executor.cc:282]
13fb707e0    mysqld.exe!Sql_cmd_dml::execute_inner()[sql_select.cc:784]
13fb70510    mysqld.exe!Sql_cmd_dml::execute()[sql_select.cc:628]
13fa43219    mysqld.exe!mysql_execute_command()[sql_parse.cc:4468]
13fa43cd1    mysqld.exe!mysql_parse()[sql_parse.cc:5153]
13fa3d799    mysqld.exe!dispatch_command()[sql_parse.cc:1720]
13fa3e747    mysqld.exe!do_command()[sql_parse.cc:1264]
13f8b4068    mysqld.exe!handle_connection()[connection_handler_per_thread.cc:302]
1409966c7    mysqld.exe!pfs_spawn_thread()[pfs.cc:2839]
1405fff3c    mysqld.exe!win_thread_start()[my_thread.cc:52]
7fef21fcd70    ucrtbase.DLL!_o__realloc_base()
774759cd    kernel32.dll!BaseThreadInitThunk()
775d385d    ntdll.dll!RtlUserThreadStart()

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (375c2158): select * from performance_schema.data_locks
Connection ID (thread ID): 41
Status: NOT_KILLED
[12 Dec 2018 14:22] Sinisa Milivojevic
Hi,

Thank you Shane !!!!

Due to your analysis I am setting severity to the highest level available.
[12 Dec 2018 16:49] Shane Bester
Now I saw this on debug build...

mysqld-debug.exe: Version: '8.0.15-tr-debug'  Built on 2018/12/12.

[ERROR] [MY-013183] [InnoDB] Assertion failure: p_s.cc:721 thread 6072
141d4a5d5    mysqld-debug.exe!my_sigabrt_handler()[my_thr_init.cc:372]
7feee2ba381    ucrtbased.dll!raise()
7feee2bbf59    ucrtbased.dll!abort()
1422bbaba    mysqld-debug.exe!ut_dbg_assertion_failed()[ut0dbg.cc:92]
1424a929a    mysqld-debug.exe!Innodb_data_lock_iterator::scan_trx()[p_s.cc:721]
1424a8fba    mysqld-debug.exe!Innodb_data_lock_iterator::scan_trx_list()[p_s.cc:698]
1424a8bf8    mysqld-debug.exe!Innodb_data_lock_iterator::scan()[p_s.cc:613]
14282aded    mysqld-debug.exe!table_data_locks::rnd_next()[table_data_locks.cc:172]
1427b3552    mysqld-debug.exe!ha_perfschema::rnd_next()[ha_perfschema.cc:1613]
13f906a4a    mysqld-debug.exe!handler::ha_rnd_next()[handler.cc:2826]
13ff1d005    mysqld-debug.exe!TableScanIterator::Read()[records.cc:362]
14023ce88    mysqld-debug.exe!sub_select()[sql_executor.cc:1527]
1402475b5    mysqld-debug.exe!do_select()[sql_executor.cc:1162]
1402451cf    mysqld-debug.exe!JOIN::exec()[sql_executor.cc:282]
13ff95c6e    mysqld-debug.exe!Sql_cmd_dml::execute_inner()[sql_select.cc:784]
13ff9535f    mysqld-debug.exe!Sql_cmd_dml::execute()[sql_select.cc:625]
13fc5e59b    mysqld-debug.exe!mysql_execute_command()[sql_parse.cc:4352]
13fc57955    mysqld-debug.exe!mysql_parse()[sql_parse.cc:5151]
13fc621b6    mysqld-debug.exe!dispatch_command()[sql_parse.cc:1720]
13fc60086    mysqld-debug.exe!do_command()[sql_parse.cc:1263]
13f7fc023    mysqld-debug.exe!handle_connection()[connection_handler_per_thread.cc:302]
14279e625    mysqld-debug.exe!pfs_spawn_thread()[pfs.cc:2838]
141d492e7    mysqld-debug.exe!win_thread_start()[my_thread.cc:51]
7feee2c3428    ucrtbased.dll!_register_onexit_function()
7feee2c3071    ucrtbased.dll!_register_onexit_function()
774759cd    kernel32.dll!BaseThreadInitThunk()
775d385d    ntdll.dll!RtlUserThreadStart()

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (241f4bc8): select thread_id,sum(length(engine_lock_id)) from performance_schema.data_locks group by thread_id
Connection ID (thread ID): 9
[21 Feb 2:47] Paul Dubois
Posted by developer:
 
Fixed in 8.0.16.

Reading rows from the Performance Schema data_locks or
data_lock_waits table could return an unbounded number of rows during
a scan, resulting in a server exit due to excessive memory
allocation. This situation now produces an error.