Bug #86595 MySQL Instance sudden restart
Submitted: 6 Jun 2017 3:37 Modified: 12 Oct 2017 15:09
Reporter: Sik Yin Wong Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:5.7.16 OS:Red Hat (RHEL6.8)
Assigned to: CPU Architecture:Any
Tags: failure in thread, innodb

[6 Jun 2017 3:37] Sik Yin Wong
Description:
MySQL Instance sudden restart with below error logging

2017-05-24 00:07:15 0x7f6bbac17700  InnoDB: Assertion failure in thread 140100671469312 in file btr0btr.cc line 2157
InnoDB: Failing assertion: err == DB_SUCCESS
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
/u01/app/mnotintp1/usr/sbin/mysqld(my_print_stacktrace+0x35)[0xf40b15]
/u01/app/mnotintp1/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x7cdf24]
/lib64/libpthread.so.0[0x3b0b00f7e0]
/lib64/libc.so.6(gsignal+0x35)[0x3b0a8325e5]
/lib64/libc.so.6(abort+0x175)[0x3b0a833dc5]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z18ut_print_timestampP8_IO_FILE+0x0)[0x7bccf4]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z33btr_insert_on_non_leaf_level_funcmP12dict_index_tmP8dtuple_tPKcmP5mtr_t+0x34b)[0x113560b]
/u01/app/mnotintp1/usr/sbin/mysqld[0x113b5f3]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z25btr_page_split_and_insertmP9btr_cur_tPPmPP16mem_block_info_tPK8dtuple_tmP5mtr_t+0x335)[0x113c185]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z26btr_cur_pessimistic_insertmP9btr_cur_tPPmPP16mem_block_info_tP8dtuple_tPPhPP9big_rec_tmP9que_thr_tP5mtr_t+0x298)[0x1146948]
/u01/app/mnotintp1/usr/sbin/mysqld[0x107cd2a]
/u01/app/mnotintp1/usr/sbin/mysqld[0x109327b]
/u01/app/mnotintp1/usr/sbin/mysqld(_ZN11ha_innobase25intrinsic_table_write_rowEPh+0x3d)[0xfbdcad]
/u01/app/mnotintp1/usr/sbin/mysqld(_ZN11ha_innobase9write_rowEPh+0x1d5)[0xfcda75]
/u01/app/mnotintp1/usr/sbin/mysqld(_ZN7handler12ha_write_rowEPh+0x1a7)[0x81bde7]
/u01/app/mnotintp1/usr/sbin/mysqld[0xcd9c4e]
/u01/app/mnotintp1/usr/sbin/mysqld[0xcddc2e]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x323)[0xcdf2c3]
/u01/app/mnotintp1/usr/sbin/mysqld[0xcddc2e]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x323)[0xcdf2c3]
/u01/app/mnotintp1/usr/sbin/mysqld(_ZN10JOIN_CACHE24generate_full_extensionsEPh+0x68)[0xe96698]
/u01/app/mnotintp1/usr/sbin/mysqld(_ZN14JOIN_CACHE_BNL21join_matching_recordsEb+0xfb)[0xe96ceb]
/u01/app/mnotintp1/usr/sbin/mysqld(_ZN10JOIN_CACHE12join_recordsEb+0x236)[0xe97036]
/u01/app/mnotintp1/usr/sbin/mysqld[0xcddc2e]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x323)[0xcdf2c3]
/u01/app/mnotintp1/usr/sbin/mysqld[0xcddc2e]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x323)[0xcdf2c3]
/u01/app/mnotintp1/usr/sbin/mysqld(_ZN4JOIN4execEv+0x27a)[0xcde1ea]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z12handle_queryP3THDP3LEXP12Query_resultyy+0x250)[0xd48630]
/u01/app/mnotintp1/usr/sbin/mysqld[0xd09553]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x36f8)[0xd0cf18]
/u01/app/mnotintp1/usr/lib64/mysql/plugin/libaudit_plugin.so(+0xfa51)[0x7f6bc1b8ba51]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3ed)[0xd0eccd]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x111e)[0xd0fe4e]
/u01/app/mnotintp1/usr/sbin/mysqld(_Z10do_commandP3THD+0x194)[0xd10ab4]
/u01/app/mnotintp1/usr/sbin/mysqld(handle_connection+0x29c)[0xde2e0c]
/u01/app/mnotintp1/usr/sbin/mysqld(pfs_spawn_thread+0x174)[0x1252bb4]
/lib64/libpthread.so.0[0x3b0b007aa1]
/lib64/libc.so.6(clone+0x6d)[0x3b0a8e8aad]

How to repeat:
<none>
[20 Jun 2017 16:58] MySQL Verification Team
Hi!

In order to verify the bug, we need the entire test case. We need the test case that always lead to the assert that InnoDB executed.

Test cases are required, because, on most commodity hardware, high load leads to the temporary glitches in RAM, cache, controller or disk or their caches. More rarely, those asserts occur due to some inherent hardware failure. So, in order to be able to verify that this is a genuine bug, please, send us a set of commands that always lead to this assert. We can not proceed further without it.

Thank you very much in advance.
[21 Jul 2017 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[6 Sep 2017 8:51] Sik Yin Wong
Due to company policy, it is infeasible to provide a test plan to repeat the bug. However, this case happens frequently in the previous month. Is there any information I can provide instead of the test plan, such as some kind of log after the database restart or enable some audit function etc. Thanks.
[6 Sep 2017 9:19] MySQL Verification Team
Here is the stack trace,  pretty printed.

0x0000000000f40b15: my_print_stacktrace at stacktrace.c:225
0x00000000007cdf24: handle_fatal_signal at signal_handler.cc:150
0x00000000007bccf4: ut_print_timestamp(_IO_FILE*) at ut0ut.cc:215
0x000000000113560b: btr_insert_on_non_leaf_level_func(unsigned long, dict_index_t*, unsigned long, dtuple_t*, char const*, unsigned long, mtr_t*) at btr0btr.cc:2167
0x000000000113b5f3: mem_heap_free at mem0mem.ic:528
 (inlined by) btr_attach_half_pages at btr0btr.cc:2280
0x000000000113c185: btr_page_split_and_insert(unsigned long, btr_cur_t*, unsigned long**, mem_block_info_t**, dtuple_t const*, unsigned long, mtr_t*) at btr0btr.cc:2684
0x0000000001146948: btr_cur_pessimistic_insert(unsigned long, btr_cur_t*, unsigned long**, mem_block_info_t**, dtuple_t*, unsigned char**, big_rec_t**, unsigned long, que_thr_t*, mtr_t*) at btr0cur.cc:3428
0x000000000107cd2a: row_ins_sorted_clust_index_entry at row0ins.cc:2744
0x000000000109327b: row_insert_for_mysql_using_cursor at row0mysql.cc:1597
0x0000000000fbdcad: ha_innobase::intrinsic_table_write_row(unsigned char*) at ha_innodb.cc:7319
0x0000000000fcda75: ha_innobase::write_row(unsigned char*) at ha_innodb.cc:7340
0x000000000081bde7: handler::ha_write_row(unsigned char*) at handler.cc:7890
0x0000000000cd9c4e: end_write at sql_executor.cc:3365
0x0000000000cddc2e: THD::get_stmt_da() at sql_class.h:3381
 (inlined by) evaluate_join_record at sql_executor.cc:1640
0x0000000000cdf2c3: sub_select(JOIN*, QEP_TAB*, bool) at sql_executor.cc:1291
0x0000000000cddc2e: THD::get_stmt_da() at sql_class.h:3381
 (inlined by) evaluate_join_record at sql_executor.cc:1640
0x0000000000cdf2c3: sub_select(JOIN*, QEP_TAB*, bool) at sql_executor.cc:1291
0x0000000000e96698: JOIN_CACHE::generate_full_extensions(unsigned char*) at sql_join_buffer.cc:2218
0x0000000000e96ceb: JOIN_CACHE_BNL::join_matching_records(bool) at sql_join_buffer.cc:2101
0x0000000000e97036: JOIN_CACHE::join_records(bool) at sql_join_buffer.cc:1905
0x0000000000cddc2e: THD::get_stmt_da() at sql_class.h:3381
 (inlined by) evaluate_join_record at sql_executor.cc:1640
0x0000000000cdf2c3: sub_select(JOIN*, QEP_TAB*, bool) at sql_executor.cc:1291
0x0000000000cddc2e: THD::get_stmt_da() at sql_class.h:3381
 (inlined by) evaluate_join_record at sql_executor.cc:1640
0x0000000000cdf2c3: sub_select(JOIN*, QEP_TAB*, bool) at sql_executor.cc:1291
0x0000000000cde1ea: do_select at sql_executor.cc:945
 (inlined by) JOIN::exec() at sql_executor.cc:199
0x0000000000d48630: st_select_lex_unit::set_executed() at sql_lex.h:662
 (inlined by) handle_query(THD*, LEX*, Query_result*, unsigned long long, unsigned long long) at sql_select.cc:185
0x0000000000d09553: execute_sqlcom_select at sql_parse.cc:5143
0x0000000000d0cf18: mysql_execute_command(THD*, bool) at sql_parse.cc:2756
0x0000000000d0eccd: mysql_parse(THD*, Parser_state*) at sql_parse.cc:5559
0x0000000000d0fe4e: Parser_state::reset(char const*, unsigned long) at sql_lex.h:3645
 (inlined by) dispatch_command(THD*, COM_DATA const*, enum_server_command) at sql_parse.cc:1506
0x0000000000d10ab4: do_command(THD*) at sql_parse.cc:997
0x0000000000de2e0c: handle_connection at connection_handler_per_thread.cc:300
0x0000000001252bb4: pfs_spawn_thread at pfs.cc:2191
[6 Sep 2017 9:34] MySQL Verification Team
This seems to happen when a SELECT uses innodb intrinsic table.

It seems the function btr_cur_pessimistic_insert returns an error of some kind, that is not expected.

btr_cur_pessimistic_insert might return errors:
  o) error from btr_cur_ins_lock_and_undo (DB_WAIT_LOCK for example).
  o) DB_OUT_OF_FILE_SPACE
  o) DB_TOO_BIG_RECORD

Anyway,  please try a workaround to prevent crash ::

[mysqld]
internal-tmp-disk-storage-engine=MyISAM
[12 Sep 2017 15:09] MySQL Verification Team
Hi!

First of all, we need a fully repeatable test case.

If the workaround works, it will help you, but it would not help us much.
[13 Oct 2017 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".