Bug #95993 after kill -9 of mysqld, than cannot to start, crash
Submitted: 26 Jun 2019 7:43 Modified: 30 Aug 2019 10:05
Reporter: wang huai Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:8.0.16 OS:Any
Assigned to: CPU Architecture:Any
Tags: crash

[26 Jun 2019 7:43] wang huai
Description:
after kill -9 of mysqld, than cannot to start, crash, the stack as follows:

Program received signal SIGSEGV, Segmentation fault.
0x0000000001a61671 in dd::get_dd_client (thd=thd@entry=0x0) at /data/mysql/sql/dd/impl/dictionary_impl.cc:536
536 cache::Dictionary_client *get_dd_client(THD *thd) { return thd->dd_client(); }
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.3.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_6.x86_64 libaio-0.3.109-13.el7.x86_64 libblkid-2.23.2-59.el7_6.1.x86_64 libcom_err-1.42.9-13.el7.x86_64 libgcc-4.8.5-36.el7_6.2.x86_64 libibverbs-17.2-3.el7.x86_64 libnl3-3.2.28-4.el7.x86_64 librados2-13.2.4-0.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libuuid-2.23.2-59.el7_6.1.x86_64 lttng-ust-2.4.1-4.el7.x86_64 nspr-4.19.0-1.el7_5.x86_64 nss-3.36.0-7.1.el7_6.x86_64 nss-softokn-3.36.0-5.el7_5.x86_64 nss-softokn-freebl-3.36.0-5.el7_5.x86_64 nss-util-3.36.0-1.1.el7_6.x86_64 openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64 sqlite-3.7.17-8.el7.x86_64 userspace-rcu-0.7.16-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) p thd
$1 = (THD *) 0x0
(gdb) bt
#0  0x0000000001a61671 in dd::get_dd_client (thd=thd@entry=0x0)
    at /data/mysql/sql/dd/impl/dictionary_impl.cc:536
#1  0x0000000001e5e878 in dd_table_open_on_id_low (thd=thd@entry=0x0, mdl=mdl@entry=0x0,
    table_id=table_id@entry=1107, this=<optimized out>, this=<optimized out>, this=<optimized out>)
    at /data/mysql/storage/innobase/dict/dict0dd.cc:416
#2  0x0000000001e5f727 in dd_table_open_on_id (table_id=1107, thd=<optimized out>, mdl=mdl@entry=0x0,
    dict_locked=dict_locked@entry=false, check_corruption=check_corruption@entry=true)
    at /data/mysql/storage/innobase/dict/dict0dd.cc:642
#3  0x00000000021e85c5 in row_undo_mod_parse_undo_rec (mdl=0x0, node=0xbd11c28)
    at /data/mysql/storage/innobase/row/row0umod.cc:1043
#4  row_undo_mod (node=node@entry=0xbd11c28, thr=thr@entry=0xbd0e978)
    at /data/mysql/storage/innobase/row/row0umod.cc:1105
#5  0x0000000002087d30 in row_undo (thr=0xbd0e978, node=0xbd11c28)
    at /data/mysql/storage/innobase/row/row0undo.cc:300
#6  row_undo_step (thr=thr@entry=0xbd0e978) at /data/mysql/storage/innobase/row/row0undo.cc:361
#7  0x0000000002013eb2 in que_thr_step (thr=0xbd0e978) at /data/mysql/storage/innobase/que/que0que.cc:925
#8  que_run_threads_low (thr=0xbd0e978) at /data/mysql/storage/innobase/que/que0que.cc:978
#9  que_run_threads (thr=<optimized out>) at /data/mysql/storage/innobase/que/que0que.cc:1013
#10 0x00000000020e9e3c in trx_rollback_to_savepoint_low (trx=trx@entry=0x7fffd416fe50, savept=savept@entry=0x0)
    at /data/mysql/storage/innobase/trx/trx0roll.cc:115
#11 0x00000000020ea501 in trx_rollback_for_mysql_low (trx=0x7fffd416fe50)
    at /data/mysql/storage/innobase/trx/trx0roll.cc:169
#12 0x00000000020ea85c in trx_rollback_low (trx=trx@entry=0x7fffd416fe50)
    at /data/mysql/storage/innobase/trx/trx0roll.cc:244
#13 0x00000000020eb0ee in trx_rollback_for_mysql (trx=trx@entry=0x7fffd416fe50)
    at /data/mysql/storage/innobase/trx/trx0roll.cc:266
#14 0x0000000001eec4e1 in innobase_rollback_trx (trx=trx@entry=0x7fffd416fe50)
    at /data/mysql/storage/innobase/handler/ha_innodb.cc:5607
#15 0x0000000001f0034f in innobase_rollback_by_xid (hton=<optimized out>, xid=<optimized out>)
    at /data/mysql/storage/innobase/handler/ha_innodb.cc:19106
#16 0x0000000000e9dbcd in xarecover_handlerton (plugin=<optimized out>, arg=0x7fffffffcda0)
    at /data/mysql/sql/xa.cc:358
#17 0x0000000000dc4b86 in plugin_foreach_with_mask (thd=thd@entry=0x0, funcs=funcs@entry=0x7fffffffcd60,
    type=type@entry=1, state_mask=4294967287, state_mask@entry=8, arg=arg@entry=0x7fffffffcda0)
    at /data/mysql/sql/sql_plugin.cc:2544
#18 0x0000000000dc4d0d in plugin_foreach_with_mask (thd=thd@entry=0x0,
    func=func@entry=0xe9d900 <xarecover_handlerton(THD*, plugin_ref, void*)>, type=type@entry=1,
    state_mask=state_mask@entry=8, arg=arg@entry=0x7fffffffcda0) at /data/mysql/sql/sql_plugin.cc:2557
#19 0x0000000000e9dfc5 in ha_recover (commit_list=commit_list@entry=0x7fffffffcf70)
    at /data/mysql/sql/xa.cc:411
#20 0x00000000019b1607 in binlog_recover (valid_pos=<synthetic pointer>, binlog_file_reader=0x7fffffffd000,
    this=<optimized out>, this=<optimized out>) at /data/mysql/sql/binlog.cc:9068
#21 MYSQL_BIN_LOG::open_binlog (this=<optimized out>, opt_name=<optimized out>)
    at /data/mysql/sql/binlog.cc:7683
#22 0x0000000000cb85b0 in init_server_components () at /data/mysql/sql/mysqld.cc:5579
#23 0x0000000000cc0af2 in mysqld_main (argc=76, argv=0x39abcf0) at /data/mysql/sql/mysqld.cc:6305
#24 0x00007ffff5d0a3d5 in __libc_start_main () from /lib64/libc.so.6
#25 0x0000000000ca1a25 in _start ()

the reason is current_thd is null when recover from binlog.

How to repeat:
kill -9 mysqld when update.

Suggested fix:
add  "Auto_THD thd;" before line :

"if (tc_log->open(opt_bin_log ? opt_bin_logname : opt_tc_log_file))"

then current_thd is not null, running ok
[26 Jun 2019 12:46] MySQL Verification Team
Hi Mr. Huai,

Thank you for your bug report.

This bug turned out to be reported internally already and was already fixed. It is yet unknown whether it is fixed in 8.0.17 or 8.0.18. We shall let you know when we find out.

Also, we do not know when any of those two releases will be published.

Thank you for your contribution.
[27 Jun 2019 12:48] MySQL Verification Team
Hi Mr. Huai,

We have made enquiries and discovered that a bug is fixed in 8.0.17.

We do not know when will 8.0.17 be fixed, but when that release is published, all our pages on dev.mysql.com will be updated.
[27 Aug 2019 14:13] Bernardo Perez
Hello,

Could it be possible that you confirm if this bug is related to this one https://bugs.mysql.com/bug.php?id=93768 ?

Best regards, 

We have a signature now that seems to be a bit of a mix of Bug #95993 and Bug #93768

stack:

2019-08-27T05:32:24.378216Z 26139 [ERROR] [MY-013183] [InnoDB] Assertion failure: row0umod.cc:149:!dummy_big_rec thread 47668352853760
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
05:32:24 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=24
max_threads=312
thread_count=8
connection_count=7
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 179579 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2b5b195c3000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 2b5aa73d1d00 thread_stack 0x40000
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace(unsigned char*, unsigned long)+0x2e) [0x1c891ee]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0x413) [0xecd013]
/lib64/libpthread.so.0(+0xf100) [0x2b5a78826100]
/lib64/libc.so.6(gsignal+0x37) [0x2b5a795a25f7]
/lib64/libc.so.6(abort+0x148) [0x2b5a795a3ce8]
/rdsdbbin/mysql/bin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x2c1) [0x1fa5f61]
/rdsdbbin/mysql/bin/mysqld() [0x2134f44]
/rdsdbbin/mysql/bin/mysqld() [0x2135526]
/rdsdbbin/mysql/bin/mysqld(row_undo_mod(undo_node_t*, que_thr_t*)+0xbdf) [0x2138aaf]
/rdsdbbin/mysql/bin/mysqld(row_undo_step(que_thr_t*)+0x4f) [0x1f26f3f]
/rdsdbbin/mysql/bin/mysqld(que_run_threads(que_thr_t*)+0x94f) [0x1ec6b1f]
/rdsdbbin/mysql/bin/mysqld() [0x1f83edb]
/rdsdbbin/mysql/bin/mysqld(trx_rollback_to_savepoint(trx_t*, trx_savept_t*)+0x22) [0x1f84162]
/rdsdbbin/mysql/bin/mysqld(trx_rollback_last_sql_stat_for_mysql(trx_t*)+0x31) [0x1f841a1]
/rdsdbbin/mysql/bin/mysqld() [0x1df2a00]
/rdsdbbin/mysql/bin/mysqld(ha_rollback_low(THD*, bool)+0xe7) [0xfbbf97]
/rdsdbbin/mysql/bin/mysqld(MYSQL_BIN_LOG::rollback(THD*, bool)+0xe4) [0x19db3e4]
/rdsdbbin/mysql/bin/mysqld(ha_rollback_trans(THD*, bool)+0x11c) [0xfbc1cc]
/rdsdbbin/mysql/bin/mysqld(trans_rollback_stmt(THD*)+0x29) [0xe9cf19]
/rdsdbbin/mysql/bin/mysqld(mysql_execute_command(THD*, bool)+0x2c7d) [0xdc1a5d]
/rdsdbbin/mysql/bin/mysqld(mysql_parse(THD*, Parser_state*, bool)+0x33f) [0xdc370f]
/rdsdbbin/mysql/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x2c1b) [0xdc67ab]
/rdsdbbin/mysql/bin/mysqld(do_command(THD*)+0x179) [0xdc7289]
/rdsdbbin/mysql/bin/mysqld() [0xec00c8]
/rdsdbbin/mysql/bin/mysqld() [0x21617f7]
/lib64/libpthread.so.0(+0x7dc5) [0x2b5a7881edc5]
/lib64/libc.so.6(clone+0x6d) [0x2b5a79663c9d]
[27 Aug 2019 15:52] MySQL Verification Team
Mr. huai,

Those truly look identical !!!!
[29 Aug 2019 1:49] wang huai
I think these two bugs are not same, because Bug #95993 was happend in mysqld main thread, and current_thd of main thread is null, so it is crash, but another one is not the main thread, it is the user thread. 

I have show the method of fixed(where thd main thread's current_thd is null), so I think they are not same.
[29 Aug 2019 12:25] MySQL Verification Team
Hi Mr. huai,

Let us know if you can repeat the crash with 8.0.17 ....

Then, we can take a look at it ....
[30 Aug 2019 10:05] wang huai
when I fixed it, the mysqld working well, so the crash not happend in the future, but as I known, we can infer and repeat it  from the crash running stack, and it is confirmed that, in the start phrase, the current_thd is null, so if used it(binlog_recover), it will certain crash.
[30 Aug 2019 12:40] MySQL Verification Team
Hi Mr. huai,

First of all, I did not understand much of what you wrote.

Second, this is a user thread, not main thread. Main thread only spawns user threads and does not execute any statements.

Third, when `current_thd` is NULL it is mostly the case when it is optimised away, id est, it is found in one of the registers.

Last, but not least, analysis of the original bug clearly shows that this one is a duplicate.
[30 Aug 2019 12:42] MySQL Verification Team
One more comment .....

A developer who fixed the original bug in 8.0.17 is very adamant that this one is a duplicate.

Hence, try 8.0.17.
[30 Aug 2019 13:39] Bernardo Perez
Hello Sinisa,

Bernardo Perez here. I made the comment of the previous stack and the other bug that I thought It could be related.

Thanks for confirming that they are duplicates and that my signature is the same as this bug. 

Regards,
[30 Aug 2019 13:42] MySQL Verification Team
Mr. Perez,

You are welcome .......