Bug #97387 mysql innodb cluster crash, mysql restart failed
Submitted: 26 Oct 2019 1:25 Modified: 30 May 2021 9:38
Reporter: black 无
Status: Can't repeat
Category:MySQL Server Severity:S1 (Critical)
Version:8.0.17 OS:CentOS (7.7)
Assigned to: CPU Architecture:Any

[26 Oct 2019 1:25] black 无
Description:
The cluster has 3 nodes. 2 of them crashed at the same time, and restarting them failed.

InnoDB: Progress in percents: 110:07:32 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7f3ad8a95980
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f453fcfeba0 thread_stack 0x46000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x1f104dd]
/usr/sbin/mysqld(handle_fatal_signal+0x333) [0xf8b933]
/lib64/libpthread.so.0(+0xf5f0) [0x7f4b91ada5f0]
/usr/sbin/mysqld(dtuple_convert_big_rec(dict_index_t*, upd_t*, dtuple_t*, unsigned long*)+0xb62) [0x2279e02]
/usr/sbin/mysqld(btr_cur_pessimistic_update(unsigned long, btr_cur_t*, unsigned long**, mem_block_info_t**, mem_block_info_t*, big_rec_t**, upd_t*, unsigned long, que_thr_t*, unsigned long, unsigned long, mtr_t*)+0x2ef) [0x21fc59f]
/usr/sbin/mysqld() [0x236f926]
/usr/sbin/mysqld() [0x2370047]
/usr/sbin/mysqld(row_undo_mod(undo_node_t*, que_thr_t*)+0xcbf) [0x237331f]
/usr/sbin/mysqld(row_undo_step(que_thr_t*)+0x60) [0x213bb40]
/usr/sbin/mysqld(que_run_threads(que_thr_t*)+0x988) [0x20cbb08]
/usr/sbin/mysqld(trx_rollback_or_clean_recovered(unsigned long)+0xbba) [0x219affa]
/usr/sbin/mysqld(trx_recovery_rollback_thread()+0x30) [0x219c020]
/usr/sbin/mysqld(std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)()> > >::_M_run()+0xb5) [0x2072995]
/usr/sbin/mysqld() [0x263cf3f]
/lib64/libpthread.so.0(+0x7e65) [0x7f4b91ad2e65]
/lib64/libc.so.6(clone+0x6d) [0x7f4b8fc5988d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 0
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

How to repeat:
unknown
[26 Oct 2019 12:05] MySQL Verification Team
Thank you for the bug report. We need a complete, repeatable test case to process this bug report. Please provide one here when you are able to, and also test the latest release.
[21 Feb 2021 23:12] Andrew Ernst
I experienced this same issue running MySQL 8.0.21 on CentOS 7.9 in a 3-node Group Replication cluster this past week.  As best as I can tell, a user performed a large and unbounded UPDATE statement, likely causing a spike in memory usage, and a number of on-disk temporary tables.

The Linux kernel OOM killed the mysqld process, and upon attempting to restart, crash recovery failed.

The mysqld service entered a crash loop.  I did attempt to use innodb_force_recovery=1  with no success (same behavior). 
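
For context, the documented innodb_force_recovery levels are cumulative, and level 1 only tells the server to keep running when it detects a corrupt page. The sketch below lists the six levels by their documented constant names and looks up the level that skips the rollback of recovered transactions, which is the code path where the backtrace in this report crashes. The names `FORCE_RECOVERY_LEVELS` and `level_for` are hypothetical, the descriptions are paraphrased, and whether a higher level would actually let this server start is not known from this report.

```python
# innodb_force_recovery levels by documented constant name. Each level also
# includes the effects of every lower level. Descriptions are paraphrased.
FORCE_RECOVERY_LEVELS = {
    1: ("SRV_FORCE_IGNORE_CORRUPT", "keep running even if a corrupt page is detected"),
    2: ("SRV_FORCE_NO_BACKGROUND", "do not start the master and purge threads"),
    3: ("SRV_FORCE_NO_TRX_UNDO", "do not roll back transactions after crash recovery"),
    4: ("SRV_FORCE_NO_IBUF_MERGE", "do not run change buffer merge operations"),
    5: ("SRV_FORCE_NO_UNDO_LOG_SCAN", "do not look at undo logs when starting"),
    6: ("SRV_FORCE_NO_LOG_REDO", "do not apply the redo log during recovery"),
}


def level_for(constant_name):
    """Hypothetical helper: return the numeric level for a constant name."""
    for level, (name, _description) in sorted(FORCE_RECOVERY_LEVELS.items()):
        if name == constant_name:
            return level
    raise KeyError(constant_name)


# The crash happens while rolling back recovered transactions, so the level
# of interest is the one that disables that rollback.
level = level_for("SRV_FORCE_NO_TRX_UNDO")
print(f"innodb_force_recovery={level}: {FORCE_RECOVERY_LEVELS[level][1]}")
```

Levels 4 and above are documented as potentially causing permanent data corruption, so any such setting is normally used only long enough to dump the data.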

This was the output of the startup, and I'll see if I can augment this report with a more reproducible scenario -- however this is the first time I've seen this behavior, and am not sure I can reproduce.

InnoDB: Progress in percents: 123:15:38 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7fc4a8e16cc0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fd2576febe0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x213b8ad]
/usr/sbin/mysqld(handle_fatal_signal+0x313) [0xff5083]
/lib64/libpthread.so.0(+0xf630) [0x7fd7306db630]
/usr/sbin/mysqld(dtuple_convert_big_rec(dict_index_t*, upd_t*, dtuple_t*)+0xb35) [0x251f1e5]
/usr/sbin/mysqld(btr_cur_pessimistic_update(unsigned long, btr_cur_t*, unsigned long**, mem_block_info_t**, mem_block_info_t*, big_rec_t**, upd_t*, unsigned long, que_thr_t*, unsigned long, unsigned long, mtr_t*, btr_pcur_t*)+0x6b1) [0x2491aa1]
/usr/sbin/mysqld() [0x26315d9]
/usr/sbin/mysqld() [0x2631c50]
/usr/sbin/mysqld(row_undo_mod(undo_node_t*, que_thr_t*)+0xdbf) [0x26352ef]
/usr/sbin/mysqld(row_undo_step(que_thr_t*)+0x59) [0x23b8fc9]
/usr/sbin/mysqld(que_run_threads(que_thr_t*)+0x970) [0x233db20]
/usr/sbin/mysqld(trx_rollback_or_clean_recovered(unsigned long)+0xf87) [0x2425717]
/usr/sbin/mysqld(trx_recovery_rollback_thread()+0x31) [0x2426561]
/usr/sbin/mysqld(std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)()> > >::_M_run()+0xb5) [0x22da985]
/usr/sbin/mysqld() [0x2958130]
/lib64/libpthread.so.0(+0x7ea5) [0x7fd7306d3ea5]
/lib64/libc.so.6(clone+0x6d) [0x7fd72eab69fd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): Connection ID (thread ID): 0
Status: NOT_KILLED
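
The two backtraces in this report can be compared by function name, since offsets and addresses differ between the 8.0.17 and 8.0.21 builds. The following is a minimal Python sketch and not part of either reporter's analysis: `frame_functions` and the constant names are hypothetical, and only a few frames from each trace are reproduced.

```python
import re

# Matches frames of the form: /path/to/binary(symbol+0xOFFSET) [0xADDRESS]
FRAME_RE = re.compile(r"^(?P<binary>\S+?)\((?P<symbol>.*)\) \[0x[0-9a-f]+\]$")


def frame_functions(trace):
    """Return the function name of each frame, or None when unresolved."""
    names = []
    for line in trace.strip().splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # not a stack frame line
        # Drop the trailing "+0x..." offset, then cut at the argument list.
        # Templated names containing "(" would be truncated by this cut.
        symbol = re.sub(r"\+0x[0-9a-f]+$", "", match.group("symbol"))
        name = symbol.split("(", 1)[0]
        names.append(name or None)
    return names


# A few frames copied from the 8.0.17 trace in the original description.
TRACE_8_0_17 = """
/usr/sbin/mysqld(handle_fatal_signal+0x333) [0xf8b933]
/lib64/libpthread.so.0(+0xf5f0) [0x7f4b91ada5f0]
/usr/sbin/mysqld(dtuple_convert_big_rec(dict_index_t*, upd_t*, dtuple_t*, unsigned long*)+0xb62) [0x2279e02]
/usr/sbin/mysqld() [0x236f926]
/usr/sbin/mysqld(row_undo_mod(undo_node_t*, que_thr_t*)+0xcbf) [0x237331f]
/usr/sbin/mysqld(trx_recovery_rollback_thread()+0x30) [0x219c020]
"""

# The corresponding frames from the 8.0.21 trace.
TRACE_8_0_21 = """
/usr/sbin/mysqld(handle_fatal_signal+0x313) [0xff5083]
/lib64/libpthread.so.0(+0xf630) [0x7fd7306db630]
/usr/sbin/mysqld(dtuple_convert_big_rec(dict_index_t*, upd_t*, dtuple_t*)+0xb35) [0x251f1e5]
/usr/sbin/mysqld() [0x26315d9]
/usr/sbin/mysqld(row_undo_mod(undo_node_t*, que_thr_t*)+0xdbf) [0x26352ef]
/usr/sbin/mysqld(trx_recovery_rollback_thread()+0x31) [0x2426561]
"""

same_path = frame_functions(TRACE_8_0_17) == frame_functions(TRACE_8_0_21)
print("same call path:", same_path)
```

For these frames both traces resolve to the same sequence of functions, ending in dtuple_convert_big_rec reached from trx_recovery_rollback_thread, which is consistent with both reports describing the same crash during rollback of recovered transactions.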
[21 Feb 2021 23:15] Andrew Ernst
I should also point out that in this 3-node cluster -- much like the original reporter of the ticket, two of my hosts ended up in the same state, while the third server just reported that it was still holding a SECONDARY role in a cluster where one node had been evicted and the primary node was in UNKNOWN status.
[18 May 2021 0:39] lou shuai
Hi @black 无,

Can you give me the download link for the 8.0.17 build you are using?
Is it the Linux generic package or the Red Hat RPM?
[18 May 2021 3:03] lou shuai
Hi @Andrew Ernst,

Are you using the community version or the commercial version?
If it is the community version, which package are you using: the Linux generic build or the Red Hat 7 build? It would be best to give me the exact download link so that I can analyze it.

Thanks!
[30 May 2021 9:38] black 无
I use "mysql80-community-release-el7-3.noarch.rpm"
yum install mysql-community-server