Bug #88313 MySQL Often crash and restart immediately
Submitted: 1 Nov 2017 9:33 Modified: 16 Nov 2017 9:29
Reporter: hao chen Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.6.37 OS:CentOS (CentOS release 6.6 (Final))
Assigned to: CPU Architecture:Any
Tags: crash

[1 Nov 2017 9:33] hao chen
Description:
I install MySQL-5.6.37 two month ago
I use mysql-5.6.37-linux-glibc2.12-x86_64.tar.gz
in recent times, There is no omen of some MySQL crash. 
system haven't crash message(/var/log/message).
follow message in MySQL error.log

2017-11-01 02:13:28 7f97a2862700  InnoDB: Assertion failure in thread 140289243490048 in file btr0pcur.cc line 432
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == buf_block_get_page_no(btr_pcur_get_block(cursor))
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
18:13:28 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=2569
max_threads=3000
thread_count=69
connection_count=69
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6444534 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7ab41d60
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f97a2861e18 thread_stack 0x40000
/usr/bin/mysqld(my_print_stacktrace+0x35)[0x8d6f55]
/usr/bin/mysqld(handle_fatal_signal+0x494)[0x664af4]
/lib64/libpthread.so.0[0x3540c0f710]
/lib64/libc.so.6(gsignal+0x35)[0x3540832625]
/lib64/libc.so.6(abort+0x175)[0x3540833e05]
/usr/bin/mysqld[0x9f1943]
/usr/bin/mysqld[0x994ce4]
/usr/bin/mysqld[0x997c25]
/usr/bin/mysqld[0x8f7c5c]
/usr/bin/mysqld(_ZN7handler13ha_index_nextEPh+0x66)[0x5a6aa6]
/usr/bin/mysqld(_ZN7handler15read_range_nextEv+0x29)[0x5a9169]
/usr/bin/mysqld(_ZN7handler21multi_range_read_nextEPPc+0xb2)[0x5a7f32]
/usr/bin/mysqld(_ZN10DsMrr_impl10dsmrr_nextEPPc+0x128)[0x5a9678]
/usr/bin/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x52)[0x7ef562]
/usr/bin/mysqld[0x8100eb]
/usr/bin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x179)[0x6c0339]
/usr/bin/mysqld(_ZN4JOIN4execEv+0x3da)[0x6bf39a]
/usr/bin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x250)[0x702760]
/usr/bin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x19f)[0x702fff]
/usr/bin/mysqld[0x6de36d]
/usr/bin/mysqld(_Z21mysql_execute_commandP3THD+0x1952)[0x6e1152]
/usr/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x378)[0x6e4598]
/usr/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x110e)[0x6e574e]
/usr/bin/mysqld(_Z24do_handle_one_connectionP3THD+0xcf)[0x6b1d4f]
/usr/bin/mysqld(handle_one_connection+0x47)[0x6b1e77]
/usr/bin/mysqld(pfs_spawn_thread+0x12d)[0xb0921d]
/lib64/libpthread.so.0[0x3540c079d1]
/lib64/libc.so.6(clone+0x6d)[0x35408e89dd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f9226516730): is an invalid pointer
Connection ID (thread ID): 11945073
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2017-11-01 02:13:29 54624 [Warning] Using unique option prefix myisam_recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2017-11-01 02:13:29 54624 [Note] Plugin 'FEDERATED' is disabled.
2017-11-01 02:13:29 7ff825cde720 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
2017-11-01 02:13:29 54624 [Note] InnoDB: Using atomics to ref count buffer pool pages
2017-11-01 02:13:29 54624 [Note] InnoDB: The InnoDB memory heap is disabled
2017-11-01 02:13:29 54624 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-11-01 02:13:29 54624 [Note] InnoDB: Memory barrier is not used
2017-11-01 02:13:29 54624 [Note] InnoDB: Compressed tables use zlib 1.2.3
2017-11-01 02:13:29 54624 [Note] InnoDB: Using Linux native AIO
2017-11-01 02:13:29 54624 [Note] InnoDB: Using CPU crc32 instructions
2017-11-01 02:13:29 54624 [Note] InnoDB: Initializing buffer pool, size = 70.0G
2017-11-01 02:13:32 54624 [Note] InnoDB: Completed initialization of buffer pool
2017-11-01 02:13:32 54624 [Note] InnoDB: Highest supported file format is Barracuda.
2017-11-01 02:13:32 54624 [Note] InnoDB: Log scan progressed past the checkpoint lsn 1369691345242
2017-11-01 02:13:32 54624 [Note] InnoDB: Database was not shutdown normally!
2017-11-01 02:13:32 54624 [Note] InnoDB: Starting crash recovery.
2017-11-01 02:13:32 54624 [Note] InnoDB: Reading tablespace information from the .ibd files...
2017-11-01 02:13:32 54624 [Note] InnoDB: Restoring possible half-written data pages 
2017-11-01 02:13:32 54624 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 1369696587776
InnoDB: Doing recovery: scanned up to log sequence number 1369696924694
InnoDB: Transaction 1704501027 was in the XA prepared state.
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 0 row operations to undo
InnoDB: Trx id counter is 1704501504
2017-11-01 02:13:33 54624 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed

How to repeat:
I can't repeat it
[1 Nov 2017 10:37] MySQL Verification Team
Thank you for the bug report. Please comment when you are able to provide a repeatable test case.
[16 Nov 2017 2:53] hao chen
today(2017-11-16) this MySQL crash again

21:15:54 UTC - mysqld got signal 4 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=2637
max_threads=3000
thread_count=56
connection_count=56
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6444534 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7c21bd10
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fd59fc70e18 thread_stack 0x40000
/usr/bin/mysqld(my_print_stacktrace+0x35)[0x8d6f55]
/usr/bin/mysqld(handle_fatal_signal+0x494)[0x664af4]
/lib64/libpthread.so.0[0x31fd40f7e0]
/lib64/libc.so.6(__wcscoll_l+0x709)[0x31fd099819]
/usr/bin/mysqld(my_net_write+0xe0)[0x65a7e0]
/usr/bin/mysqld(_ZN8Protocol5writeEv+0x1e)[0x65dade]
/usr/bin/mysqld(_ZN8Protocol24send_result_set_metadataEP4ListI4ItemEj+0x335)[0x65ecb5]
/usr/bin/mysqld(_Z17show_slave_statusP3THDP11Master_info+0x2f33)[0x8a3133]
/usr/bin/mysqld(_Z21mysql_execute_commandP3THD+0x1081)[0x6e0881]
/usr/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x378)[0x6e4598]
/usr/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x110e)[0x6e574e]
/usr/bin/mysqld(_Z24do_handle_one_connectionP3THD+0xcf)[0x6b1d4f]
/usr/bin/mysqld(handle_one_connection+0x47)[0x6b1e77]
/usr/bin/mysqld(pfs_spawn_thread+0x12d)[0xb0921d]
/lib64/libpthread.so.0[0x31fd407aa1]
/lib64/libc.so.6(clone+0x6d)[0x31fd0e8aad]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fd4e4004a70): is an invalid pointer
Connection ID (thread ID): 14393214
Status: NOT_KILLED
[16 Nov 2017 6:47] MySQL Verification Team
That last crash is signal 4 (SIGILL) which is illegal instruction. Unrelated to first crash report.
I think either the mysqld binary is corrupt or the server memory/cpu failing or something.

Check if you can see more with:

addr2line  --demangle --pretty-print --inlines --addresses --functions --basenames --exe=/usr/bin/mysqld 0x8d6f55 0x664af4 0x65a7e0 0x65dade 0x65ecb5 0x8a3133 0x6e0881 0x6e4598 0x6e574e 0x6b1d4f 0x6b1e77 0xb0921d
0x00000000008d6f55: my_print_stacktrace at stacktrace.c:227
0x0000000000664af4: handle_fatal_signal at signal_handler.cc:162
0x000000000065a7e0: my_net_write at net_serv.cc:323
0x000000000065dade: Protocol::write() at protocol.cc:818
0x000000000065ecb5: Protocol::send_result_set_metadata(List<Item>*, unsigned int) at protocol.cc:788
0x00000000008a3133: show_slave_status(THD*, Master_info*) at rpl_slave.cc:3084
0x00000000006e0881: mysql_execute_command(THD*) at sql_parse.cc:2857
0x00000000006e4598: mysql_parse(THD*, char*, unsigned int, Parser_state*) at sql_parse.cc:6439
0x00000000006e574e: Parser_state::reset(char*, unsigned int) at sql_lex.h:2849
 (inlined by) dispatch_command(enum_server_command, THD*, char*, unsigned int) at sql_parse.cc:1453
0x00000000006b1d4f: do_handle_one_connection(THD*) at sql_connect.cc:982
0x00000000006b1e77: handle_one_connection at sql_connect.cc:901
0x0000000000b0921d: pfs_spawn_thread at pfs.cc:1863

gdb /usr/bin/mysqld
disas 0x000000000065a7e0
(gdb) disas 0x000000000065a7e0
Dump of assembler code for function my_net_write(NET*, uchar const*, size_t):
   0x000000000065a700 <+0>:     push   %rbp
   0x000000000065a701 <+1>:     xor    %eax,%eax
   0x000000000065a703 <+3>:     mov    %rsp,%rbp
   0x000000000065a706 <+6>:     push   %r14
   0x000000000065a708 <+8>:     mov    %rsi,%r14
   0x000000000065a70b <+11>:    push   %r13
   0x000000000065a70d <+13>:    mov    %rdx,%r13
   0x000000000065a710 <+16>:    push   %r12
   0x000000000065a712 <+18>:    push   %rbx
   0x000000000065a713 <+19>:    mov    %rdi,%rbx
   0x000000000065a716 <+22>:    sub    $0x10,%rsp
   0x000000000065a71a <+26>:    cmpq   $0x0,(%rdi)
   0x000000000065a71e <+30>:    je     0x65a75e <my_net_write(NET*, uchar const*, size_t)+94>
   0x000000000065a720 <+32>:    cmp    $0xfffffe,%rdx
   0x000000000065a727 <+39>:    lea    -0x30(%rbp),%r12
   0x000000000065a72b <+43>:    jbe    0x65a79b <my_net_write(NET*, uchar const*, size_t)+155>
   0x000000000065a72d <+45>:    mov    0x60(%rbx),%eax
   0x000000000065a730 <+48>:    mov    $0x4,%edx
   0x000000000065a735 <+53>:    mov    %r12,%rsi
   0x000000000065a738 <+56>:    mov    %rbx,%rdi
   0x000000000065a73b <+59>:    movb   $0xff,-0x30(%rbp)
   0x000000000065a73f <+63>:    movb   $0xff,-0x2f(%rbp)
   0x000000000065a743 <+67>:    movb   $0xff,-0x2e(%rbp)
   0x000000000065a747 <+71>:    mov    %al,-0x2d(%rbp)
   0x000000000065a74a <+74>:    add    $0x1,%eax
   0x000000000065a74d <+77>:    mov    %eax,0x60(%rbx)
   0x000000000065a750 <+80>:    callq  0x65a5c0 <net_write_buff(NET*, uchar const*, ulong)>
   0x000000000065a755 <+85>:    test   %al,%al
   0x000000000065a757 <+87>:    je     0x65a770 <my_net_write(NET*, uchar const*, size_t)+112>
   0x000000000065a759 <+89>:    mov    $0x1,%eax
   0x000000000065a75e <+94>:    add    $0x10,%rsp
   0x000000000065a762 <+98>:    pop    %rbx
   0x000000000065a763 <+99>:    pop    %r12
   0x000000000065a765 <+101>:   pop    %r13
   0x000000000065a767 <+103>:   pop    %r14
   0x000000000065a769 <+105>:   leaveq
   0x000000000065a76a <+106>:   retq
   0x000000000065a76b <+107>:   nopl   0x0(%rax,%rax,1)
   0x000000000065a770 <+112>:   mov    $0xffffff,%edx
   0x000000000065a775 <+117>:   mov    %r14,%rsi
   0x000000000065a778 <+120>:   mov    %rbx,%rdi
   0x000000000065a77b <+123>:   callq  0x65a5c0 <net_write_buff(NET*, uchar const*, ulong)>
   0x000000000065a780 <+128>:   test   %al,%al
   0x000000000065a782 <+130>:   jne    0x65a759 <my_net_write(NET*, uchar const*, size_t)+89>
   0x000000000065a784 <+132>:   sub    $0xffffff,%r13
   0x000000000065a78b <+139>:   add    $0xffffff,%r14
   0x000000000065a792 <+146>:   cmp    $0xfffffe,%r13
   0x000000000065a799 <+153>:   ja     0x65a72d <my_net_write(NET*, uchar const*, size_t)+45>
   0x000000000065a79b <+155>:   mov    %r13d,%eax
   0x000000000065a79e <+158>:   mov    $0x4,%edx
   0x000000000065a7a3 <+163>:   mov    %r12,%rsi
   0x000000000065a7a6 <+166>:   shr    $0x8,%eax
   0x000000000065a7a9 <+169>:   mov    %rbx,%rdi
   0x000000000065a7ac <+172>:   mov    %r13b,-0x30(%rbp)
   0x000000000065a7b0 <+176>:   mov    %al,-0x2f(%rbp)
   0x000000000065a7b3 <+179>:   mov    %r13,%rax
   0x000000000065a7b6 <+182>:   shr    $0x10,%rax
   0x000000000065a7ba <+186>:   mov    %al,-0x2e(%rbp)
   0x000000000065a7bd <+189>:   mov    0x60(%rbx),%eax
   0x000000000065a7c0 <+192>:   mov    %al,-0x2d(%rbp)
   0x000000000065a7c3 <+195>:   add    $0x1,%eax
   0x000000000065a7c6 <+198>:   mov    %eax,0x60(%rbx)
   0x000000000065a7c9 <+201>:   callq  0x65a5c0 <net_write_buff(NET*, uchar const*, ulong)>
   0x000000000065a7ce <+206>:   test   %al,%al
   0x000000000065a7d0 <+208>:   jne    0x65a759 <my_net_write(NET*, uchar const*, size_t)+89>
   0x000000000065a7d2 <+210>:   mov    %r13,%rdx
   0x000000000065a7d5 <+213>:   mov    %r14,%rsi
   0x000000000065a7d8 <+216>:   mov    %rbx,%rdi
   0x000000000065a7db <+219>:   callq  0x65a5c0 <net_write_buff(NET*, uchar const*, ulong)>
>>  0x000000000065a7e0 <+224>:   test   %al,%al
   0x000000000065a7e2 <+226>:   setne  %al
   0x000000000065a7e5 <+229>:   add    $0x10,%rsp
   0x000000000065a7e9 <+233>:   pop    %rbx
   0x000000000065a7ea <+234>:   pop    %r12
   0x000000000065a7ec <+236>:   pop    %r13
   0x000000000065a7ee <+238>:   pop    %r14
   0x000000000065a7f0 <+240>:   leaveq
   0x000000000065a7f1 <+241>:   retq
End of assembler dump.
[16 Nov 2017 6:48] MySQL Verification Team
SHOW SLAVE STATUS;    does this always crash?
[16 Nov 2017 9:29] hao chen
Now, I can not run SHOW SLAVE STATUS \G. I will run it tonight

I have a phenomenon about 'mysqld' binary file

after uncompress tar.gz file 'mysqld' file size is defferent Run for a while.

after uncompress:
ls -l mysqld
-rwxr-xr-x 1 mysql dba 87208139 Jun  3 05:30 mysqld

Run for a while:
ls -l mysqld
-rwxr-xr-x 1 root root 87987531 Aug 15 16:42 mysqld