Bug #99679 stop group_replication will assert
Submitted: 25 May 2020 2:32
Modified: 26 May 2020 9:14
Reporter: phoenix Zhang (OCA)
Status: Can't repeat
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 8.0.18
OS: Any
Assigned to: MySQL Verification Team
CPU Architecture: Any
Tags: group_replication

[25 May 2020 2:32] phoenix Zhang
Description:
When running STOP GROUP_REPLICATION, mysqld hit an assertion failure. The error log shows the output below. I have not been able to reproduce it since.

2020-05-16T13:59:16.633916Z 2840600 [Note] [MY-011650] [Repl] Plugin group_replication reported: 'Plugin 'group_replication' is stopping.'
2020-05-16T13:59:16.634027Z 2840600 [Note] [MY-011647] [Repl] Plugin group_replication reported: 'Going to wait for view modification'
2020-05-16T13:59:16.634934Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Re-using server node 0 host localhost'
2020-05-16T13:59:16.634992Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Re-using server node 1 host localhost'
2020-05-16T13:59:16.635024Z 0 [Note] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Installed site start={c53fb 487 0} boot_key={c53fb 476 1} event_horizon=10 node 4294967295'
2020-05-16T13:59:20.093598Z 0 [Note] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'
mysqld: /opt/myrocks/include/thr_mutex.h:186: int my_mutex_lock(my_mutex_t*, const char*, uint): Assertion `mp->m_u.m_safe_ptr != __null' failed.
13:59:21 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7fc5d800e450
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fc8c43f7c18 thread_stack 0x46000
/opt/myrocks/DEBUG/runtime_output_directory/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x55) [0x4c740ee]
/opt/myrocks/DEBUG/runtime_output_directory/mysqld(handle_fatal_signal+0x2ce) [0x3a84f64]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390) [0x7fc96b1d8390]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38) [0x7fc9691ec428]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a) [0x7fc9691ee02a]
/lib/x86_64-linux-gnu/libc.so.6(+0x2dbd7) [0x7fc9691e4bd7]
/lib/x86_64-linux-gnu/libc.so.6(+0x2dc82) [0x7fc9691e4c82]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(+0x2a54e9) [0x7fc93834e4e9]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(+0x2a5796) [0x7fc93834e796]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(Certifier_broadcast_thread::terminate()+0x107) [0x7fc93834f303]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(Certifier::terminate()+0x65) [0x7fc938351b19]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(Certification_handler::handle_action(Pipeline_action*)+0x168) [0x7fc938393978]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(Event_handler::next(Pipeline_action*)+0x4a) [0x7fc938392d96]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(Event_cataloger::handle_action(Pipeline_action*)+0x23) [0x7fc938398eb5]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(Applier_module::applier_thread_handle()+0x951) [0x7fc938343609]
/opt/mysql-8.0.18/DEBUG/plugin_output_directory/group_replication.so(+0x2985b4) [0x7fc9383415b4]
/opt/mysql-8.0.18/DEBUG/runtime_output_directory/mysqld() [0x54625bf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fc96b1ce6ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fc9692be41d]
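
Several frames in the backtrace above carry only a module-relative offset, for example group_replication.so(+0x2a54e9), so the function that actually took the bad mutex is not named. A minimal sketch of how those frames could be turned into addr2line invocations, assuming binutils addr2line is installed and the same DEBUG binaries are still on disk. The helper name addr2line_commands is made up for illustration.

```python
import re

# Matches frames such as ".../group_replication.so(+0x2a54e9) [0x7fc93834e4e9]",
# i.e. frames that carry only a module-relative offset and no function name.
UNSYMBOLIZED = re.compile(
    r"^(?P<module>\S+)\(\+(?P<offset>0x[0-9a-fA-F]+)\)\s+\[0x[0-9a-fA-F]+\]$"
)


def addr2line_commands(backtrace: str) -> list[str]:
    """Return one addr2line command per unsymbolized frame, in stack order."""
    commands = []
    for line in backtrace.splitlines():
        match = UNSYMBOLIZED.match(line.strip())
        if match:
            # -f prints the function name, -C demangles C++ symbols,
            # -e selects the object file the offset belongs to.
            commands.append(
                f"addr2line -f -C -e {match['module']} {match['offset']}"
            )
    return commands
```

Feeding the pasted backtrace to addr2line_commands() and running the resulting commands should name the frames between Certifier_broadcast_thread::terminate() and the libc assert path, provided the binaries have not been rebuilt since the crash.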

It is a 3-node group_replication cluster, with all nodes on the same server using different ports.

The server was compiled from source with the following command:

cmake .. -DCMAKE_BUILD_TYPE=Debug -DWITH_ZLIB=system -DWITH_BOOST=../boost/

The config file of one node is:

$ cat my1.cnf
[mysqld]
basedir = /usr/local/mysql-8.0.18
datadir = /usr/local/mysql-8.0.18/data13000
socket=/usr/local/mysql-8.0.18/data13000/mysql.sock
port = 13000
log-bin=                    server-binary-log
relay-log=                  server-relay-log

binlog-checksum=            NONE
enforce-gtid-consistency
gtid-mode=                  on  

report-host=                127.0.0.1
report-user=                root

master-retry-count=         10  
skip-slave-start

## mgr config
loose-group_replication_start_on_boot= OFF 
loose-group_replication_single_primary_mode= ON
loose-group_replication_enforce_update_everywhere_checks= FALSE
loose-group_replication_recovery_get_public_key= TRUE
loose-group_replication_exit_state_action= READ_ONLY
loose-group_replication_consistency= BEFORE_AND_AFTER
loose-group_replication_local_address=127.0.0.1:33061
loose-group_replication_group_seeds=127.0.0.1:33061,127.0.0.1:33062,127.0.0.1:33063
loose-group_replication_group_name=aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaa
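
Since group_replication_group_name has to be a valid UUID, it is worth double-checking that the value pasted above was not truncated when it was copied into the report. This is unrelated to the assert itself. A minimal sketch of such a check follows; the function name is_canonical_uuid is hypothetical and the check only covers the textual 8-4-4-4-12 form.

```python
import re

# Canonical textual UUID: 8-4-4-4-12 hexadecimal digits, 36 characters in total.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_canonical_uuid(value: str) -> bool:
    """Return True if value, ignoring surrounding whitespace, is a canonical UUID string."""
    return UUID_PATTERN.fullmatch(value.strip()) is not None
```

Passing the right-hand side of the loose-group_replication_group_name line to is_canonical_uuid() shows whether the setting would be accepted as a UUID.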

How to repeat:
It can no longer be reproduced.
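
One way to try to provoke the assert again would be to cycle STOP GROUP_REPLICATION and START GROUP_REPLICATION many times on one member. A rough sketch, assuming the mysql client is on PATH, the socket path from my1.cnf above, and that the (guessed) user root can log in without a password prompt. All function names here are hypothetical, and nothing guarantees that this loop triggers the crash.

```python
import subprocess
from typing import Callable, Sequence

# Assumptions: socket path copied from my1.cnf; the login user is a guess.
SOCKET = "/usr/local/mysql-8.0.18/data13000/mysql.sock"
USER = "root"


def mysql_command(statement: str, socket: str = SOCKET, user: str = USER) -> list[str]:
    """Build a mysql client invocation that executes a single statement."""
    return ["mysql", f"--socket={socket}", f"--user={user}", "-e", statement]


def run_command(command: Sequence[str]) -> int:
    """Run the command and return its exit status."""
    return subprocess.run(command).returncode


def cycle_group_replication(
    iterations: int, run: Callable[[Sequence[str]], int] = run_command
) -> int:
    """Alternate STOP/START GROUP_REPLICATION; return the number of completed cycles."""
    for completed in range(iterations):
        for statement in ("STOP GROUP_REPLICATION", "START GROUP_REPLICATION"):
            # A non-zero exit status (for example a lost connection after a
            # server crash) ends the loop so the error log can be inspected.
            if run(mysql_command(statement)) != 0:
                return completed
    return iterations
```

If cycle_group_replication(1000) returns fewer than 1000, the error log of that member is the place to look for a new backtrace.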
[26 May 2020 1:52] MySQL Verification Team
Hi,

I cannot reproduce this. What type of hardware are you running this on? ECC registered memory? This looks like memory corruption to me, a hardware rather than a software error, but I can't be sure.

We'd need a bit more data and a way to reproduce this.
Bogdan
[26 May 2020 5:39] phoenix Zhang
Hi.

It's running in Docker, using the official Ubuntu 16.04 image. This happened by accident, and I cannot reproduce it any more.
[26 May 2020 9:14] MySQL Verification Team
Hi,

> This happened by accident, and I cannot reproduce it any more.

Nothing we can do about it then. This kind of error is, 99.9% of the time, a hardware error, a bit flip, or something similar. Without the ability to reproduce it, the provided info is not enough for us to do anything constructive :(

Thanks 
Bogdan