Bug #88182 crash during shutdown when start GR executed if max connections already opened
Submitted: 23 Oct 2017 7:15 Modified: 13 Dec 2017 15:16
Reporter: Ramana Yeruva Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Group Replication Severity:S3 (Non-critical)
Version:8.0.4 OS:Any
Assigned to: CPU Architecture:Any

[23 Oct 2017 7:15] Ramana Yeruva
Description:
This bug is observed accidentally while reproducing Bug#26771530. if shutdown executed when start group_replication command executed if max_connections already opened in server,then observing crash

How to repeat:
initialize database:
./mysqld --no-defaults -uroot --basedir=../ --datadir=./data --gdb --enforce_gtid_consistency=ON --gtid_mode=ON --log_bin=1 --log_slave_updates=ON --master_info_repository=TABLE --relay_log_info_repository=TABLE --transaction_write_set_extraction=XXHASH64 --plugin-load=authentication_pam.so --server-id=1 --binlog_checksum=NONE --binlog_format=ROW --transaction_write_set_extraction=XXHASH64 --loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" --loose-group_replication_start_on_boot=off --loose-group_replication_local_address="127.0.0.1:24901" --loose-group_replication_group_seeds="127.0.0.1:24901,127.0.0.1:24902,127.0.0.1:24903" --loose-group_replication_bootstrap_group=off --loose-group_replication_single_primary_mode=FALSE --gdb --log-error=$PWD/mysql1.err --initialize-insecure

start server:
./mysqld --no-defaults -uroot --basedir=../ --datadir=./data --gdb --enforce_gtid_consistency=ON --gtid_mode=ON --log_bin=1 --log_slave_updates=ON --master_info_repository=TABLE --relay_log_info_repository=TABLE --transaction_write_set_extraction=XXHASH64 --plugin-load=authentication_pam.so --server-id=1 --binlog_checksum=NONE --binlog_format=ROW --transaction_write_set_extraction=XXHASH64 --loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" --loose-group_replication_start_on_boot=off --loose-group_replication_local_address="127.0.0.1:24901" --loose-group_replication_group_seeds="127.0.0.1:24901,127.0.0.1:24902,127.0.0.1:24903" --loose-group_replication_bootstrap_group=off --loose-group_replication_single_primary_mode=FALSE --gdb --log-error=$PWD/mysql1.err

now execute below commands from one mysql terminal:
set password = 'root';
set @@global.max_connections=10;
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
SET GLOBAL group_replication_bootstrap_group=ON;

open more than 10 connections with below script:
cat log_con1.sh 
for i in {1..4}
do
    echo "output: $i"
    ./mysql -uroot -proot -e "select version();select sleep(100);" &
    ./mysql -uroot -proot -e "select version();select sleep(100);" &
    ./mysql -uroot -proot -e "select version();select sleep(100);" &
done

now from terminal one,execute start group_replication command and execute shutdown command.observer server crashes:

2017-10-23T07:11:35.614064Z 6 [Note] [000000] Plugin group_replication reported: 'Current debug options are: 'GCS_DEBUG_NONE'.'
2017-10-23T07:11:35.614225Z 6 [ERROR] [000000] Plugin group_replication reported: 'Can't establish a internal server connection to execute plugin operations'
2017-10-23T07:11:35.614249Z 6 [Note] [000000] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member'
2017-10-23T07:11:35.614258Z 6 [ERROR] [000000] Plugin group_replication reported: 'Error calling group communication interfaces while trying to leave the group'
2017-10-23T07:11:47.039306Z 0 [Note] [003690] Giving 11 client threads a chance to die gracefully
2017-10-23T07:11:47.039376Z 0 [Note] [003740] Shutting down slave threads
2017-10-23T07:11:47.039396Z 0 [Note] [003677] Event Scheduler: Killing the scheduler thread, thread id 4
2017-10-23T07:11:47.039410Z 0 [Note] [003673] Event Scheduler: Waiting for the scheduler thread to reply
2017-10-23T07:11:47.039520Z 0 [Note] [003672] Event Scheduler: Stopped
2017-10-23T07:11:49.039602Z 0 [Note] [003741] Forcefully disconnecting 0 remaining clients
2017-10-23T07:11:49.039641Z 0 [Note] [003666] Event Scheduler: Purging the queue. 0 events
2017-10-23T07:11:49.040349Z 0 [Note] [000000] InnoDB: FTS optimize thread exiting.
2017-10-23T07:11:49.141499Z 0 [Note] [003743] Binlog end
2017-10-23T07:11:49.142719Z 0 [Note] [004356] Shutting down plugin 'group_replication'
2017-10-23T07:11:49.142779Z 0 [Note] [000000] Plugin group_replication reported: 'All Group Replication server observers have been successfully unregistered'
2017-10-23T07:11:49.142800Z 0 [ERROR] [004095] Plugin group_replication did not deinitialize 1 threads
2017-10-23T07:11:49.142810Z 0 [ERROR] [004090] Killing thread 139727433148160

^C
[1]+  Segmentation fault      (core dumped) ./mysqld --no-defaults -uroot --basedir=../ --datadir=./data --gdb --enforce_gtid_consistency=ON --gtid_mode=ON --log_bin=1 --log_slave_updates=ON --master_info_repository=TABLE --relay_log_info_repository=TABLE --transaction_write_set_extraction=XXHASH64 --plugin-load=authentication_pam.so --server-id=1 --binlog_checksum=NONE --binlog_format=ROW --transaction_write_set_extraction=XXHASH64 --loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" --loose-group_replication_start_on_boot=off --loose-group_replication_local_address="127.0.0.1:24901" --loose-group_replication_group_seeds="127.0.0.1:24901,127.0.0.1:24902,127.0.0.1:24903" --loose-group_replication_bootstrap_group=off --loose-group_replication_single_primary_mode=FALSE --gdb --log-error=$PWD/mysql1.err

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./mysqld --no-defaults -uroot --basedir=../ --datadir=./data --gdb --enforce_gt'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f1505cc1a91 in pthread_cancel () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install audit-libs-2.6.5-3.el7_3.1.x86_64 glibc-2.17-196.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.11.3-49.el7.x86_64 libaio-0.3.109-12.el7.x86_64 libcap-ng-0.7.3-5.el7.x86_64 libcom_err-1.42.9-4.el7.x86_64 libgcc-4.8.2-16.2.el7_0.x86_64 libselinux-2.5-6.el7.x86_64 libstdc++-4.8.2-16.2.el7_0.x86_64 nss-softokn-freebl-3.16.2.3-1.el7_0.x86_64 numactl-libs-2.0.9-2.el7.x86_64 openssl-libs-1.0.2k-8.0.1.el7.x86_64 pam-1.1.8-9.el7.x86_64 pcre-8.32-12.el7.x86_64 zlib-1.2.7-13.el7.x86_64
(gdb) bt
#0  0x00007f1505cc1a91 in pthread_cancel () from /lib64/libpthread.so.0
#1  0x0000000001821c3d in my_thread_cancel (thread=<optimized out>) at ../../mysqlcom-pro-8.0.4-rc/mysys/my_thread.cc:119
#2  0x0000000000e0fb4f in kill (plugin=0x71f4288, this=0x2e96260 <server_session_threads>) at ../../mysqlcom-pro-8.0.4-rc/sql/srv_session.cc:261
#3  Srv_session::check_for_stale_threads (plugin=plugin@entry=0x71f4288) at ../../mysqlcom-pro-8.0.4-rc/sql/srv_session.cc:798
#4  0x0000000000b477e0 in plugin_deinitialize (plugin=plugin@entry=0x71f4288, ref_check=ref_check@entry=true) at ../../mysqlcom-pro-8.0.4-rc/sql/sql_plugin.cc:1081
#5  0x0000000000b49f45 in reap_plugins () at ../../mysqlcom-pro-8.0.4-rc/sql/sql_plugin.cc:1146
#6  0x0000000000b4bc5d in plugin_shutdown () at ../../mysqlcom-pro-8.0.4-rc/sql/sql_plugin.cc:1929
#7  0x0000000000a6d0d9 in clean_up (print_message=print_message@entry=true) at ../../mysqlcom-pro-8.0.4-rc/sql/mysqld.cc:2000
#8  0x0000000000a7a58a in mysqld_main (argc=<optimized out>, argv=<optimized out>) at ../../mysqlcom-pro-8.0.4-rc/sql/mysqld.cc:6186
#9  0x00007f1503fd1c05 in __libc_start_main () from /lib64/libc.so.6
#10 0x0000000000a646d5 in _start ()

bug verified with below commit on 5.7 and 8.0 branches:
cat ../docs/INFO_SRC 
commit: a7f8955df172ed2bc8af10ed383dbd450378df97
date: 2017-10-20 19:56:43 +0200
build-date: 2017-10-21 17:30:34 +0200
short: a7f8955
branch: mysql-8.0

MySQL source 8.0.4

cat ../../../5.7.21/mysql-advanced/docs/INFO_SRC 
commit: b4de6344d18962a7c63d085c1de009295ed99a86
date: 2017-10-20 19:03:55 +0200
build-date: 2017-10-21 18:01:13 +0200
short: b4de634
branch: mysql-5.7

MySQL source 5.7.21
[13 Dec 2017 15:16] David Moss
Posted by developer:
 
Thank you for your feedback, this has been fixed in upcoming versions and the following was added to the 5.7.21  / 8.0.4 change logs:
Group Replication executes internal operations on the server during start and stop, such as enabling or disabling read only mode, using an internal session. When this internal session was opened, if the total number of sessions exceeded the number of permitted open sessions set by max_connections, the operation was failing as expected but a thread was left behind, which later would cause issues.
[1 Feb 2018 14:03] David Moss
Now published.