Bug #86142 GR+XA: Assert `! is_set()' at sql_error.cc:406 when member is in ERROR/RECOVERING
Submitted: 30 Apr 2017 19:06 Modified: 13 Oct 2017 17:58
Reporter: Narendra Singh Chauhan
Status: Closed
Category: MySQL Server: Group Replication Severity: S3 (Non-critical)
Version: 5.7.19 OS: Any
Assigned to:
CPU Architecture: Any

[30 Apr 2017 19:06] Narendra Singh Chauhan
Description:
Scenario: Test commit/rollback of an XA transaction while the GR member is in the ERROR state or the RECOVERING state.
Expected output: The XA transaction should fail gracefully because the before_commit hook rejects it, i.e.
 Run function 'before_commit' in plugin 'group_replication' failed
Actual output: Assertion failure.

mysqld.1.err:-
==============
2017-04-30T18:26:02.367308Z 21 [ERROR] Plugin group_replication reported: 'Transaction cannot be executed while Group Replication is on ERROR state. Check for errors and restart the plugin'
2017-04-30T18:26:02.367328Z 21 [ERROR] Run function 'before_commit' in plugin 'group_replication' failed
mysqld: /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_error.cc:406: void Diagnostics_area::set_ok_status(ulonglong, ulonglong, const char*): Assertion `! is_set()' failed.
18:26:02 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=7
max_threads=151
thread_count=6
connection_count=6
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68251 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f45a4032b40
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f4611757dd0 thread_stack 0x40000
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(my_print_stacktrace+0x6b) [0x34cc8b2]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(handle_fatal_signal+0x763) [0x22364df]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x7f461915acb0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x7f4617f58035]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b) [0x7f4617f5b79b]
/lib/x86_64-linux-gnu/libc.so.6(+0x2ee1e) [0x7f4617f50e1e]
/lib/x86_64-linux-gnu/libc.so.6(+0x2eec2) [0x7f4617f50ec2]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(Diagnostics_area::set_ok_status(unsigned long long, unsigned long long, char const*)+0x9c) [0x1b56514]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(my_ok(THD*, unsigned long long, unsigned long long, char const*)+0x84) [0x1b46114]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(Sql_cmd_xa_commit::execute(THD*)+0xc6) [0x1e2275c]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(mysql_execute_command(THD*, bool)+0xd078) [0x1bfd580]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(mysql_parse(THD*, Parser_state*)+0xda2) [0x1c02255]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x2172) [0x1beb4e1]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld(do_command(THD*)+0xad5) [0x1be8b5b]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld() [0x2218259]
/Narendra/mysql_work/git_repo/mysql-trunk-2/install/bin/mysqld() [0x35303c7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f4619152e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f461801536d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f45a402edc0): XA COMMIT '1'
Connection ID (thread ID): 21
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file

gdb output:-
=============
Thread 1 (Thread 0x7f4611758700 (LWP 3638)):
#0  __pthread_kill (threadid=<optimized out>, signo=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:63
#1  0x00000000034ccaa9 in my_write_core () at /Narendra/mysql_work/git_repo/mysql-trunk-2/mysys/stacktrace.cc:291
#2  0x00000000022369f1 in handle_fatal_signal () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/signal_handler.cc:231
#3  <signal handler called>
#4  0x00007f4617f58035 in __GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#5  0x00007f4617f5b79b in __GI_abort () at abort.c:91
#6  0x00007f4617f50e1e in __assert_fail_base (fmt=<optimized out>, assertion=0x4636f3e "! is_set()", file=0x4636d58 "/Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_error.cc", line=<optimized out>, function=<optimized out>) at assert.c:94
#7  0x00007f4617f50ec2 in __GI___assert_fail (assertion=0x4636f3e "! is_set()", file=0x4636d58 "/Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_error.cc", line=406, function=0x4637500 "void Diagnostics_area::set_ok_status(ulonglong, ulonglong, const char*)") at assert.c:103
#8  0x0000000001b56514 in Diagnostics_area::set_ok_status(unsigned long long, unsigned long long, char const*) () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_error.cc:406
#9  0x0000000001b46114 in my_ok(THD*, unsigned long long, unsigned long long, char const*) () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_class.h:4137
#10 0x0000000001e2275c in Sql_cmd_xa_commit::execute(THD*) () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/xa.cc:539
#11 0x0000000001bfd580 in mysql_execute_command(THD*, bool) () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_parse.cc:4550
#12 0x0000000001c02255 in mysql_parse(THD*, Parser_state*) () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_parse.cc:5338
#13 0x0000000001beb4e1 in dispatch_command(THD*, COM_DATA const*, enum_server_command) () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_parse.cc:1593
#14 0x0000000001be8b5b in do_command(THD*) () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/sql_parse.cc:1180
#15 0x0000000002218259 in handle_connection () at /Narendra/mysql_work/git_repo/mysql-trunk-2/sql/conn_handler/connection_handler_per_thread.cc:322
#16 0x00000000035303c7 in pfs_spawn_thread () at /Narendra/mysql_work/git_repo/mysql-trunk-2/storage/perfschema/pfs.cc:2407
#17 0x00007f4619152e9a in start_thread (arg=0x7f4611758700) at pthread_create.c:308
#18 0x00007f461801536d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

#19 0x0000000000000000 in ?? ()

How to repeat:
Details:-
=========
Checked on MySQL Versions: 8.0.2 and 5.7.19

Steps to repro:-
================
Run command:- ./mtr group_replication.group_replication_xa_is_set.test

Where,
$ cat group_replication_xa_is_set.test 
--source include/have_debug.inc
--let $group_replication_group_name= aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa
--source ../inc/have_group_replication_plugin.inc
--let $rpl_server_count= 2
--source ../inc/group_replication.inc

--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--echo # Create t2 on server1 only, so that the replicated CREATE from server2 sends server1 to ERROR state later.

SET SQL_LOG_BIN=0;
CREATE TABLE test.t2 (a INT PRIMARY KEY);
SET SQL_LOG_BIN=1;

connect (server1_conn2, 127.0.0.1, root, ,test, $MASTER_MYPORT,);
--let $conn_id=`SELECT connection_id()`
CREATE TABLE test.t1 (a INT PRIMARY KEY);
XA START '1';
INSERT INTO t1 VALUES (1);
XA END '1';
XA PREPARE '1';
--disconnect server1_conn2

--let $rpl_connection_name= server2
--source include/rpl_connection.inc
CREATE TABLE test.t2 (a INT PRIMARY KEY);

--let $rpl_connection_name= server1
--source include/rpl_connection.inc
select * from performance_schema.replication_group_members;

--let $group_replication_member_state= ERROR
--source ../inc/gr_wait_for_member_state.inc
select * from performance_schema.replication_group_members;

--echo
--echo == CRASH HERE ==
XA COMMIT '1';

--die "OK TO STOP HERE"

[13 Oct 2017 17:58] Paul DuBois
Posted by developer:
 
Fixed in 8.0.4, 9.0.0.

For XA COMMIT, precommit handling could set an error in the
diagnostics area that was not reported correctly on the calling side,
causing an assertion to be raised.
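
The failure mode described above can be illustrated with a small standalone model. This is only a sketch, not the server code or the actual patch: DiagnosticsArea here is a simplified stand-in for the server's Diagnostics_area, and run_before_commit_hook() and xa_commit() are hypothetical names introduced for illustration. It assumes the fix amounts to the caller checking for an already-recorded error before reporting OK, which is what the changelog note suggests.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the server's diagnostics area (assumption: the
// status may be set only once per statement, as the `! is_set()' assert implies).
class DiagnosticsArea {
 public:
  enum class Status { Empty, Ok, Error };

  bool is_set() const { return status_ != Status::Empty; }
  bool is_error() const { return status_ == Status::Error; }
  const std::string &message() const { return message_; }

  void set_error_status(const std::string &msg) {
    status_ = Status::Error;
    message_ = msg;
  }

  // Mirrors the assertion from the crash: OK must not overwrite an
  // already-set status.
  void set_ok_status() {
    assert(!is_set());
    status_ = Status::Ok;
  }

 private:
  Status status_ = Status::Empty;
  std::string message_;
};

// Hypothetical pre-commit hook: records an error and returns true (failure)
// when the member is not ONLINE, e.g. in ERROR or RECOVERING state.
bool run_before_commit_hook(DiagnosticsArea &da, bool member_online) {
  if (!member_online) {
    da.set_error_status(
        "Run function 'before_commit' in plugin 'group_replication' failed");
    return true;
  }
  return false;
}

// Hypothetical XA COMMIT path. Returns true on failure. The error set by the
// hook is propagated to the caller instead of being followed by an OK status.
bool xa_commit(DiagnosticsArea &da, bool member_online) {
  const bool hook_failed = run_before_commit_hook(da, member_online);
  if (hook_failed || da.is_error()) return true;  // do not report OK
  da.set_ok_status();
  return false;
}
```

In this model, calling xa_commit() with member_online == false returns true and leaves the hook's error message in the diagnostics area. Without the early return, set_ok_status() would hit the same kind of assertion seen in the stack trace.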