Bug #47134 Crash on startup when XA support functions activated by a second engine
Submitted: 4 Sep 2009 14:32 Modified: 4 Sep 2009 20:39
Reporter: Paul McCullagh (Basic Quality Contributor) (SCA)
Status: Verified
Category:Server Severity:S1 (Critical)
Version:3.1.35+ OS:Any
Assigned to: Target Version:
Tags: qc, xa, crash, pbxt, startup, tc.log
Triage: Needs Triage: D1 (Critical)

[4 Sep 2009 14:32] Paul McCullagh
Description:
When the XA support functions are enabled by a second engine (i.e. an engine in addition
to InnoDB) MySQL crashes on startup:

#0	0x001d7607 in TC_LOG_MMAP::recover at log.cc:5282
#1	0x001d7d0e in TC_LOG_MMAP::open at log.cc:4965
#2	0x000fb658 in init_server_components at mysqld.cc:4020
#3	0x00100178 in main at mysqld.cc:4354

line:5282 if (*x && my_hash_insert(&xids, (uchar *)x))

Before this crash, the following assertion fails:

Line 1579, handler.cc:

DBUG_ASSERT(total_ha_2pc == (ulong) opt_bin_log+1); // only InnoDB and binlog

After disabling the assertion, the server crashes in startup as described above.

After removing the "tc.log" file MySQL will startup again without a crash.

How to repeat:
1. Install and build MySQL 5.1.35 or later.

$ cd /home/build/mysql-5.1.35
$ ./configure --prefix=/home/build/test --with-debug=full --with-plugins=max-no-ndb
$ make install

2. Checkout the following PBXT branch:

$ cd ~/repositories/pbxt
$ bzr branch lp:~paul-mccullagh/pbxt/xa-crash

3. Build PBXT for the MySQL version

$ cd ~/repositories/pbxt/xa-crash
$ ./configure --with-mysql=/home/build/mysql-5.1.35
$ make install

4. Run MySQL

At this point you should hit the failed assertion.

After commenting this out, and rebuilding the server should crash on startup as
described.

Suggested fix:
I have not investigated the problem enough to be able to tell if this bug means that XA
support for engines other than InnoDB has not been implemented, or if it simply has not
been tested.

Some feedback in this regard would be welcome.
[4 Sep 2009 15:33] Paul McCullagh
I have found the immediate source of the crash. In the function TC_LOG_MMAP::open() we
have:

Line: 4957, file: log.cc:

    pg->end=(my_xid *)(pg->start + tc_log_page_size);

The type of pg->start is (ulonglong *). The code is assuming the type is (char *). So it
is adding 8*tc_log_page_size to pg->start instead of 1*tc_log_page_size.

This makes me doubt the reliability of this code in general.
[4 Sep 2009 20:22] Miguel Solorzano
miguel@luz:~/dbs$ 5.1/libexec/mysqld
090904 15:20:27 [Note] Plugin 'FEDERATED' is disabled.
090904 15:20:27  InnoDB: Started; log sequence number 0 44233
090904 15:20:27 [Note] PrimeBase XT (PBXT) Engine 1.0.08d RC loaded...
090904 15:20:27 [Note] Paul McCullagh, PrimeBase Technologies GmbH,
http://www.primebase.org
mysqld: handler.cc:1586: int ha_recover(HASH*): Assertion `total_ha_2pc == (ulong)
opt_bin_log+1' failed.
090904 15:20:27 - mysqld got signal 6 ;
[4 Sep 2009 20:39] Miguel Solorzano
Thank you for the bug report. Verified against the last tree source 5.1.
[7 Oct 2009 4:14] Elena Stepanova
I am also getting a crash on 6.0.9; it disappears if i switch off pbxt_support_xa, so I
assume it is a related problem but it happens not on startup but after some queries, but
has somewhat different stack trace, so I thought it might be worth mentioning here as
well.

Stack trace:

#0  0x00002b492993eea3 in pthread_kill () from /lib64/libpthread.so.0
#1  0x000000000065ceec in handle_segfault (sig=11) at mysqld.cc:2671
#2  <signal handler called>
#3  0x00002b492993e51a in pthread_cond_signal@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#4  0x000000000070679c in TC_LOG_MMAP::sync (this=0xf11aa0) at log.cc:7062
#5  0x000000000070d080 in TC_LOG_MMAP::log_xid (this=0xf11aa0,
    thd=<value optimized out>, xid=<value optimized out>) at log.cc:7031
#6  0x000000000075d911 in ha_commit_trans (thd=0x1fa03e0,
    all=<value optimized out>) at handler.cc:1172
#7  0x00000000007f7f5c in trans_commit (thd=0x1fa03e0) at transaction.cc:132
#8  0x000000000066e000 in mysql_execute_command (thd=0x1fa03e0)
    at sql_parse.cc:4026
#9  0x0000000000671d07 in mysql_parse (thd=0x1fa03e0,
    inBuf=0x20201f8 "commit", length=6, found_semicolon=0x4c89a100)
    at sql_parse.cc:5735
#10 0x0000000000672606 in dispatch_command (command=COM_QUERY, thd=0x1fa03e0,
    packet=0x2010bd1 "commit", packet_length=<value optimized out>)
    at sql_parse.cc:1007
#11 0x00000000006730d2 in do_command (thd=0x1fa03e0) at sql_parse.cc:690
#12 0x0000000000666284 in handle_one_connection (arg=0x1fa03e0)
    at sql_connect.cc:1145
#13 0x00002b492993a143 in start_thread () from /lib64/libpthread.so.0
#14 0x00002b492a0738cd in clone () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

# Test case:
# (MySQL server 6.0.9, PBXT 1.0.09)

drop database if exists pbxt_innodb;
create database pbxt_innodb;
use pbxt_innodb;
create table t_pbxt ( i int not null auto_increment, primary key (i), f varchar (128) )
engine = pbxt;
create table t_inno ( i int not null auto_increment, primary key (i), f varchar (128) )
engine = innodb;
start transaction;
insert into t_pbxt (f) values ( 'abc'), ('def' );
insert into t_inno (f) values ( 'abc'), ('def' );
commit;