Bug #15047 server crash when compiling without transaction support
Submitted: 18 Nov 2005 12:00 Modified: 2 Dec 2005 20:30
Reporter: Kristian Nielsen
Status: Closed
Category:MySQL Server: Compiling Severity:S3 (Non-critical)
Version:Newest 5.0.17 from bk tree OS:x86 Debian Sarge
Assigned to: Ramil Kalimullin CPU Architecture:Any

[18 Nov 2005 12:00] Kristian Nielsen
I get server crashes on x86 Debian Sarge during mysql-test-run.pl.

The crashes happen seemingly randomly depending on compile and test options, but they are 100% repeatable for each combination.

Look at this gdb session:

(gdb) bt
#0  0x4010c861 in kill () from /lib/libc.so.6
#1  0x4002e2a9 in pthread_kill () from /lib/libpthread.so.0
#2  0x0828396b in write_core (sig=11) at stacktrace.c:220
#3  0x0815f208 in handle_segfault (sig=11) at mysqld.cc:2054
#4  0x40031fe1 in __pthread_sighandler () from /lib/libpthread.so.0
#5  0x4010c668 in killpg () from /lib/libc.so.6
#6  0x0822d7ae in ha_enable_transaction (thd=0x8551070, on=true)
    at handler.cc:1913
#7  0x0814dcdd in THD::init_for_queries (this=0x8551070) at sql_class.cc:319
#8  0x08172574 in handle_one_connection (arg=0x8551070) at sql_parse.cc:1139
#9  0x4002af3c in pthread_start_thread () from /lib/libpthread.so.0
#10 0x4019d8ba in clone () from /lib/libc.so.6
(gdb) frame 6
#6  0x0822d7ae in ha_enable_transaction (thd=0x8551070, on=true)
    at handler.cc:1913
1913        error= end_trans(thd, COMMIT);
(gdb) p/x thd->transaction.xid_state.xa_state
$4 = 0x85520f0

The value of thd->transaction.xid_state.xa_state looks like a pointer, where it should be a number between 0 and 3 (enum xa_states). I think it was overwritten earlier by some wild pointer error. The segfault actually happens in end_trans() in handler.cc, in this code:

  if (thd->transaction.xid_state.xa_state != XA_NOTR)
    my_error(ER_XAER_RMFAIL, MYF(0),

(The array lookup obviously fails when the index is that large.) I am not sure why the backtrace shows only the parent frame.
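
To illustrate why a pointer-sized value in xa_state is fatal once it is used as an array index, here is a minimal C++ sketch. The table `demo_state_names` and the helper `demo_state_name_checked` are hypothetical stand-ins invented for this example, not the server's code, and the strings are placeholders.

```cpp
#include <cstddef>

// Hypothetical name table with one entry per valid state value (0..3).
static const char *const demo_state_names[] = {
  "state0", "state1", "state2", "state3"
};

// Returns the name for a raw state value, or a marker string when the
// value lies outside the table. An unchecked demo_state_names[raw] with
// raw = 0x85520f0 would read far past the end of the array.
const char *demo_state_name_checked(unsigned long raw)
{
  const std::size_t count =
    sizeof(demo_state_names) / sizeof(demo_state_names[0]);
  return raw < count ? demo_state_names[raw] : "<out of range>";
}
```

With a valid value such as 1 the helper returns the table entry; with the value seen in gdb it returns the marker instead of dereferencing unmapped memory.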

How to repeat:
Compile on Debian Sarge, run mysql-test-run.pl.
[18 Nov 2005 20:55] Will Miles
I struck this one using the 5.0.15 embedded libmysqld and QNX Neutrino 6.3.0 SP1 and 2. I traced it to sql_class.h: the constructor for THD::st_transactions zeros and initializes the structure, but it is #ifdef'd out when USING_TRANSACTIONS is not defined. However, THD::init_for_queries calls ha_enable_transaction regardless of the setting of USING_TRANSACTIONS; this in turn calls end_trans, which depends on the (uninitialized) value stored in the st_transactions structure. Because it is reading uninitialized memory, sometimes it will succeed and sometimes it will fail.

I worked around this one by removing the #ifdef around the THD::st_transactions structure. I doubt this is the 'correct' fix; perhaps someone with a deeper knowledge of the transaction calls could add an #ifdef in the right place. Note of course that it will only happen if you compile WITHOUT Berkeley, Innobase, or NDB support (and thus USING_TRANSACTIONS is not defined).
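
The failure mode and the workaround described above can be sketched in isolation. This is a simplified illustration, not the committed patch: `DemoXaState`, `DemoTransactions`, and `demo_end_trans_rejects` are hypothetical stand-ins for enum xa_states, THD::st_transactions, and the check in end_trans. The point is only that the initializing constructor is compiled unconditionally, so the later check never reads an indeterminate value.

```cpp
// Simplified stand-in for enum xa_states; only XA_NOTR is named in the
// report, the other enumerators are placeholders.
enum DemoXaState
{
  DEMO_XA_NOTR = 0,
  DEMO_XA_OTHER_1,
  DEMO_XA_OTHER_2,
  DEMO_XA_OTHER_3
};

struct DemoTransactions
{
  DemoXaState xa_state;

  // In the reported build, initialization like this was compiled only
  // when the transaction macro was defined. Here it is unconditional,
  // which mirrors the workaround of removing the #ifdef.
  DemoTransactions() : xa_state(DEMO_XA_NOTR) {}
};

// Stand-in for the check in end_trans(): returns true when the state
// would be reported as an XA error, false when it is safe to continue.
bool demo_end_trans_rejects(const DemoTransactions &t)
{
  return t.xa_state != DEMO_XA_NOTR;
}
```

A freshly constructed `DemoTransactions` always passes the check, whereas a structure whose constructor was compiled out would pass or fail depending on whatever bytes happened to be in memory.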
[20 Nov 2005 15:13] Kristian Nielsen
Indeed, the problem appeared in a compile without transactional engines, and disappears when adding a transactional engine (--innodb).

I think this is fairly serious for anyone trying to compile a minimal server themselves, since the bug is difficult to track down; on the other hand, compiling MySQL yourself is discouraged anyway.

The analysis by Will Miles seems a good starting point for a fix.
[21 Nov 2005 7:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

[24 Nov 2005 14:38] Ramil Kalimullin
Fixed in 5.0.17.
[2 Dec 2005 20:30] Paul DuBois
Noted in 5.0.17 changelog.