Bug #28323 Server crashed in xid cache operations
Submitted: 9 May 2007 3:03 Modified: 10 Nov 2008 17:57
Reporter: Yuan WANG Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S2 (Serious)
Version:5.0.38, 5.0.68, 5.1.28 OS:Any
Assigned to: Davi Arnaut CPU Architecture:Any
Tags: crash, xa, xid

[9 May 2007 3:03] Yuan WANG
Description:
This bug is a refresh of #27871.  The title and comments of #27871 are misleading for we didn't know the cause of the problem at that time. After analyzing the coredumps generated by MySQL debug version, the problem becomes more clear. So we think it's better to file another bug report.

We are a blog service provoder. We use multiple MySQL servers in our application. These days, some of our MySQL servers crashed several times. At the worst, a server can crash 5 times a day. The hardware setting of our server is:
  Box: Dell 2950
  CPU: Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
  MEM: 8G
  Disk: Ultra SCISI 320 (Raid 0+1)
  OS: 2.6.16-2-em64t-p4-smp x86_64 GNU/Linux

We can not find a way to repeat the problem definitely. However based on the coredumps, we think the problem should be a defect in xid cache management codes.  Some typical stacktraces from these coredumps a listed as follows.

Stacetrace 1:
#0  0x00002b83d78d9797 in pthread_kill () from /lib/libpthread.so.0
(gdb) WHERE
#0  0x00002b83d78d9797 in pthread_kill () from /lib/libpthread.so.0
#1  0x00000000005aefc5 in handle_segfault (sig=11) at mysqld.cc:2132
#2  <signal handler called>
#3  movelink (array=0x146a0de0, find=259, next_link=4294967295, newlink=412) at
hash.c:272
#4  0x00000000008c782a in my_hash_insert (info=0xd25920, record=0x2aaaabbb7e18
"\023\231") at hash.c:427
#5  0x000000000059de01 in xid_cache_insert (xid_state=0x2aaaabbb7e18) at
sql_class.cc:2186
#6  0x00000000005ca316 in mysql_execute_command (thd=0x2aaaabbb6060) at
sql_parse.cc:4838
#7  0x00000000005cb24c in mysql_parse (thd=0x2aaaabbb6060, inBuf=0x111c92b0 "XA
START 0x31353831323036383237323633343537303738,0x33,0x9913", 
    length=<value optimized out>) at sql_parse.cc:5834
#8  0x00000000005cb73b in dispatch_command (command=<value optimized out>,
thd=0x2aaaabbb6060, packet=<value optimized out>, 
    packet_length=<value optimized out>) at sql_parse.cc:1775
#9  0x00000000005cca82 in do_command (thd=0x2aaaabbb6060) at sql_parse.cc:1557
#10 0x00000000005cd7ad in handle_one_connection (arg=<value optimized out>) at
sql_parse.cc:1188
#11 0x00002b83d78d5f1a in start_thread () from /lib/libpthread.so.0
#12 0x00002b83d8188602 in clone () from /lib/libc.so.6
#13 0x0000000000000000 in ?? ()

Stacetrace 2:
#0  0x00002abe4b515797 in pthread_kill () from /lib/libpthread.so.0
#1  0x00000000005aefc5 in handle_segfault (sig=11) at mysqld.cc:2132
#2  <signal handler called>
#3  0x00000000008ec4d9 in my_hash_sort_bin (cs=<value optimized out>,
key=0xd0e422a1d9b79400 <Address 0xd0e422a1d9b79400 out of bounds>, 
    len=3652686848, nr1=0x5085a6f0, nr2=0x5085a6e8) at ctype-bin.c:283
#4  0x00000000008c65fe in calc_hash (hash=<value optimized out>,
key=0x2aaaaba385e0 "", length=3652686848) at hash.c:45
#5  0x00000000008c6940 in hash_rec_mask (hash=0xd25920, pos=<value optimized
out>, buffmax=512, maxlength=290) at hash.c:171
#6  0x00000000008c6e27 in hash_first (hash=0xd25920, key=0x15c7bbc8 "\023",
length=<value optimized out>, current_record=0x5085a7b4)
    at hash.c:226
#7  0x00000000008c6eee in hash_search (hash=0x1bfa20, key=0x2aaaaba385e0 "",
length=3652686848) at hash.c:192
#8  0x000000000059de6f in xid_cache_search (xid=0x15c7bbc0) at
sql_class.cc:2154
#9  0x00000000005ca0c0 in mysql_execute_command (thd=0x2aaaac8a1d50) at
sql_parse.cc:4830
#10 0x00000000005cb24c in mysql_parse (thd=0x2aaaac8a1d50, inBuf=0x15c7b9e0 "XA
START 0x31363231323036383236383937313938393937,0x32,0x9913", 
    length=<value optimized out>) at sql_parse.cc:5834
#11 0x00000000005cb73b in dispatch_command (command=<value optimized out>,
thd=0x2aaaac8a1d50, packet=<value optimized out>, 
    packet_length=<value optimized out>) at sql_parse.cc:1775
#12 0x00000000005cca82 in do_command (thd=0x2aaaac8a1d50) at sql_parse.cc:1557
#13 0x00000000005cd7ad in handle_one_connection (arg=<value optimized out>) at
sql_parse.cc:1188
#14 0x00002abe4b511f1a in start_thread () from /lib/libpthread.so.0
#15 0x00002abe4bdc4602 in clone () from /lib/libc.so.6
#16 0x0000000000000000 in ?? ()

Stacktrace 3:
#0  0x00002adfeced3797 in pthread_kill () from /lib/libpthread.so.0
#1  0x00000000005aefc5 in handle_segfault (sig=11) at mysqld.cc:2132
#2  <signal handler called>
#3  movelink (array=0x138aab50, find=0, next_link=4294967295, newlink=1) at
hash.c:272
#4  0x00000000008c746b in hash_delete (hash=0xd25920, record=0x2aaab7740388
"��������\023") at hash.c:497
#5  0x000000000059b775 in xid_cache_delete (xid_state=0x2aaab7740388) at
sql_class.cc:2195
#6  0x00000000005c9c89 in mysql_execute_command (thd=0x2aaab773e5d0) at
sql_parse.cc:4935
#7  0x00000000005cb24c in mysql_parse (thd=0x2aaab773e5d0, inBuf=0x13452400 "XA
COMMIT 0x31363331323036383236353139323435373333,0x31,0x9913", 
    length=<value optimized out>) at sql_parse.cc:5834
#8  0x00000000005cb73b in dispatch_command (command=<value optimized out>,
thd=0x2aaab773e5d0, packet=<value optimized out>, 
    packet_length=<value optimized out>) at sql_parse.cc:1775
#9  0x00000000005cca82 in do_command (thd=0x2aaab773e5d0) at sql_parse.cc:1557
#10 0x00000000005cd7ad in handle_one_connection (arg=<value optimized out>) at
sql_parse.cc:1188
#11 0x00002adfececff1a in start_thread () from /lib/libpthread.so.0
#12 0x00002adfed782602 in clone () from /lib/libc.so.6
#13 0x0000000000000000 in ?? ()

How to repeat:
Can not find a way to repeat the problem definitely, however it occurs several times a day in our production system.
[9 May 2007 13:14] Yuan WANG
More information from one of the coredumps.

In one coredump, their are 6 records in xid cache:
(gdb) p *info
$34 = {key_offset = 0, key_length = 0, records = 6, blength = 8, flags = 0, array = {buffer = 0xc9ba30 "����", elements = 7, 
    max_element = 100, alloc_increment = 200, size_of_element = 16}, get_key = 0x5936f0 <xid_get_hash_key>, free = 0x591da0 <xid_free_hash>, 
  charset = 0xc4a780}

The contents of each items are:

(gdb) p *(XID_STATE*)((HASH_LINK*)info->array.buffer + 0)->data
$10 = {xid = {formatID = 39187, gtrid_length = 19, bqual_length = 1, data = "16312069782216844122", '\0' <repeats 107 times>}, 
  xa_state = XA_PREPARED, in_thd = true}
(gdb) p *(XID_STATE*)((HASH_LINK*)info->array.buffer + 1)->data
$11 = {xid = {formatID = 39187, gtrid_length = 18, bqual_length = 1, data = "4712069755377316822", '\0' <repeats 108 times>}, 
  xa_state = XA_ACTIVE, in_thd = true}
(gdb) p *(XID_STATE*)((HASH_LINK*)info->array.buffer + 2)->data
$12 = {xid = {formatID = 39187, gtrid_length = 18, bqual_length = 1, data = "4412069782217653331", '\0' <repeats 108 times>}, 
  xa_state = XA_PREPARED, in_thd = true}
(gdb) p *(XID_STATE*)((HASH_LINK*)info->array.buffer + 3)->data
$13 = {xid = {formatID = 39187, gtrid_length = 19, bqual_length = 1, data = "16212069782219078521", '\0' <repeats 107 times>}, 
  xa_state = XA_ACTIVE, in_thd = true}
(gdb) p *(XID_STATE*)((HASH_LINK*)info->array.buffer + 4)->data
$14 = {xid = {formatID = 1, gtrid_length = 24, bqual_length = 0, 
    data = "MySQLXid\001\000\000\000\000\000\000\000I\024K", '\0' <repeats 108 times>}, xa_state = XA_ACTIVE, in_thd = true}
(gdb) p *(XID_STATE*)((HASH_LINK*)info->array.buffer + 5)->data
$15 = {xid = {formatID = 39187, gtrid_length = 19, bqual_length = 1, data = "16312069782217622371", '\0' <repeats 107 times>}, 
  xa_state = XA_PREPARED, in_thd = true}
(gdb) p *(XID_STATE*)((HASH_LINK*)info->array.buffer + 6)->data
$16 = {xid = {formatID = 39187, gtrid_length = 19, bqual_length = 1, data = "16412069781914165620", '\0' <repeats 107 times>}, 
  xa_state = XA_ACTIVE, in_thd = true}

The content of the hash table is:
(gdb) x/40 info->array.buffer
0xc9ba30:       0xffffffff      0x00000000      0x123d7098      0x00002aac
0xc9ba40:       0xffffffff      0x00000000      0x0d905008      0x00002aac
0xc9ba50:       0x00000001      0x00000000      0x11c20058      0x00002aac
0xc9ba60:       0xffffffff      0x00000000      0x171a3e78      0x00002aac
0xc9ba70:       0xffffffff      0x00000000      0x11e8cab8      0x00002aac
0xc9ba80:       0x00000000      0x00000000      0x16b72718      0x00002aac
0xc9ba90:       0xffffffff      0x00000000      0x11c33238      0x00002aac

We have checked each item. It seems that most items are in their right places. However, item 6 (16412069781914165620) should be  in bucket 3. 

The xid to be inserted is:
$35 = {xid = {formatID = 39187, gtrid_length = 19, bqual_length = 1, data = "16212069782219078521", '\0' <repeats 107 times>}, 
  xa_state = XA_ACTIVE, in_thd = true}
[10 May 2007 10:56] Valeriy Kravchuk
Bug #27871 was marked as a duplicate of this one.
[11 May 2007 8:25] Tianlei HU
I am a colleague of Yuan WANG, and we are debugging mysqld these days to find the problem out. I will give more details about our findings.

The main point we found is that the hash key, i.e. the 'data' member in the XID struct, of the item with xid {formatID = 1, gtrid_length = 24, bqual_length = 0, data = "MySQLXid\001\000\000\000\000\000\000\000I\024K", '\0' <repeats 108 times>} is modified after it is inserted into the xid_cache hash. The hash item is in a wrong hash bucket after the modify of the hash key, and it causes the damage of the structure of the xid_cache hash table. When another hash key is to be inserted into the same bucket of this item, the core dump is generated since the 'movelink' function cannot find this item in the hash bucket link list. 

We checked the source code for the caller of xid_cache_insert, and found this two:
1. sql_parse.cc(4837), to create a xid when a XA START command arrived. The xid is specified by the user, not the auto generated xid with prefix "MySQLXid". 
2. handler.cc(1023), in function ha_recover(), it is called when crash recovery, and is unrelated to this problem.
These convince us that a xid with prefix 'MySQLXid' cannot be inserted into the xid_cache in normal way.

Now, the problem is how a xid inserted into the xid_cache is modified to the 'MySQLXid' prefixed id.
[11 May 2007 8:41] Tianlei HU
The 'MySQLXid' prefix is defined at handler.h:221 as MYSQL_XID_PREFIX, and this macro is used at handler.h:253, in function 'void set(ulonglong xid)' of 'struct xid_t', i.e. 'XID' type.

We have checked all codes, and only finds one location calling xid.set(ulonglong): handler.cc:611 in function 'trans_register_ha'. When the xid.is_null() == TRUE, the function will modify the xid with a prefix 'MySQLXid'.

Since 'trans_register_ha()' is called before every transaction and statement, if there is a stmt/trans which null() the thd.transaction.xid_state.xid, and do not delete it from the xid_cache, the execution of the next stmt/trans in the same thread will modify the xid and lead to the damage of the xid_cache hash table.

Up to now, we do not find such source code path which null the xid, and then do not delete it from the xid_cache, so we cannot work this problem out.
[11 May 2007 20:11] Sergei Golubchik
Your analysys above in absolutely correct.

Here's an idea that may help you to find when the hash is corrupted: in mysys/hash.c there's hash_check() function which presumably checks the hash for consistency. It's only enabled in DBUG builds though. You can insert calls to it in other hash functions, e.g. hash_insert and hash_delete. Or directly in the xid handling code - xid_cache_* functions  (to only call hash_check() for xid_cache). Hopefully it will help to discover the hash corruption earlier, before movelink hits the corrupted bucket.
[22 May 2007 11:21] Tianlei HU
We finally found the sql series to null() the xid, which causes the damage of xid cache. There are two ways, one is using "SET AUTOCOMMIT = TRUE", and the other one is a deadlock, to cause the core dump.

Scenario 1 (Auto Commit):
   Thread 1:
     mysql> drop table test.test; 
     mysql> create table test.test (i int primary key) engine = innodb;
     mysql> set autocommit = false;
     mysql> xa start 0x01,0x01;  
     mysql> select count(1) from test.test;         
     mysql> set autocommit = true;         
     mysql> select count(1) from test.test;

    After the execution, the hash key of the current xid is altered, since the set_option_autocommit() when set autocommit to true will call ha_commit(thd) to commit the whole transaction. Xid of current thread(thd->transaction.xid_state.xid) is null()ed by the ha_commit() function, and the next SQL will set the thd->transaction.xid_state.xid in trans_register_ha(), which damage the xid_cache. 
    
Scenario 2 (Deadlock):
   Thread 1:
     mysql> drop table test.test; 
     mysql> create table test.test (i int primary key) engine = innodb;
     mysql> insert into test.test values(1);
     mysql> insert into test.test values(2);
     mysql> xa start 0x01,0x01;  
     mysql> update test.test set i=3 where i=1;
   Thread 2:
     mysql> xa start 0x02,0x01;
     mysql> update test.test set i=4 where i=2;
   Thread 1:
     mysql> update test.test set i=6 where i=2;
     ... wait for lock on test.test.i=2
   Thread 2:
     mysql> update test.test set i=5 where i=1;
     ... deadlock detected
     mysql> select count(1) from test.test;

    When deadlock detected, the transaction is implicitely rollbacked in convert_error_code_to_mysql() in ha_innodb.cc, which null() the xid_state. However, if the client send another query to the mysqld, the xid_cache will by damaged by setting thd->transaction.xid_state.xid in trans_register_ha().
    The processing of "lock wait time out" and "lock table full" errors in convert_error_code_to_mysql() could also leads to the damagement of xid_cache.

    After these two scenarios happen, the xid_cache is in a errornous status as shown in previous comments; following modifications(either a insert or a delete) on the xid_cache will cause core dump.
[11 Jun 2007 2:22] Yuan WANG
We have modified our application so that it will not issue any operations other than commit/rollback after deadlocks or lock wait timeouts. Our system has ran smoothly for a week now after that. This should confirm that issuing normal SQL operations after deadlocks or lock wait timeouts is the only reason of the crash. So I think the fix should be easy.
[25 Jun 2007 12:58] Valeriy Kravchuk
I was not able to repeat your scenarios with latest 5.0.46-debug on Linux. No crashes. Please, check newer versions, 5.0.41/5.0.42.
[26 Jun 2007 12:52] Yuan WANG
The deadlock test case will not crash the server in todays BitKeeper codes. However, the real problem is the misuse of the xid cache hashtable. This misuse may or may not lead to crash. The following scenario show that the xid cache is still wrong after deadlock.

mysql> create table t(a int, b int, c varchar(20)) engine = innodb;
mysql> insert into t values(1, 1, 'a');
mysql> insert into t values(2, 2, 'b');
Thread 1:
mysql> xa start 'a','b';
mysql> update t set c = 'aa' where a = 1;
Thread 2:
mysql> xa start 'a','c';
mysql> update t set c = 'bb' where a = 2;
Thread 1:
mysql> update t set c = 'bb' where a = 2;
Thread 2:
mysql> mysql> update t set c = 'aa' where a = 1;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> select count(*) from t;
mysql> xa end 'a','c';
ERROR 1397 (XAE04): XAER_NOTA: Unknown XID
mysql> xa start 'a','c';
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the  ACTIVE state

After deadlock and another SQL, we can not find the XID in the xid cache hashtable, however the current thread is still in ACTIVE state. This means the content of xid cache hashtable is invalid. Through the trivial test above don't crash the server, I think more deadlock errors will crash it eventually.
[1 Feb 2008 18:44] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.51a, and inform about the results.
[2 Feb 2008 13:06] Yuan WANG
Are there any develop effort put into this bug? I tried today's trunk (5.0.56-debug-log), it seems that the problem remains the same. You can reproduce with the following operations.

mysql> create table t(a int, b int, c varchar(20), primary key(a)) engine = innodb;
mysql> insert into t values(1, 1, 'a');
mysql> insert into t values(2, 2, 'b');
Thread 1:
mysql> xa start 'a','b';
mysql> update t set c = 'aa' where a = 1;
Thread 2:
mysql> xa start 'a','c';
mysql> update t set c = 'bb' where a = 2;
Thread 1:
mysql> update t set c = 'bb' where a = 2;
Thread 2:
mysql> update t set c = 'aa' where a = 1;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> select count(*) from t;
mysql> xa end 'a','c';
ERROR 1397 (XAE04): XAER_NOTA: Unknown XID
mysql> xa start 'a','c';
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is
in the  ACTIVE state

You can see that the content of Xid cache has been damaged. 

BTW: I think it should be quite easy to fix this problem, just to prevent from executing anything other than XA operations after deadlocks. And changing the value of autocommit to true should also be disabled when the connection is in a XA transaction. I just don't quite understand why this serious problem (and should be easy to fix) remains unfixed for more than half a year.
[13 Feb 2008 14:08] Heikki Tuuri
This is probably a bug in MySQL code, not InnoDB.
[13 Feb 2008 15:43] Susanne Ebrecht
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

mysql> select version()\G
*************************** 1. row ***************************
version(): 5.0.58-debug

I made the tests as described with the following result:

mysql> update t set c='bb' where a=2;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> update t set c='aa' where a=1;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

mysql> xa end 'a','c';
Query OK, 0 rows affected (0.00 sec)

mysql> xa start 'a','c';
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the  IDLE state

Because the transaction is in IDLE and not in ACTIVE state, I would say, the reported bug is fixed.
[20 Jul 2008 2:11] Yuan WANG
Hi, Susanne Ebrecht:

I think this bug hasn't been fixed. In your test, you forgot to execute "select count(*) from t;
" after the deadlock. After deadlock, you should execute another SQL to trigger this bug. This is vital, if you don't execute any SQL other than 'xa end' after deadlock, everything will be OK.

So the bug remains and should be fixed. I think a simple workaround is to disable any SQL other than 'xa end' after deadlock in XA transactions.
[20 Jul 2008 22:54] MySQL Verification Team
Bug: http://bugs.mysql.com/bug.php?id=38245 has been marked as duplicate of this one.
[20 Jul 2008 23:53] MySQL Verification Team
Exiting the thread 2 client and re-login again and try the xa start the server
crash as below:

c:\dbs>c:\dbs\5.1\bin\mysql -uroot --port=3351 --prompt="mysql 5.1 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.28-nt-debug-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql 5.1 >use test
Database changed
mysql 5.1 >xa start 'a','c';
Query OK, 0 rows affected (0.00 sec)

mysql 5.1 > update t set c = 'bb' where a = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql 5.1 >update t set c = 'aa' where a = 1;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql 5.1 >select count(*) from t;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql 5.1 >xa end 'a','c';
ERROR 1397 (XAE04): XAER_NOTA: Unknown XID
mysql 5.1 >xa start 'a','c';
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the  ACTIVE state
mysql 5.1 >exit
Bye

c:\dbs>51c

c:\dbs>c:\dbs\5.1\bin\mysql -uroot --port=3351 --prompt="mysql 5.1 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.1.28-nt-debug-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql 5.1 >use test
Database changed
mysql 5.1 >xa start 'a','c';
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql 5.1 >

c:\dbs>c:\dbs\5.1\bin\mysqld --defaults-file=c:\dbs\5.1\my.ini --standalone --console
080720 20:32:48  InnoDB: Started; log sequence number 0 168417862
080720 20:32:48 [Note] Event Scheduler: Loaded 0 events
080720 20:32:48 [Note] c:\dbs\5.1\bin\mysqld: ready for connections.
Version: '5.1.28-nt-debug-log'  socket: ''  port: 3351  Source distribution
Assertion failed: hash_search(&xid_cache, xid_state->xid.key(), xid_state->xid.key_length())==0, file .\sql_class.cc, line 3073
080720 20:41:55 - mysqld got exception 0x80000003 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388572
read_buffer_size=131072
max_used_connections=2
max_threads=151
threads_connected=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338102 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x41d8158
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
00000001401CAB35    mysqld.exe!my_sigabrt_handler()[mysqld.cc:1984]
0000000140767481    mysqld.exe!raise()[winsig.c:597]
00000001407908B3    mysqld.exe!__crtMessageWindowA()[dbgrpt.c:434]
00000001407683EC    mysqld.exe!_VCrtDbgReportA()[dbgrptt.c:420]
00000001407902F3    mysqld.exe!_CrtDbgReportV()[dbgrpt.c:301]
000000014079028D    mysqld.exe!_CrtDbgReport()[dbgrpt.c:317]
000000014078127B    mysqld.exe!_NMSG_WRITE()[crt0msg.c:197]
000000014075AAA7    mysqld.exe!abort()[abort.c:68]
00000001407531C4    mysqld.exe!_wassert()[assert.c:212]
00000001402D3150    mysqld.exe!xid_cache_insert()[sql_class.cc:3073]
00000001403205C7    mysqld.exe!mysql_execute_command()[sql_parse.cc:4447]
0000000140324F7D    mysqld.exe!mysql_parse()[sql_parse.cc:5654]
000000014031682A    mysqld.exe!dispatch_command()[sql_parse.cc:1139]
0000000140315C1A    mysqld.exe!do_command()[sql_parse.cc:794]
000000014046F3F6    mysqld.exe!handle_one_connection()[sql_connect.cc:1115]
000000014056F395    mysqld.exe!pthread_start()[my_winthread.c:85]
0000000140769C15    mysqld.exe!_callthreadstart()[thread.c:295]
0000000140769BE7    mysqld.exe!_threadstart()[thread.c:277]
000000007755CDCD    kernel32.dll!BaseThreadInitThunk()
00000000776AC6E1    ntdll.dll!RtlUserThreadStart()
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0000000004243388=xa start 'a','c'
thd->thread_id=4
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[23 Jul 2008 2:34] Yuan WANG
It's strange that this bug doesn't show in the "official" 5.1 bug list
http://bugs.mysql.com/search.php?search_for=&status=Active&severity=&limit=All&order_by=pr...

Maybe the version should be changed from "5.0.38/5.1" to "5.0.38, 5.1"?
[29 Jul 2008 13:01] Yuan WANG
I think the search system of Bugzilla is broken. I can't find this bug with the following search condition:
  Search for: xid
  Status: open
  Version: 5.1

But with the following search condition, I can find it:
  Search for: xid
  Status: open
  Version: 5.0

Or I can find it with this:
  Search for: xid
  Status: open
  Version: 5.0.38, 5.1

With this broken search function of Bugzilla, there could be a lot of bugs of 5.1 will not show in the some called "official" 5.1 bugs list, and could possibly be ignored.
[30 Jul 2008 17:06] MySQL Verification Team
Thank you for the bug report.

Server version: 5.0.68-nt-debug-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql 5.0 > use test
Database changed
mysql 5.0 > xa start 'a','c';
Query OK, 0 rows affected (0.00 sec)

mysql 5.0 > update t set c = 'bb' where a = 2;
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql 5.0 > update t set c = 'aa' where a = 1;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql 5.0 > select count(*) from t;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.05 sec)

mysql 5.0 > xa end 'a','c';
ERROR 1397 (XAE04): XAER_NOTA: Unknown XID
mysql 5.0 >  xa start 'a','c';
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the  ACTIVE state
mysql 5.0 >

c:\dbs>c:\dbs\5.0\bin\mysqld --defaults-file=c:\dbs\5.0\my.ini --standalone --console
080730 13:56:52  InnoDB: Started; log sequence number 0 81855
080730 13:56:52 [Note] c:\dbs\5.0\bin\mysqld: ready for connections.
Version: '5.0.68-nt-debug-log'  socket: ''  port: 3500  Source distribution
Assertion failed: hash_search(&xid_cache, xid_state->xid.key(), xid_state->xid.key_length())==0, file .\sql_class.cc, line 2357

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
[15 Aug 2008 10:03] MySQL Verification Team
i got this valgrind warnings on 5.0.66a:

 Thread 23:
 Invalid read of size 4
 0x83B6F3A: my_hash_insert (hash.c:270)
 0x8184173: xid_cache_insert(st_xid_state*) (sql_class.cc:2358)
 0x81A7510: mysql_execute_command(THD*) (sql_parse.cc:5054)
 0x81A8DA6: mysql_parse(THD*, char const*, unsigned, char const**) (sql_parse.cc:6178)
 0x81A0C89: dispatch_command(enum_server_command, THD*, char*, unsigned) (sql_parse.cc:1875)
 0x81AC3C3: do_command(THD*) (sql_parse.cc:1581)
 0x819FEC2: handle_one_connection (sql_parse.cc:1187)
 0x4893DA: start_thread (in /lib/libpthread-2.5.so)
 0x3D606D: clone (in /lib/libc-2.5.so)
  Address 0x4083490 is 8 bytes before a block of size 800 alloc'd
 0x4005400: malloc (vg_replace_malloc.c:149)
 0x83B1D31: my_malloc (my_malloc.c:34)
 0x83B7893: init_dynamic_array (array.c:59)
 0x83B7682: _hash_init (hash.c:58)
 0x8184009: xid_cache_init() (sql_class.cc:2310)
 0x818F3AA: init_server_components() (mysqld.cc:3413)
 0x818D229: main (mysqld.cc:3752)
[15 Aug 2008 10:08] MySQL Verification Team
bug #36642 is probably a duplicate of this
[15 Aug 2008 10:11] MySQL Verification Team
More valgrind errors from 5.0.66a:

1 errors in context 1 of 2:
Invalid write of size 4
 0x83B7284: hash_delete (hash.c:272)
 0x817FDC1: THD::cleanup() (sql_class.cc:2367)
 0x818C8A9: end_thread(THD*, bool) (mysqld.cc:1709)
 0x819FD94: handle_one_connection (sql_parse.cc:1216)
 0x4893DA: start_thread (in /lib/libpthread-2.5.so)
 0x3D606D: clone (in /lib/libc-2.5.so)
 Address 0x4083490 is 8 bytes before a block of size 800 alloc'd
 0x4005400: malloc (vg_replace_malloc.c:149)
 0x83B1D31: my_malloc (my_malloc.c:34)
 0x83B7893: init_dynamic_array (array.c:59)
 0x83B7682: _hash_init (hash.c:58)
 0x8184009: xid_cache_init() (sql_class.cc:2310)
 0x818F3AA: init_server_components() (mysqld.cc:3413)
 0x818D229: main (mysqld.cc:3752)

1 errors in context 2 of 2:
Invalid read of size 4
 0x83B727B: hash_delete (hash.c:270)
 0x817FDC1: THD::cleanup() (sql_class.cc:2367)
 0x818C8A9: end_thread(THD*, bool) (mysqld.cc:1709)
 0x819FD94: handle_one_connection (sql_parse.cc:1216)
 0x4893DA: start_thread (in /lib/libpthread-2.5.so)
 0x3D606D: clone (in /lib/libc-2.5.so)
 Address 0x4083490 is 8 bytes before a block of size 800 alloc'd
 0x4005400: malloc (vg_replace_malloc.c:149)
 0x83B1D31: my_malloc (my_malloc.c:34)
 0x83B7893: init_dynamic_array (array.c:59)
 0x83B7682: _hash_init (hash.c:58)
 0x8184009: xid_cache_init() (sql_class.cc:2310)
 0x818F3AA: init_server_components() (mysqld.cc:3413)
 0x818D229: main (mysqld.cc:3752)
[2 Sep 2008 16:46] Philip Stoev
Here is another stack trace:

#0  movelink (array=0x9e1a9f0, find=2, next_link=1013866853, newlink=4) at hash.c:272
#1  0x08664b37 in my_hash_insert (info=0x886b540, record=0xa3902c8 "╦,Д\t") at hash.c:437
#2  0x083d1e93 in mdl_acquire_shared_lock (context=0xa38df2c, lock_data=0x9e42cb8, retry=0xa80e4c8b) at mdl.cc:754
#3  0x082783ec in open_table_get_mdl_lock (thd=0xa38deb8, table_list=0xa80e53f8, mdl_lock_data=0x9e42cb8, flags=0, action=0xa80e5174) at sql_base.cc:2272
#4  0x0827956c in open_table (thd=0xa38deb8, table_list=0xa80e53f8, mem_root=0xa38f8b0, action=0xa80e5174, flags=0) at sql_base.cc:2530
#5  0x0827cf3d in open_ltable (thd=0xa38deb8, table_list=0xa80e53f8, lock_type=TL_WRITE_CONCURRENT_INSERT, lock_flags=27) at sql_base.cc:4012
#6  0x0827d1a4 in open_performance_schema_table (thd=0xa38deb8, one_table=0xa80e53f8, backup=0xa80e55f8) at sql_base.cc:7905
#7  0x082d5328 in Log_to_csv_event_handler::log_general (this=0x9e44948, thd=0xa38deb8, event_time=1220373711,
    user_host=0xa80e56bc "root[root] @ localhost []", user_host_len=25, thread_id=10, command_type=0x86f3899 "Connect", command_type_len=7,
    sql_text=0xa80e58e8 "root@localhost on test", sql_text_len=22, client_cs=0x885b2e0) at log.cc:391
#8  0x082d0a29 in LOGGER::general_log_write (this=0x8865300, thd=0xa38deb8, command=COM_CONNECT, query=0xa80e58e8 "root@localhost on test", query_length=22)
    at log.cc:1046
#9  0x082d0b2e in LOGGER::general_log_print (this=0x8865300, thd=0xa38deb8, command=COM_CONNECT, format=0x86f2e08 "%s@%s on %s",
    args=0xa80e5d2c "8L<\nR\200n\bБa\016╗\207W{\bЪЪЪ\037Д-o\b\\]\016╗\020") at log.cc:1065
#10 0x082d0b81 in general_log_print (thd=0xa38deb8, command=COM_CONNECT, format=0x86f2e08 "%s@%s on %s") at log.cc:4132
#11 0x0823403c in check_user (thd=0xa38deb8, command=COM_CONNECT, passwd=0xa40a276 "test", passwd_len=0, db=0x3c6e6165 <Address 0x3c6e6165 out of bounds>,
    check_count=true) at sql_connect.cc:439
#12 0x08234cd8 in check_connection (thd=0xa38deb8) at sql_connect.cc:925
#13 0x08234d8d in login_connection (thd=0xa38deb8) at sql_connect.cc:985
#14 0x0823509c in handle_one_connection (arg=0xa38deb8) at sql_connect.cc:1144
#15 0x0057d32f in start_thread () from /lib/libpthread.so.0
#16 0x0049a27e in clone () from /lib/libc.so.6

267       HASH_LINK *old_link;
268       do
269       {
270         old_link=array+next_link;
271       }
272       while ((next_link=old_link->next) != find); <<<<<<<<< HERE
273       old_link->next= newlink;
274       return;
275     }
276

(gdb) print old_link->next
Cannot access memory at address 0xed54b518
[14 Oct 2008 22:10] Davi Arnaut
--source include/have_innodb.inc

create table t(a int, b int, c varchar(20), primary key(a)) engine = innodb;
insert into t values(1, 1, 'a');
insert into t values(2, 2, 'b');

connect (con1,localhost,root,,);
connect (con2,localhost,root,,);

--connection con1
xa start 'a','b';
update t set c = 'aa' where a = 1;
--connection con2
xa start 'a','c';
update t set c = 'bb' where a = 2;
--connection con1
--send update t set c = 'bb' where a = 2
--connection con2
--sleep 1
--error 1213
update t set c = 'aa' where a = 1;
select count(*) from t;
--error 1397
xa end 'a','c';
--error 1399
xa start 'a','c';
--disconnect con2

connect (con3,localhost,root,,);
--connection con3
xa start 'a','c';
[16 Oct 2008 14:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56391

2707 Davi Arnaut	2008-10-16
      Bug#28323: Server crashed in xid cache operations
      
      The problem was that the server did not robustly handle a
      unilateral roll back issued by the Resource Manager (RM)
      due to a resource deadlock within the transaction branch.
      By not acknowledging the roll back, the server (TM) would
      eventually corrupt the XA transaction state and crash.
      
      The solution is to mark the transaction as rollback-only
      if the RM indicates that it rolled back its branch of the
      transaction.
[20 Oct 2008 18:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56615

2708 Davi Arnaut	2008-10-20
      Bug#28323: Server crashed in xid cache operations
      
      The problem was that the server did not robustly handle a
      unilateral roll back issued by the Resource Manager (RM)
      due to a resource deadlock within the transaction branch.
      By not acknowledging the roll back, the server (TM) would
      eventually corrupt the XA transaction state and crash.
      
      The solution is to mark the transaction as rollback-only
      if the RM indicates that it rolled back its branch of the
      transaction.
[21 Oct 2008 17:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56729

2708 Davi Arnaut	2008-10-21
      Bug#28323: Server crashed in xid cache operations
      
      The problem was that the server did not robustly handle a
      unilateral roll back issued by the Resource Manager (RM)
      due to a resource deadlock within the transaction branch.
      By not acknowledging the roll back, the server (TM) would
      eventually corrupt the XA transaction state and crash.
      
      The solution is to mark the transaction as rollback-only
      if the RM indicates that it rolled back its branch of the
      transaction.
[21 Oct 2008 17:48] Davi Arnaut
Queued to 5.0-bugteam
[21 Oct 2008 18:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56734

2704 Davi Arnaut	2008-10-21
      Bug#28323: Server crashed in xid cache operations
      
      The problem was that the server did not robustly handle a
      unilateral roll back issued by the Resource Manager (RM)
      due to a resource deadlock within the transaction branch.
      By not acknowledging the roll back, the server (TM) would
      eventually corrupt the XA transaction state and crash.
      
      The solution is to mark the transaction as rollback-only
      if the RM indicates that it rolled back its branch of the
      transaction.
[21 Oct 2008 20:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56752

2771 Davi Arnaut	2008-10-21 [merge]
      Merge Bug#28323 to mysql-5.1.29-rc
[22 Oct 2008 16:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56837

2877 Davi Arnaut	2008-10-22 [merge]
      Bug#28323: Server crashed in xid cache operations
      
      The problem was that the server did not robustly handle a
      unilateral roll back issued by the Resource Manager (RM)
      due to a resource deadlock within the transaction branch.
      By not acknowledging the roll back, the server (TM) would
      eventually corrupt the XA transaction state and crash.
      
      The solution is to mark the transaction as rollback-only
      if the RM indicates that it rolled back its branch of the
[24 Oct 2008 8:38] Bugs System
Pushed into 5.0.72  (revid:davi.arnaut@sun.com-20081021180731-fh8cdioim22rd2ju) (version source revid:davi.arnaut@sun.com-20081021180731-fh8cdioim22rd2ju) (pib:5)
[24 Oct 2008 8:42] Bugs System
Pushed into 5.0.72  (revid:davi.arnaut@sun.com-20081021174543-2d4vz1e2rx4z4pcy) (version source revid:vvaintroub@mysql.com-20081023200547-fg490fairlbnae8f) (pib:5)
[26 Oct 2008 0:50] Paul DuBois
Noted in 5.0.72 changelog.

XA transaction rollbacks could result in corrupted transaction states
and a server crash.

Setting report to NDI pending push into 5.1.x, 6.0.x.
[30 Oct 2008 16:11] MySQL Verification Team
bug #40437 was marked as a duplicate of this
[4 Nov 2008 9:20] Bugs System
Pushed into 5.1.30  (revid:davi.arnaut@sun.com-20081021202801-x4wpbu07ghpf2yhz) (version source revid:mats@sun.com-20081103175049-ulv24ke1m2lpi632) (pib:5)
[4 Nov 2008 16:32] Paul DuBois
Noted in 5.1.30 changelog.

Setting report to NDI pending push into 6.0.x.
[10 Nov 2008 10:54] Bugs System
Pushed into 6.0.8-alpha  (revid:davi.arnaut@sun.com-20081021174543-2d4vz1e2rx4z4pcy) (version source revid:davi.arnaut@sun.com-20081024153711-yhcswhfwz9z5b07g) (pib:5)
[10 Nov 2008 11:37] Bugs System
Pushed into 5.1.30  (revid:davi.arnaut@sun.com-20081021174543-2d4vz1e2rx4z4pcy) (version source revid:davi.arnaut@sun.com-20081024125059-98c0zv1xym4oruam) (pib:5)
[10 Nov 2008 17:57] Paul DuBois
Noted in 6.0.8 changelog.
[11 Nov 2008 16:36] Paul DuBois
6.0.9 changelog, not 6.0.8.
[21 Nov 2008 21:02] Bugs System
Pushed into 5.1.30-ndb-6.2.17  (revid:davi.arnaut@sun.com-20081021202801-x4wpbu07ghpf2yhz) (version source revid:tomas.ulin@sun.com-20081121161314-jhz3twf5xjt8stnh) (pib:5)
[21 Nov 2008 23:45] Bugs System
Pushed into 5.1.30-ndb-6.3.20  (revid:davi.arnaut@sun.com-20081021202801-x4wpbu07ghpf2yhz) (version source revid:tomas.ulin@sun.com-20081121210644-zas6e9wa5kuj7d6f) (pib:5)
[27 Nov 2008 14:52] Bugs System
Pushed into 5.1.30-ndb-6.4.0  (revid:davi.arnaut@sun.com-20081021202801-x4wpbu07ghpf2yhz) (version source revid:tomas.ulin@sun.com-20081126125835-5sohkzk2jjwpq1wp) (pib:5)
[9 Dec 2008 13:25] Karl Jorgensen
We have the same bug on 5.1.26 (32bit)
[9 Dec 2008 14:16] MySQL Verification Team
This bug was fixed from version 5.1.30 so upgrade 5.1.26 version.
[19 Jan 2009 11:33] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090108105244-8opp3i85jw0uj5ib) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:09] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 16:15] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)