Bug #46944 Internal prepared XA transaction XIDs are not removed if server_id changes
Submitted: 26 Aug 2009 19:16
Modified: 18 Dec 2009 13:12
Reporter: Harrison Fisk
Status: Closed
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.0
OS: Any
Assigned to: Kristofer Pettersson
CPU Architecture: Any
Tags: xa, xid

[26 Aug 2009 19:16] Harrison Fisk
Description:
When MySQL crashes (or a snapshot is taken, which simulates a crash), internal XA transactions (used to sync the binary log and InnoDB) can be left in a PREPARED state when they should have been rolled back. This happens when the server_id changes before the restart occurs.

This can leave rows locked in InnoDB which will persist across a restart.

This most commonly occurs when you take a snapshot to prepare another slave. It could also occur on a normal system after a crash if the server_id is changed for some reason.

See the following URL for more details:

http://harrison-fisk.blogspot.com/2009/01/xa-and-persistent-innodb-locks.html

You can then run XA RECOVER, see the internal XID listed, and roll it back manually.

How to repeat:
1.  Run a lot of transactions very quickly with the binary log enabled.
2.  Take a snapshot of the system.
3.  Change the server_id and restart.
4.  Notice the prepared transactions still present.

Suggested fix:
The internal XID is generated by combining the prefix MySQLXid + server_id + query_id.

During startup, ha_recover() is called, which loops through the prepared XIDs and uses xid_t::get_my_xid() to verify that they were created by internal MySQL processing. get_my_xid() checks both the prefix and the server_id to decide whether an XID belongs to this server.

So if a snapshot is taken and the server_id is changed on restart, the server does not recognize itself as the owner of the XID and leaves the transaction in the prepared state.

I think it should be enough to check only the special prefix and not the server_id. I believe the only drawback is that it would prevent people from using "MySQLXid" as a prefix in manually created XID values, which could be documented.
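
The difference between the current check and the proposed one can be sketched in a stand-alone model. The struct, the function names, and the field widths below (a 32-bit server_id and a 64-bit query_id) are assumptions chosen for illustration and are not the definitions in sql/handler.h. Only the "MySQLXid" prefix and the prefix + server_id + query_id layout come from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified model of an internal XID (not the server's xid_t).
constexpr char kPrefix[] = "MySQLXid";
constexpr std::size_t kPrefixLen = sizeof(kPrefix) - 1;  // 8 bytes, no NUL
constexpr std::size_t kOffset = kPrefixLen + sizeof(std::uint32_t);
constexpr std::size_t kGtridLen = kOffset + sizeof(std::uint64_t);

struct SketchXid {
  long gtrid_length = 0;
  long bqual_length = 0;
  char data[128] = {};
};

// Build an internal XID as prefix + server_id + query_id.
SketchXid make_internal_xid(std::uint32_t server_id, std::uint64_t query_id) {
  static_assert(kGtridLen <= sizeof(SketchXid::data), "XID must fit in data");
  SketchXid x;
  std::memcpy(x.data, kPrefix, kPrefixLen);
  std::memcpy(x.data + kPrefixLen, &server_id, sizeof(server_id));
  std::memcpy(x.data + kOffset, &query_id, sizeof(query_id));
  x.gtrid_length = static_cast<long>(kGtridLen);
  return x;
}

// Read the query_id stored after the prefix and server_id.
std::uint64_t read_query_id(const SketchXid& x) {
  std::uint64_t id = 0;
  std::memcpy(&id, x.data + kOffset, sizeof(id));
  return id;
}

// Check as described in the report: prefix AND server_id must match.
// Returns the query_id if the XID is considered internal, otherwise 0.
std::uint64_t get_my_xid_old(const SketchXid& x, std::uint32_t current_server_id) {
  const bool mine =
      x.gtrid_length == static_cast<long>(kGtridLen) && x.bqual_length == 0 &&
      std::memcmp(x.data + kPrefixLen, &current_server_id,
                  sizeof(current_server_id)) == 0 &&
      std::memcmp(x.data, kPrefix, kPrefixLen) == 0;
  return mine ? read_query_id(x) : 0;
}

// Proposed check: only the length and the prefix are compared.
std::uint64_t get_my_xid_new(const SketchXid& x) {
  const bool mine =
      x.gtrid_length == static_cast<long>(kGtridLen) && x.bqual_length == 0 &&
      std::memcmp(x.data, kPrefix, kPrefixLen) == 0;
  return mine ? read_query_id(x) : 0;
}
```

In this model, an XID built with make_internal_xid(1, 42) is rejected by get_my_xid_old(x, 2), which is the snapshot-and-restart case, while get_my_xid_new(x) still returns 42. A manually created XID with the same length and prefix would also be accepted by the new check, which is the drawback mentioned above.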

A diff to do this is:

=== modified file 'sql/handler.h'
--- sql/handler.h	2008-11-25 06:22:02 +0000
+++ sql/handler.h	2009-08-26 19:13:14 +0000
@@ -275,7 +275,6 @@
   my_xid get_my_xid()
   {
     return gtrid_length == MYSQL_XID_GTRID_LEN && bqual_length == 0 &&
-           !memcmp(data+MYSQL_XID_PREFIX_LEN, &server_id, sizeof(server_id)) &&
            !memcmp(data, MYSQL_XID_PREFIX, MYSQL_XID_PREFIX_LEN) ?
            quick_get_my_xid() : 0;
   }
[12 Oct 2009 12:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86571

3119 Kristofer Pettersson	2009-10-12
      Bug#46944 Internal prepared XA transaction XIDs are not
                removed if server_id changes
      
      When MySQL crashes (or a snapshot is taken which simulates
      a crash), then it is possible that internal XA
      transactions (used to sync the binary log and InnoDB)
      can be left in a PREPARED state, whereas they should be
      rolled back.  This happens when the server_id changes
      before the restart occurs.
      
      This patch relaxes the restriction that the server_id
      should be consistent if the XID is to be considered
      valid. The rollback phase should then be able to
      clean up all pending XA transactions.
[4 Nov 2009 9:25] Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091104092152-qz96bzlf2o1japwc) (version source revid:kristofer.pettersson@sun.com-20091019090224-sxcpk82z9akeppxh) (merge vers: 5.1.41) (pib:13)
[11 Nov 2009 6:50] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091110093407-rw5g8dys2baqkt67) (version source revid:alik@sun.com-20091109080109-7dxapd5y5pxlu08w) (merge vers: 6.0.14-alpha) (pib:13)
[11 Nov 2009 6:58] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091109115615-nuohp02h8mdrz8m2) (version source revid:alik@sun.com-20091105090203-cls5j6k3ohu04xpt) (merge vers: 5.5.0-beta) (pib:13)
[17 Nov 2009 16:41] Paul DuBois
Noted in 5.1.41, 5.5.0, 6.0.14 changelogs.

When MySQL crashed (or a snapshot was taken that simulated a crash),
it was possible that internal XA transactions (used to synchronize
the binary log and InnoDB) could be left in a PREPARED state, whereas
they should be rolled back. This occurred when the server_id value
changed before the restart, because that value was used to construct
XID values. 

The restriction that the server_id value be consistent for XID values
to be considered valid is now relaxed. The rollback phase should then
be able to clean up all pending XA transactions.
[18 Dec 2009 10:34] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:49] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:04] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:19] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[18 Dec 2009 13:12] MC Brown
Already documented in 5.1.41