Bug #45672 Semisync repl: ActiveTranx:insert_tranx_node: transaction node allocation failed
Submitted: 23 Jun 2009 11:01 Modified: 12 Nov 2009 12:20
Reporter: Philip Stoev
Status: Closed
Category: MySQL Server: Replication Severity: S2 (Serious)
Version: 5.4 OS: Any
Assigned to: Zhenxing He
Triage: Triaged: D2 (Serious)

[23 Jun 2009 11:01] Philip Stoev
Description:
When executing queries against a master that has the semisync master module installed, queries return

ERROR 1180 (HY000): Got error 1 during COMMIT

and the error log says:

090623 13:53:46 [ERROR] ActiveTranx:insert_tranx_node: transaction node allocation failed for: (master-bin.000001, 347083)
090623 13:53:46 [ERROR] Error writing file '/build/bzr/azalea/mysql-test/var/log/master-bin' (errno: 0)

The problem is that the query is actually COMMITTED on the master, even though an error message is returned. This is a major no-no.

How to repeat:
1. Run

MTR_VERSION=1 perl mysql-test-run.pl \
--start-and-exit \
--mysqld=--plugin-dir=/build/bzr/azalea/plugin/semisync/.libs \
--mysqld=--plugin-load=rpl_semi_sync_master=libsemisync_master.so \
--mysqld=--rpl_semi_sync_master_enabled=1 \
rpl_alter

This starts a master and a slave, both loaded with the semisync master module only. I do not think the slave matters in this case, since it is not connected to the master via CHANGE MASTER.

2. Execute:

gentest-old.pl

gentest-old.pl is in the mysql-test/gentest directory of the mysql-test-extra-6.0 tree. I suspect that any other sufficiently large set of queries will also cause the problem to show up.

Suggested fix:
Apart from whatever the original cause of this bug is, transactions should never both return an error *and* commit.
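
The invariant asked for here can be sketched as follows. This is only an illustration of "an error return must mean the transaction was not committed", not the server's actual commit path; TxnModel, commit_or_rollback, flush_log and after_flush_hook are hypothetical names invented for this sketch.

```cpp
#include <functional>

// Hypothetical stand-in for a transaction; not a MySQL server type.
struct TxnModel {
  bool committed = false;
  bool rolled_back = false;
};

// Returns 0 on success and 1 on failure. The transaction is marked
// committed only if every pre-commit step succeeded, so a nonzero
// return always means "not committed".
int commit_or_rollback(TxnModel& txn,
                       const std::function<bool()>& flush_log,
                       const std::function<bool()>& after_flush_hook) {
  // Short-circuit: the hook is not run if the flush already failed.
  if (!flush_log() || !after_flush_hook()) {
    txn.rolled_back = true;
    return 1;
  }
  txn.committed = true;
  return 0;
}
```

With this shape, a failing flush or hook can never leave the caller with both an error code and a committed transaction.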
[28 Jun 2009 12:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/77401

2804 He Zhenxing	2009-06-28
      BUG#45672 Semisync repl: ActiveTranx:insert_tranx_node: transaction node allocation failed
      BUG#45673 Semisynch reports correct operation even if no slave is connected
      
      When semi-sync was enabled on the master without any semi-sync slaves
      connected, the master would still consider semi-sync status to be ON and
      keep inserting tranx nodes, which eventually resulted in a tranx_node
      allocation error.
      
      This is fixed by not considering semi-sync master status to be ON if
      no semi-sync slaves are connected.
     @ plugin/semisync/semisync_master.cc
        do not consider semi-sync master status as ON if no semi-sync slaves connected
     @ plugin/semisync/semisync_slave_plugin.cc
        run slave in async mode if master disabled semi-sync
     @ sql/log.cc
        set error to 1 when flush binlog or run after_flush hooks fails
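
The idea described in this commit message can be illustrated with a small model. This is a sketch under stated assumptions, not the code in plugin/semisync/semisync_master.cc: SemiSyncMasterModel and all of its members are hypothetical names, and the real plugin tracks far more state.

```cpp
#include <cstddef>

// Hypothetical model of the master-side state; not the plugin's real class.
class SemiSyncMasterModel {
 public:
  void set_enabled(bool enabled) { enabled_ = enabled; }
  void slave_connected() { ++semi_sync_slaves_; }
  void slave_disconnected() {
    if (semi_sync_slaves_ > 0) --semi_sync_slaves_;
  }

  // Semi-sync counts as ON only when it is enabled AND at least one
  // semi-sync slave is connected to acknowledge transactions.
  bool is_on() const { return enabled_ && semi_sync_slaves_ > 0; }

  // Track a transaction node only while semi-sync is ON. Otherwise the
  // commit proceeds asynchronously and no nodes accumulate.
  bool track_transaction() {
    if (!is_on()) return false;
    ++tracked_nodes_;
    return true;
  }

  std::size_t tracked_nodes() const { return tracked_nodes_; }

 private:
  bool enabled_ = false;
  std::size_t semi_sync_slaves_ = 0;
  std::size_t tracked_nodes_ = 0;
};
```

In this model, a master with semi-sync enabled but no connected slaves never accumulates transaction nodes, which is the condition that led to the allocation failure in the report.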
[28 Jun 2009 13:26] Philip Stoev
He Zhenxing, I am not sure if messages "please contact the developer" are appropriate. I think that ASSERT() is better in this situation.

Furthermore, can we fix the general issue where the COMMIT both returned an error and committed the transaction so that this never happens regardless of the underlying issue (or an ASSERT() is triggered)?
[28 Jun 2009 13:47] Zhenxing He
Hi Philip,

I use assert() as well, but since assert() has no effect in release builds, I also added the error message.

And the COMMIT issue is also fixed with this patch.
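
The pattern of pairing assert() with a runtime error message might look like the sketch below. The function name and the message text are hypothetical and are not taken from the patch; the only fact relied on is that the standard assert() macro expands to nothing when NDEBUG is defined.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper; the name and message are illustrative only.
// Returns 0 on success, 1 if the node could not be allocated.
int check_node_allocated(const void* node) {
  // Debug builds abort here. With NDEBUG defined (typical for release
  // builds) the assert expands to nothing and execution continues.
  assert(node != nullptr);
  if (node == nullptr) {
    // Release builds still report the failure and return an error.
    std::fprintf(stderr, "[ERROR] transaction node allocation failed\n");
    return 1;
  }
  return 0;
}
```

Debug builds stop at the assert, while release builds fall through to the logged error and the nonzero return.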
[2 Jul 2009 10:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/77754

2804 He Zhenxing	2009-07-02
      BUG#45672 Semisync repl: ActiveTranx:insert_tranx_node: transaction node allocation failed
      BUG#45673 Semisynch reports correct operation even if no slave is connected
      
[7 Jul 2009 2:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/78070

2824 He Zhenxing	2009-07-07
      BUG#45672 Semisync repl: ActiveTranx:insert_tranx_node: transaction node allocation failed
      BUG#45673 Semisynch reports correct operation even if no slave is connected
      
[23 Jul 2009 10:24] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090723102221-ps4uaphwbxzj8p0q) (version source revid:zhenxing.he@sun.com-20090707024417-efaow72lsf1f4rm8) (merge vers: 5.4.4-alpha) (pib:11)
[3 Aug 2009 18:42] Paul Dubois
No changelog entry needed. Does not appear in any released version.
[26 Sep 2009 4:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84703

3108 He Zhenxing	2009-09-26
      Backporting WL#4398 WL#1720
      Backporting BUG#44058 BUG#42244 BUG#45672 BUG#45673
      Backporting BUG#45819 BUG#45973 BUG#39012
[27 Oct 2009 9:49] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091027094604-9p7kplu1vd2cvcju) (version source revid:zhenxing.he@sun.com-20091026140226-uhnqejkyqx1aeilc) (merge vers: 6.0.14-alpha) (pib:13)
[27 Oct 2009 23:18] Paul Dubois
Noted in 6.0.14 changelog.

With semisynchronous replication enabled, the master considered
semisynchronous status to be on even with no slaves connected.
[27 Oct 2009 23:19] Paul Dubois
Setting report to NDI pending push into 5.5.x.
[12 Nov 2009 8:18] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091110093229-0bh5hix780cyeicl) (version source revid:alik@sun.com-20091027095744-rf45u3x3q5d1f5y0) (merge vers: 5.5.0-beta) (pib:13)
[12 Nov 2009 12:20] Jon Stephens
Already documented in the 5.5.0 changelog; re-closed.
[18 Dec 2009 15:48] Paul Dubois
Removed 5.5.0 changelog entry. In 5.5, semisync replication first appears in 5.5.0, so this bug affects no 5.5.x releases.