Bug #66411 | MySQL crashed in is_tranx_end_pos within an assertion for semisync plugin | ||
---|---|---|---|
Submitted: | 16 Aug 2012 1:59 | Modified: | 7 Feb 2014 4:47 |
Reporter: | liu hickey (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.5.28, 5.7.0 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | assertion failed, is_tranx_end_pos, semisync |
[16 Aug 2012 1:59]
liu hickey
[17 Aug 2012 17:17]
Sveta Smirnova
Thank you for the report. I cannot repeat the crash. Do you use the built-in semisync plugin? Which MySQL package (the name of the file you downloaded) do you use?
[18 Aug 2012 4:50]
liu hickey
Hi Sveta, maybe you forgot to add the DEBUG_SYNC(xxx) call in semisync_master.cc, which makes the rarely hit race condition happen every time with the provided test case. Please add the DEBUG_SYNC in plugin/semisync/semisync_master.cc mentioned before:

+ DEBUG_SYNC(current_thd, "rpl_semisync_master_commit_trx_before_lock");

And try it again. You cannot miss it :)

Here is my build command:

$ cat make.sh
CFLAGS="-O0 -g" CXX=gcc CXXFLAGS="-O0 -g -felide-constructors -fno-exceptions -fno-rtti" cmake -DWITH_INNOBASE_STORAGE_ENGINE=1 -DCMAKE_INSTALL_PREFIX=/u01/mysqld -DWITH_EXTRA_CHARSETS:STRING=all -DDEFAULT_CHARSET=gbk -DDEFAULT_COLLATION=gbk_chinese_ci -DWITH_DEBUG=1

The semisync plugin is installed dynamically, as described in the test suite:

--source include/have_semisync_plugin.inc
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

This race should exist in all versions, including the latest released version I tested (MySQL-5.5.27.tar.gz). If you need any extra info, please let me know.
[18 Aug 2012 13:05]
Sveta Smirnova
Thank you for the feedback. Verified as described.
[18 Dec 2012 12:22]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at http://dev.mysql.com/doc/en/installing-source.html
[18 Dec 2012 12:23]
Jon Stephens
Fixed in trunk. Documented as follows in the 5.7.1 changelog: When the server starts, it checks whether semisynchronous replication has been enabled without a lock, and, if so, it takes the lock, then tests again. Disabling semisynchronous replication following the first of these tests, but prior to the second one, could lead to a crash of the server. Closed.
[28 Nov 2013 9:49]
MySQL Verification Team
wasn't this fixed in 5.6?
[3 Feb 2014 13:38]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. Fixed in 5.6+. Documented in the 5.6.17 and 5.7.4 changelogs as follows: The server checks to determine whether semisynchronous replication has been enabled without a lock, and if this is the case, it takes the lock and checks again. If semisynchronous replication was disabled after the first but prior to the second one, this could cause the server to fail. Closed. If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at http://dev.mysql.com/doc/en/installing-source.html
[3 Feb 2014 13:41]
Jon Stephens
Disregard the reference to 5.7.4 in my previous comment. The fix for this bug is in 5.6.17 and 5.7.1. Thanks.
[7 Feb 2014 4:47]
liu hickey
Yes, I verified by code review that 5.7.1 fixes this issue. 5.6.17 has not been released yet.
[28 Mar 2014 19:27]
Laurynas Biveinis
5.6$ bzr log -r 5788 -n0
------------------------------------------------------------
revno: 5788
committer: Astha Pareek <astha.pareek@oracle.com>
branch nick: mysql-5.6-b17920923
timestamp: Mon 2014-02-03 12:07:06 +0530
message:
  BUG#17920923 - BACKPORT PATCH FOR BUG#14511533 INTO 5.6

  Problem: Server crash was observed when code first checked if semisync was enabled without lock, if so it takes the lock and checks again. If semisync gets disabled in-between the first and second check, an assert incorrectly referenced a null pointer for active transaction which leads to the crash.

  Solution: The assert is relocated onto a position where active_tranxs buffer is valid.
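
For readers following the commit message, here is a minimal sketch of the check / lock / re-check pattern with the assert placed after the second check. It is not the actual plugin code: SemiSyncMaster, ActiveTranx, commit_trx and the between_checks hook are hypothetical names invented for illustration, and the hook only stands in for the DEBUG_SYNC point mentioned earlier in this thread so that the race window can be hit deterministically.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for the active transaction buffer.
struct ActiveTranx {
  std::string end_file;
  unsigned long end_pos = 0;

  bool is_tranx_end_pos(const std::string &file, unsigned long pos) const {
    return file == end_file && pos == end_pos;
  }
};

// Hypothetical stand-in for the semisync master state (not the real class).
class SemiSyncMaster {
 public:
  void enable(const std::string &file, unsigned long pos) {
    std::lock_guard<std::mutex> guard(lock_);
    active_tranxs_.reset(new ActiveTranx());
    active_tranxs_->end_file = file;
    active_tranxs_->end_pos = pos;
    enabled_ = true;
  }

  void disable() {
    std::lock_guard<std::mutex> guard(lock_);
    enabled_ = false;
    active_tranxs_.reset();  // buffer is gone once semisync is off
  }

  // Returns true if the commit was handled under semisync, false if skipped.
  // between_checks is an illustrative hook that runs in the race window.
  bool commit_trx(const std::string &file, unsigned long pos,
                  const std::function<void()> &between_checks) {
    if (!enabled_) return false;  // first check, no lock held

    between_checks();  // another thread could disable semisync here

    std::lock_guard<std::mutex> guard(lock_);
    if (!enabled_) return false;  // second check, under the lock

    // Only here is the buffer known to be valid, so the assert goes here.
    assert(active_tranxs_ != nullptr &&
           active_tranxs_->is_tranx_end_pos(file, pos));
    return true;
  }

 private:
  std::mutex lock_;
  std::atomic<bool> enabled_{false};
  std::unique_ptr<ActiveTranx> active_tranxs_;
};
```

Calling commit_trx with a hook that invokes disable() reproduces the window described in the problem statement. In this sketch the call returns false at the second check instead of dereferencing a null buffer. If the assert were evaluated before the locked re-check, the same sequence would dereference a null pointer.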