Bug #26395 | if crash during autocommit update to transactional table on master, slave fails | ||
---|---|---|---|
Submitted: | 15 Feb 2007 8:37 | Modified: | 6 Feb 2008 21:58 |
Reporter: | Guilhem Bichot | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.0-bk,5.1-bk | OS: | Linux (linux) |
Assigned to: | Sven Sandberg | CPU Architecture: | Any |
[15 Feb 2007 8:37]
Guilhem Bichot
[23 May 2007 17:35]
Chuck Bell
I examined this bug using version 5.1.19. When running the test outlined with the planned crash, I encountered a different outcome. It seems the master does indeed complete the insert, but it is the slave that does not. When I examined the binlogs after the crash, the master's binlog had the insert and a commit statement in it, but the slave's binlog did not. Even when the slave was restarted, it did not run the insert. My suspicion is the planned crash is in the wrong place to duplicate the original bug. It is recommended that this be determined and corrected if necessary. I have run the test on both Windows and Linux and confirmed the behavior described above to be consistent across both platforms.
[25 May 2007 12:50]
Guilhem Bichot
Ok, there are two problems: 1) in 5.1, some pieces of the logic of half_binlogged_transaction were removed; until it is restored (I will talk with Mats), my testcase cannot work in 5.1. 2) I tested with 5.0 today and was able to repeat the problem only after commenting out a line of code. An autocommit statement is stored in the binlog as the statement (here, INSERT) and an Xid_log_event. The logic of half_binlogged_transaction is that it simulates a crash in the middle of writing the statement (the 100 last bytes of the statement, and the Xid_log_event, are intentionally not written to the binlog); so the slave cannot even read the INSERT (it's corrupted in the binlog). To test the present bug, the INSERT has to be readable by slave and so the 100 last bytes must not be chopped (only the Xid_log_event must be). ===== log.cc 1.210 vs edited ===== *** /tmp/bk_log.cc-1.210_Tr5jin 2007-04-27 15:33:46 +02:00 --- edited/log.cc 2007-05-25 14:35:37 +02:00 *************** *** 1868,1874 **** if (reinit_io_cache(cache, READ_CACHE, 0, 0, 0)) goto err; length=my_b_bytes_in_cache(cache); ! DBUG_EXECUTE_IF("half_binlogged_transaction", length-=100;); do { /* Write data to the binary log file */ --- 1868,1874 ---- if (reinit_io_cache(cache, READ_CACHE, 0, 0, 0)) goto err; length=my_b_bytes_in_cache(cache); ! // DBUG_EXECUTE_IF("half_binlogged_transaction", length-=100;); do { /* Write data to the binary log file */ After that patch, the logic of half_binlogged_transaction is that it will write the transaction to binlog except the commit event (which is Xid_log_event in my testcase); in detail, it does all that the normal code does, except that it skips over writing the commit event to binlog, and then it aborts the mysqld. This is to imitate what could happen in real life, where the mysqld would crash between writing INSERT to binlog and writing Xid_log_event to binlog. With that, the master crashes in INSERT (so this INSERT will rollback), and in the binlog there is INSERT but no Xid_log_event. Slave however picks INSERT, and so executes it. So slave and master differ.
[29 Aug 2007 16:27]
Chuck Bell
The following diff can be used to reproduce the problem in the mysql-5.1-new-rpl clone. ===== log.cc 1.294 vs edited ===== --- 1.294/sql/log.cc 2007-08-29 12:25:51 -04:00 +++ edited/log.cc 2007-08-29 12:22:26 -04:00 @@ -3987,8 +3987,12 @@ if ((write_error= write_cache(cache, false, false))) goto err; + DBUG_EXECUTE_IF("half_binlogged_transaction", goto DBUG_skip_commit;); if (commit_event && commit_event->write(&log_file)) goto err; +#ifndef DBUG_OFF +DBUG_skip_commit: +#endif if (flush_and_sync()) goto err; DBUG_EXECUTE_IF("half_binlogged_transaction", abort(););
[30 Aug 2007 19:36]
Chuck Bell
There are 2 patches. One for 5.0 that fixes the problem and one for 5.1 that corrects a set of test results and adds the debug code to detect the problem. See: 5.0 - http://lists.mysql.com/commits/33470 5.1 - http://lists.mysql.com/commits/33243
[25 Oct 2007 10:26]
Mats Kindahl
Setting to "in progress" since there is no new patch since my last comments.
[29 Nov 2007 13:56]
Lars Thalmann
REFINED PROBLEM DESCRIPTION (RPD) ================================= An INSERT statement executed on an XA storage engine in auto-commit mode is executed like this: 1. Engine prepare insert operation 2. INSERT statement is logged in binlog 3. XID is logged in binlog 4. Insert operation is committed in storage engine If there is master crash between 2 and 3, then the master will rollback the insert. But, the slave will receive INSERT in binary log and execute it. CLARIFICATION ABOUT SEMANTICS ============================= The semantics of the logged INSERT statement shall be that it is to be committed if only if there is an XID event. A logged INSERT statement does *not* mean that it has been executed on the master when XA is being used. If XA is *not* being used and autocommit is on, then the logged INSERT statement *does* mean that it has been executed on the master. VISION ====== A) In the future we want each of these event categories to be clearly expressed in the binary log: 1. Transaction start 2. Statement start (this includes any event that is used as "seed" for coming events) 3. Inside statement 4. Statement end 5. Transaction commit/autocommitted statement 6. Transaction rollback Probably need 2 start bits: 00. No start 01. Statement start 11. Transaction start Probably need 2 end bits: 00. No end 01. Transaction- or autocommit 10. Rollback 11. Statement end B) We probably need to stop to use transactional cache in the future. Reasons: 1) We have both transactional and non-transaction tables, and if a dependent non-transactional update happens in the middle of a (rolled-back) transaction, it is impossible to get it correctly logged if we only log this particular statement. So, we need to log the whole transaction and add a rollback at end, *if* there is a single dependent non-transactional table update in it. (Note that in row-based logging, a logged "statement" is never dependent on an easlier one.) 2) For long transactions, the transaction cache can be quite filled. We would probably like to have interleaving in the future, i.e. not have to wait until the whole transaction is executed before we start logging it. POSSIBLE SOLUTIONS ================== 1. Always use autocommit off on slave. Negatives: - Rollbacks will get wider scope on slave than master. 2. Add BEGIN whenever there is a XA table involved in an autocommitted statement (Chuck's patch). Negatives: - An extra event in the binlog. 3. Make sure INSERT is stored in transaction cache and do not flush transaction cache until the transaction is committed (step 3 in RPD above). Negatives: - Not in line with future vision (B). DECISION ======== Use solution 2 for now and add the bits for transaction/start end in later version (WL#4162). Sven to do ---------- 1. Write test case to show that the bug happens 2. Write test case to show that non-xa engines are not affected by the patch (e.g. NDB) 3. Review Chuck's patch and get other reviews
[13 Dec 2007 18:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/39927 ChangeSet@1.2651, 2007-12-13 19:56:17+01:00, sven@riska.(none) +21 -0 BUG#26395: if crash during autocommit update to transactional table on master, slave fails Now, every transaction (including autocommit transactions) start with a BEGIN and end with a COMMIT/ROLLBACK in the binlog. Added a test case, and updated lots of test case result files.
[14 Dec 2007 13:39]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/39990 ChangeSet@1.2654, 2007-12-14 14:40:45+01:00, sven@riska.(none) +21 -0 BUG#26395: if crash during autocommit update to transactional table on master, slave fails Now, every transaction (including autocommit transactions) starts with a BEGIN and ends with a COMMIT/ROLLBACK in the binlog. Added a test case, and updated lots of test case result files.
[18 Dec 2007 16:09]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/40141 ChangeSet@1.2558, 2007-12-18 17:08:28+01:00, sven@riska.(none) +11 -0 BUG#26395: if crash during autocommit update to transactional table on master, slave fails Now, every transaction (including autocommit transactions) start with a BEGIN and end with a COMMIT/ROLLBACK in the binlog. Added a test case, and updated lots of test case result files.
[21 Dec 2007 19:30]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/40354 ChangeSet@1.2560, 2007-12-21 20:30:23+01:00, sven@riska.(none) +10 -0 BUG#26395: if crash during autocommit update to transactional table on master, slave fails Now, every transaction (including autocommit transactions) start with a BEGIN and end with a COMMIT/ROLLBACK in the binlog. Added a test case, and updated lots of test case result files.
[5 Feb 2008 12:46]
Bugs System
Pushed into 5.0.56
[5 Feb 2008 13:04]
Bugs System
Pushed into 5.1.24-rc
[5 Feb 2008 13:08]
Bugs System
Pushed into 6.0.5-alpha
[6 Feb 2008 21:58]
Jon Stephens
Bugfix documented in the 5.0.56, 5.1.24, and 6.0.5 changelogs as follows: When the master crashed during an update on a transactional table while in AUTOCOMMIT mode, the slave failed. This fix causes every transaction (including AUTOCOMMIT transactions) to be recorded in the binlog as starting with a BEGIN and ending with a COMMIT or ROLLBACK.
[6 Mar 2008 5:41]
Jon Stephens
Also documented for 5.1.23-ndb-6.2.14.
[30 Mar 2008 0:27]
Jon Stephens
Also documented for 5.1.23-ndb-6.3.11.