Bug #76795 2pc succeeds even though binlog flush/sync fails
Submitted: 22 Apr 2015 19:39 Modified: 28 May 2015 17:53
Reporter: Santosh Praneeth Banda Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.6.23 OS:Any
Assigned to: CPU Architecture:Any

[22 Apr 2015 19:39] Santosh Praneeth Banda
Description:
Transaction get committed to InnoDB even though binlog flush/sync fails, i.e; transaction lost in binary logs thus slaves have no understanding of this data change leading to inconsistencies

How to repeat:
Use this patch. Please note that similar reproduction steps apply at sync_binlog_file()

--- a/sql/binlog.cc
+++ b/sql/binlog.cc
@@ -7206,7 +7206,8 @@ MYSQL_BIN_LOG::change_stage(THD *thd,
 int
 MYSQL_BIN_LOG::flush_cache_to_file(my_off_t *end_pos_var)
 {
-  if (flush_io_cache(&log_file))
+  if (DBUG_EVALUATE_IF("simulate_binlog_flush_error", 1,
+      flush_io_cache(&log_file)))
     return ER_ERROR_ON_WRITE;
   *end_pos_var= my_b_tell(&log_file);
   return 0;

Start server using mtr --start --mysqld=--log_bin=master-bin and execute the following

mysql> use test;
Database changed
mysql> create table t1 (a int);
Query OK, 0 rows affected (0.06 sec)

mysql> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

mysql> show binary logs;
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| master-bin.000001 |       458 |
+-------------------+-----------+
1 row in set (0.00 sec)

mysql> set @@global.debug='+d,simulate_binlog_flush_error';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values(2);
Query OK, 1 row affected (0.00 sec)

mysql> show binary logs;
+-------------------+-----------+
| Log_name          | File_size |
+-------------------+-----------+
| master-bin.000001 |       458 |
+-------------------+-----------+
1 row in set (0.00 sec)

mysql> select * from t1;
+------+
| a    |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.00 sec)

Suggested fix:
Abort all the transactions participating in group commit if binlog sync/flush fails
[23 Apr 2015 5:04] Santosh Praneeth Banda
Please use InnoDB table when reproducing the issue

mysql> create table t1 (a int) engine=innodb;
[23 Apr 2015 8:28] MySQL Verification Team
Hello Santosh,

Thank you for the bug report and test case.
Observed this issue with 5.6.23.

Thanks,
Umesh
[23 Apr 2015 8:37] MySQL Verification Team
test results

Attachment: 76795.txt (text/plain), 8.97 KiB.

[14 May 2015 10:53] Venkatesh Duggirala
An approach which we are following to fix this issue is :
========================================================

Using binlog_error_action setting, Server will decide the action to do if the
above problematic situation happens.

If binlog_error_action is ABORT_SERVER, then it will abort the server after
informing the client with 'ER_BINLOGGING_IMPOSSIBLE' error.

If binlog_error_action is 'IGNORE_ERROR', then it will ignore the error and
disable the binlogging further until server is restarted again. The same will
be mentioned in the error log file. All the transaction will be committed
inside storage engine with out binlogging.

Regards,
Venkatesh.
[28 May 2015 11:50] Venkatesh Duggirala
Patch that implements the above approach

Attachment: bug76795.diff (text/x-patch), 6.62 KiB.

[28 May 2015 11:51] Venkatesh Duggirala
Post from the Developer:
========================

Please find the above patch that will fix the issue. Give it a try at your end and let us know if you see any issues with it
[28 May 2015 14:01] David Moss
Thanks for your feedback. This has been fixed in upcoming versions and the following was noted in the 5.6.26 and 5.7.8 changelogs:

If flushing the cache to the binary log failed, for example due to a disk problem, the error was not detected by the binary log group commit logic. This could cause inconsistencies between the master and the slave. The fix uses the binlog_error_action variable to decide how to handle this situation. If binlog_error_action=ABORT_SERVER, then the server aborts after informing the client with an ER_BINLOGGING_IMPOSSIBLE error. If binlog_error_action=IGNORE_ERROR, then the error is ignored and binary logging is disabled until the server is restarted again. The same is mentioned in the error log file, and the transaction is committed inside the storage engine without being added to the binary log.
[28 May 2015 17:53] Santosh Praneeth Banda
You completely missed the scenario where binlog sync fails. My bug title and description mentions that, but my sample test case covered binlog flush case. You need to handle error from MYSQL_BIN_LOG::sync_binlog_file() in group commit. FWIW, this is the patch we are using

diff --git a/sql/binlog.cc b/sql/binlog.cc
index ad7ffb1..bef4a99 100644
--- a/sql/binlog.cc
+++ b/sql/binlog.cc
@@ -7365,6 +7365,7 @@ int MYSQL_BIN_LOG::ordered_commit(THD *thd, bool all, bool skip_commit,
   int flush_error= 0;
   my_off_t total_bytes= 0;
   bool do_rotate= false;
+  int binlog_flush_error = 0;
 
   /*
     These values are used while flushing a transaction, so clear
@@ -7426,7 +7427,7 @@ int MYSQL_BIN_LOG::ordered_commit(THD *thd, bool all, bool skip_commit,
 
   my_off_t flush_end_pos= 0;
   if (flush_error == 0 && total_bytes > 0)
-    flush_error= flush_cache_to_file(&flush_end_pos);
+    binlog_flush_error = flush_error = flush_cache_to_file(&flush_end_pos);
 
   /*
     If the flush finished successfully, we can call the after_flush
@@ -7452,7 +7453,7 @@ int MYSQL_BIN_LOG::ordered_commit(THD *thd, bool all, bool skip_commit,
     {
       DEBUG_SYNC(thd, "before_sync_binlog_file");
       std::pair<bool, bool> result = sync_binlog_file(false, async);
-      flush_error = result.first;
+      binlog_flush_error = flush_error = result.first;
     }
 
     /*
@@ -7467,11 +7468,23 @@ int MYSQL_BIN_LOG::ordered_commit(THD *thd, bool all, bool skip_commit,
       that haven't made it to the disk on master because of a os
       crash or power failure just before binlog fsync.
     */
-    update_binlog_end_pos();
+    if (!binlog_flush_error)
+      update_binlog_end_pos();
 
     DBUG_EXECUTE_IF("crash_commit_after_log", DBUG_SUICIDE(););
   }
 
+  if (binlog_flush_error)
+  {
+    sql_print_error("Binlog flush or sync failed");
+    if (binlog_error_action == ABORT_SERVER)
+    {
+      sql_print_error("Aborting the server to avoid inconsistencies between "
+                      "binlog and innodb");
+      _exit(EXIT_FAILURE);
+    }
+  }
+
[29 May 2015 5:09] Venkatesh Duggirala
Hello Santosh,

Thank you for verifying the patch.

Yes, you are correct , I missed case-2 (sync failure)

We will work on covering the case-2 and will submit the patch soon.

Sorry for the inconvenience caused.

Regards,
Venkatesh.
[13 Oct 2015 3:04] Venkatesh Duggirala
(Patch that addresses both flush and sync)  errors.

Attachment: patch_mysql-5.6.diff (text/x-patch), 16.74 KiB.

[13 Oct 2015 3:05] Venkatesh Duggirala
Post from the Developer:
========================

Please find the above patch that will handle both flush and sync errors.
Give it a try at your end and please let us know if you see any issues with it
[16 Oct 2015 14:31] David Moss
Thanks for your feedback, this has been fixed and the following was noted in the 5.6.28 and 5.7.10 changelogs:

Fatal errors encountered during flushing or synchronizing the binary log were being ignored. Such errors are now caught and handled depending on the setting of binlog_error_action.

Additionally the documentation for binlog_error_action has also been improved:
https://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html#sysvar_binlog_...