MySQL Bugs: #32228: A disk full makes binary log corrupt.

Bug #32228	A disk full makes binary log corrupt.
Submitted:	9 Nov 2007 13:18	Modified:	29 Oct 2011 10:27
Reporter:	Arnoud Witt	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S2 (Serious)
Version:	4.1, 5.0, 5.1	OS:	Any
Assigned to:	Sven Sandberg	CPU Architecture:	Any
Tags:	bfsm_2007_11_15, bfsm_2007_12_06

Description:
A disk full occurred on the master, then mysqld on the slave reported in the .err log:

070914 12:50:06  mysqld started
/usr/sbin/mysqld: ready for connections.
Version: '4.1.12-standard-log'  socket: '/tmp/mysql.sock4'  port: 3306  MySQL Community Edition - Standard (GPL)
070914 12:50:07 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './qa-dbslave1-relay-bin.0000
01' position: 4
070914 12:50:07 [Note] Slave I/O thread: connected to master 'replicate@10.32.68.13:3306',  replication started in log 'FIRST' at position 4
071006  8:29:33 [ERROR] Error reading packet from server: binlog truncated in the middle of event (server_errno=1236)
071006  8:29:33 [ERROR] Got fatal error 1236: 'binlog truncated in the middle of event' from master when reading data from binary log
071006  8:29:33 [ERROR] Slave I/O thread exiting, read up to log 'qa-dbmaster-bin.000003', position 656764771
071008 11:32:12 [Note] Slave I/O thread: connected to master 'replicate@10.32.68.13:3306',  replication started in log 'qa-dbmaster-bin.000003'
 at position 656764771
071008 11:32:12 [ERROR] Error reading packet from server: bogus data in log event (server_errno=1236)
071008 11:32:12 [ERROR] Got fatal error 1236: 'bogus data in log event' from master when reading data from binary log
071008 11:32:12 [ERROR] Slave: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for
the right syntax to use near '[^GG^E^M' at line 4' on query. Default database: 'fp12_mp'. Query: 'UPDATE `event` SET
                            `status` = 'processing',
                            `worker_id` = '4',
h[^GG^E^M', Error_code: 1064
071008 11:32:12 [ERROR] Slave I/O thread exiting, read up to log 'qa-dbmaster-bin.000003', position 656765214
071008 11:32:12 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We
 stopped at log 'qa-dbmaster-bin.000003' position 656764771

On master the error log looked like this:

070914 12:50:06  mysqld started
/usr/sbin/mysqld: ready for connections.
Version: '4.1.12-standard-log'  socket: '/tmp/mysql.sock4'  port: 3306  MySQL Community Edition - Standard (GPL)
071006 11:54:48 [ERROR] /usr/sbin/mysqld: Disk is full writing './qa-dbmaster-bin.000003' (Errcode: 28). Waiting for someone to free space... R
etry in 60 secs
071006 12:04:48 [ERROR] /usr/sbin/mysqld: Disk is full writing './qa-dbmaster-bin.000003' (Errcode: 28). Waiting for someone to free space... R
etry in 60 secs
071006 12:14:48 [ERROR] /usr/sbin/mysqld: Disk is full writing './qa-dbmaster-bin.000003' (Errcode: 28). Waiting for someone to free space... R
etry in 60 secs
071006 12:24:48 [ERROR] /usr/sbin/mysqld: Disk is full writing './qa-dbmaster-bin.000003' (Errcode: 28). Waitin071017 17:16:25 [Note] /usr/sbin
/mysqld: Normal shutdown

071017 17:16:27 [Note] /usr/sbin/mysqld: Shutdown complete

071017 17:16:27  mysqld ended

You can see that the server could not write into the error_log anymore as well.

So the binary log got corrupted and the slave could not recover replication after freeing the disk on the master.

I think that mysqld wrote to the binary log an event and it did not get fully written into the log.
Then with the next event, the OS reports the disk full and mysqld will retry writing then. When room is available again, event is just appended to binlog after the incomplete previous event.

How to repeat:
See description.

Suggested fix:
Let mysqld check for enough space in the binary log before writing into it.
If there is not enough space, postpone writing until there is room again.

Thank you for the report.

I can not repeat described behaviour with current version of MySQL. Please upgrade.

Verified against 4.1.12 using InnoDB and transactions. Could not reproduce using MyISAM only.

Confirmed on Debian Etch, Mysql 5.0.32  

Disk full on master, after space is freed up we get:

May  5 08:29:17 colo2 mysqld[2306]: 080505  8:29:17 [Note] Slave I/O thread: connected to master 'repl@dbmaster:3306',  replication started in log 'mysql-bin.010495' at position 693169273
May  5 08:29:17 colo2 mysqld[2306]: 080505  8:29:17 [ERROR] Error reading packet from server: binlog truncated in the middle of event ( server_errno=1236)
May  5 08:29:17 colo2 mysqld[2306]: 080505  8:29:17 [ERROR] Got fatal error 1236: 'binlog truncated in the middle of event' from master when reading data from binary log
May  5 08:29:17 colo2 mysqld[2306]: 080505  8:29:17 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.010495', position 693169273

Confirmed in:
mysql  Ver 14.12 Distrib 5.0.22, for redhat-linux-gnu (i686) using readline 5.0

Is there a known workaround to get replication up and running again?

It appears the workaround is to start replication from scratch.
http://lists.mysql.com/mysql/208931

See also: WL#1040.

It's not completely clear what happens.

Here, "partial event" refers to an event that is truncated, so it does not have full length. "corrupt event" refers to an event that has full length but contains wrong data.

In principle, the server should be able to handle partial events because every event has a field that contains the event length. Here it seems that a partial event is at some point changes to a corrupt event. I think there are three possible scenarios.

==== Scenario 1: master client turns partial event to corrupt event ====

(S1-1) The disk is full on the master. This causes the master client that writes to the binlog to write a partial event and wait for disk space. When disk space is freed, there is a bug in the master client thread that causes it to write a corrupt event to the binlog.

(S1-2) The master dump thread reads the corrupt event and sends to the slave.

(S1-3) The slave IO thread receives the corrupt event and writes to the relay log.

(S1-4) The slave SQL thread reads the corrupt event and fails to execute it.

==== Scenario 2: master dump thread turns partial event to corrupt event ====

(S2-1) The disk is full on master. This causes the master client that writes to the binlog to write a partial event and wait for disk space.

(S2-2) The master dump thread cannot handle the partial event correctly, and sends a corrupt event to the slave.

(S2-3) The slave IO thread receives the corrupt event and writes to the relay log.

(S2-4) The slave SQL thread reads the corrupt event and fails to execute it.

==== Scenario 3: slave IO thread turns partial event to corrupt event ====

(S3-1) The disk is full on master. This causes the master client that writes to the binlog to write a partial event and wait for disk space.

(S3-2) The master dump thread sends the partial event to the slave.

(S3-3) The slave IO thread receives the partial event but cannot handle it correctly, and writes a corrupt event to the relay log.

(S3-4) The slave SQL thread reads the corrupt event and fails to execute it.

I can repeat this on 5.1.51 using myisam tables.  master gets error:

[ERROR] mysqld-debug: Disk is full writing '.\anon-bin.000012' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
[ERROR] mysqld-debug: Retry in 60 secs. Message reprinted in 600 secs
[Note] mysqld-debug: Normal shutdown

After starting up master again, slave broken:
101028 15:53:38 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'binlog truncated in the middle of event', Error_code: 1236
101028 15:53:38 [Note] Slave I/O thread exiting, read up to log 'anon-bin.000012', position 227848188

Running mysqlbinlog on the binlog shows:

[sbester@levovo 5.5]$ mysqlbinlog -v ~/anon-bin.000012 > out
ERROR: Error in Log_event::read_log_event(): 'read error', data_len: 405117, event_type: 2
ERROR: Could not read entry at offset 227848188: Error in log format or read error.

I uploaded my binlog to the ftp.mysql.com/pub/mysql/upload/anon-bin.000012.bz2

The bug shows two different symptoms:

 (S1) The master seems to send truncated event to the slave: the slave
      error log contained:
        071006  8:29:33 [ERROR] Error reading packet from server: binlog
        truncated in the middle of event (server_errno=1236)
      This is strange: it looks like the master knows that the event is
      truncated (because it sends an error code to the slave), but still
      the master send the event to slave. It would probably be better that
      the master did not let the slave know about partial events. In any
      case, I haven't been able to reproduce this.

 (S2) Somewhere along the way, a truncated event becomes a corrupt event:
        071008 11:32:12 [ERROR] Slave: Error 'You have an error in your SQL
        syntax; [...]
      See also my comment from 26 Oct 15:37. I haven't been able to reproduce
      this either.

I have tried to reproduce this bug using a test case similar to the one in BUG#57936, see below. By default, it hits the assertion in BUG#57936. However, if this assertion is commented out, it works gracefully: the master continues without problems and there are no errors on the slave. I don't know what else I could do to reproduce this bug.

==== BEGIN rpl_disk_full ====
--source include/master-slave.inc
--source include/have_binlog_format_statement.inc

--let $master_dir= `SELECT @@datadir`
--let $master_dir= $master_dir/binlogs
--let $master_file= $master_dir/bigfile
--let $long_string= `SELECT REPEAT('a', 500000)`

--eval SELECT REPEAT('a', 750000) INTO DUMPFILE '$master_file'

CREATE TABLE t1 (a TEXT(1000000));
--disable_query_log
--send
--eval INSERT INTO t1 VALUES ('$long_string')
--enable_query_log
--sleep 25
--remove_file $master_file
--reap
SELECT COUNT(*) FROM t1;
--sync_slave_with_master

--let $diff_table_1= master:test.t1
--let $diff_table_2= slave:test.t1
--source include/diff_tables.inc

--connection master
DROP TABLE t1;
--sync_slave_with_master
==== END ====

==== BEGIN rpl_disk_full-master.sh ====
dir=$MYSQLTEST_VARDIR/mysqld.1/data/binlogs
img=$dir.fs
echo dir=$dir img=$img
mkdir $dir
head -c 1000000 /dev/zero > $img
mkfs.ext2 -F $img
sudo mount -o loop $img $dir
==== END ====

==== BEGIN rpl_disk_full-master.opt ====
--log-bin=$MYSQLTEST_VARDIR/mysqld.1/data/binlogs/master-bin
==== END ====

==== BEGIN patch to remove assertion ====
=== modified file 'sql/sql_class.cc'
--- sql/sql_class.cc	2010-10-23 12:55:44 +0000
+++ sql/sql_class.cc	2010-11-02 20:36:59 +0000
@@ -496,11 +496,11 @@
 void
 Diagnostics_area::set_ok_status(THD *thd, ha_rows affected_rows_arg,
                                 ulonglong last_insert_id_arg,
                                 const char *message_arg)
 {
-  DBUG_ASSERT(! is_set());
+  //DBUG_ASSERT(! is_set());
 #ifdef DBUG_OFF
   /*
     In production, refuse to overwrite an error or a custom response
     with an OK packet.
   */
==== END ====

I can reproduce that the slave gets a truncated event from the master as follows:

1. run statement on master such that disk gets full in the middle of binlog event
2. shut down master while it waits for disk space
3. free up disk space on master
4. restart master

Use the same test files as above, but replace rpl_disk_full with this file:

==== BEGIN rpl_disk_full.test (version 2) ====

--source include/master-slave.inc
--source include/have_binlog_format_statement.inc

SET SESSION debug = "+d,short_sleep_on_disk_full";

--let $master_dir= `SELECT @@datadir`
--let $master_dir= $master_dir/binlogs
--let $master_file= $master_dir/bigfile

--let $s= `SELECT REPEAT('a', 100000)`
--eval SELECT REPEAT('a', 900000) INTO DUMPFILE '$master_file'

--echo # create table...
CREATE TABLE t1 (a TEXT(1000000));

--echo # execute big statement...
--disable_query_log
--send
--eval INSERT INTO t1 VALUES ('$s$s')
--enable_query_log
--sleep 5

--connection master1

--echo # shut down server...
--write_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
wait
EOF
--shutdown_server 10

--echo # remove file...
--remove_file $master_file
--echo # removed all files

--echo # restart server again...
--append_file $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
restart
EOF

--echo # reconnect master1 again...
--enable_reconnect
--source include/wait_until_connected_again.inc
--disable_reconnect

--echo # reconnect master again...
--connection master
# 2013=CR_SERVER_LOST
--error 2013
--reap
--enable_reconnect
--source include/wait_until_connected_again.inc
--disable_reconnect

SELECT COUNT(*) FROM t1;
--connection slave

# 1594=ER_SLAVE_RELAY_LOG_READ_FAILURE
--let $slave_sql_errno= 1594
--source include/wait_for_slave_sql_error.inc
--let $slave_datadir= `SELECT @@datadir`

--exec $MYSQL_BINLOG $slave_datadir/slave-relay-bin.000004
==== END ====

The bug is in mysql_binlog_send, where it does this:

     if (!(error = Log_event::read_log_event(&log, packet, log_lock)))
       [...]
     else
     {
       if (test_for_non_eof_log_read_errors(error, &errmsg))
         goto err;
       /*
         It's EOF, nothing to do, go on reading next events, the
         Format_description_log_event will be found naturally if it is written.
       */
     }

Here, read_log_event returns LOG_READ_TRUNC when the event is shorter than expected, and test_for_non_eof_log_read_errors returns 0 when error==LOG_READ_TRUNC.

So the master dump thread detects that the event is truncated, but ignores the error and pretends everything is fine. It should probably just goto err instead.

test_for_non_eof_log_read_errors is used in some other places in this file too. We should check if we need to fix those too.

Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented fix in the 5.518 and 5.6.4 changelogs as follows:

        A replication master could send damaged events to slaves after
        the binary log disk on the master became full. To correct this
        issue, only complete events are now pushed by the master dump
        thread to the slave I/O thread. In addition, the error text that
        the master sends to the slave when an incomplete event is found
        now states that the incomplete event may have been caused by
        running out of disk space on the master, and provides
        coordinates of the first and the last event bytes read.

Closed.

The fix for this bug unfortunately introduced even worse bugs filed as Bug 64035. The server now crashes when generating several error messages related to replication.

It looks like the testcase for this bug (rpl.rpl_cant_read_event_incident) is only present in the 5.5 tree. Is that intentional? If 5.6 or 5.7 were to regress WRT current bug, would this be caught? Filed bug 82370.