Bug #40510 LOAD DATA INFILE statement may span multiple binlogs
Submitted: 4 Nov 2008 18:44 Modified: 10 Dec 2008 16:57
Reporter: Sven Sandberg Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.0, 5.1, 6.0 OS:Any
Assigned to: Andrei Elkin CPU Architecture:Any
Tags: binlog, LOAD DATA INFILE

[4 Nov 2008 18:44] Sven Sandberg
Description:
In STATEMENT and MIXED mode, LOAD DATA INFILE statements are logged as several events: one Create_file_log_event, followed by zero or more Append_block_log_event, followed by one Execute_load_query_log_event. If @@autocommit = 1, it may rotate the binlog after any of the events. (Rotation happens if the binlog size reaches the limit specified by @@global.max_binlog_size).

Rotation in the middle of the statement has the following consequences:
 - mysqlbinlog cannot parse the binlog correctly.
 - Normal replication still works fine.
 - I'm unsure if there are any problems with things like START SLAVE UNTIL etc.

How to repeat:
source include/have_log_bin.inc;
source include/have_binlog_format_statement.inc;

# Make it rotate more often
SET @@global.max_binlog_size = 65536;

CREATE TABLE t1 (a VARCHAR(50));

# Create a 200 KB file
INSERT INTO t1 VALUES ('01234567890123456789012345678901234567890123456789');
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
INSERT INTO t1 SELECT * FROM t1;
SELECT COUNT(*) FROM t1;
SELECT * FROM t1 INTO OUTFILE 'test/tmp.tmp';
DELETE FROM t1;

# Load the 200 KB file: this will be split over three binlogs
LOAD DATA INFILE 'tmp.tmp' INTO table t1;

# Print the contents of the three binlogs
let $MYSQLD_DATADIR = `SELECT @@datadir`;
--echo ================ binlog 1 ================
exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000001;
--echo ================ binlog 2 ================
exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000002;
--echo ================ binlog 3 ================
exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000003;

Suggested fix:
1. Ensure that LOAD DATA INFILE cannot span multiple binlogs.

2. Make LOAD DATA INFILE an unsafe statement. The implementation is quite fragile anyways, suffering from problems like BUG#34283 and BUG#33238.
[11 Nov 2008 15:34] Susanne Ebrecht
Verified as described by using actual MySQL 5.1 bzr tree.
[10 Dec 2008 9:19] Andrei Elkin
Yes, LOAD FILE indeed can produce multiple binlogs.
Still that's not an issue.
My testing (5.1) shows mysqlbinlog has to read all the created files starting
from the first and the table gets populated successfully:

   mysql -uroot -h127.0.0.1 -P9306 test -e "drop table t1";

   Mybinlog.sh var/log/master-bin.000001 \ 
                var/log/master-bin.000002 \
                 var/log/master-bin.000003 | 
      mysql -uroot -h127.0.0.1 -P9306 test

(the files contain CREATE TABLE t1 and LOAD DATE of the report)

So I think it's not a bug.
[10 Dec 2008 16:57] Sven Sandberg
OK, with Andrei's explanation I think this may be 'not a bug'.