Bug #50439 mark INSERT...SEL...ON DUP KEY UPD,REPLACE...SEL,CREATE...[IGN|REPL] SEL unsafe
Submitted: 19 Jan 2010 11:16 Modified: 30 Sep 2011 18:19
Reporter: Sven Sandberg Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.1+ OS:Any
Assigned to: Sven Sandberg CPU Architecture:Any
Tags: create ignore select, create replace select, insert ignore select, insert select on duplicate key update, REPLACE SELECT, replication, unsafe, Update Ignore

[19 Jan 2010 11:16] Sven Sandberg
Description:
The following statements can cause the slave to go out of sync if logged in statement format:

  INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
  REPLACE ... SELECT
  CREATE TABLE ... IGNORE SELECT
  CREATE TABLE ... REPLACE SELECT

The order that rows are retrieved from SELECT may differ between master and slave. If the result set returned by SELECT contains two rows with duplicate keys, and those two rows are retrieved in different order on slave than on master, then the slave will go out of sync (in different ways depending on how the duplicate key is resolved).

How to repeat:
--source include/have_binlog_format_mixed.inc
--source include/master-slave.inc

CREATE TABLE t1 (a INT, b INT);
CREATE TABLE t2 (a INT, b INT, PRIMARY KEY(b));
CREATE TABLE t3 (a INT, b INT, PRIMARY KEY(b));

INSERT INTO t1 VALUES (1, 1), (2, 1);

--echo [on slave]
--sync_slave_with_master
DELETE FROM t1;
INSERT INTO t1 VALUES (2, 1), (1, 1);

--echo [on master]
--connection master

INSERT INTO t2 SELECT * FROM t1 ON DUPLICATE KEY UPDATE b = 2;
REPLACE INTO t3 SELECT * FROM t1;
CREATE TABLE t4 (a INT, b INT, PRIMARY KEY(b)) IGNORE SELECT * FROM t1;
CREATE TABLE t5 (a INT, b INT, PRIMARY KEY(b)) REPLACE SELECT * FROM t1;

SELECT * FROM t2;
SELECT * FROM t3;
SELECT * FROM t4;
SELECT * FROM t5;

--sync_slave_with_master

SELECT * FROM t2;
SELECT * FROM t3;
SELECT * FROM t4;
SELECT * FROM t5;

--exit

Suggested fix:
Mark all these statement types as unsafe, so that:
 - the statements will be logged in row format if binlog_format=mixed
 - the statements will generate a warning if binlog_format=statement

Note: we may add the condition that the statement is not marked unsafe if the SELECT has an ORDER BY primary key clause.
[19 Jan 2010 11:27] Sven Sandberg
See also BUG#50440.
[19 Jan 2010 13:01] Valeriy Kravchuk
Verified just as described with recent 5.43 from bzr:

openxs@suse:/home2/openxs/dbs/5.1/mysql-test> ./mtr --mysqld=--binlog_format=mixed bug50439
Logging: ./mtr  --mysqld=--binlog_format=mixed bug50439
100426 19:39:52 [Note] Plugin 'FEDERATED' is disabled.
100426 19:39:52 [Note] Plugin 'ndbcluster' is disabled.
MySQL Version 5.1.43
Using binlog format 'mixed'
Checking supported features...
 - using ndbcluster when necessary, mysqld supports it
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
vardir: /home2/openxs/dbs/5.1/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home2/openxs/dbs/5.1/mysql-test/var'...
Installing system database...
Using server port 32775
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

main.bug50439                            [ fail ]
        Test ended at 2010-04-26 19:39:55

CURRENT_TEST: main.bug50439
--- /home2/openxs/dbs/5.1/mysql-test/r/bug50439.result  2010-04-26 19:37:14.000000000 +0300
+++ /home2/openxs/dbs/5.1/mysql-test/r/bug50439.reject  2010-04-26 19:39:54.000000000 +0300
@@ -0,0 +1,42 @@
+stop slave;
+drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
+reset master;
+reset slave;
+drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
+start slave;
+CREATE TABLE t1 (a INT, b INT);
+CREATE TABLE t2 (a INT, b INT, PRIMARY KEY(b));
+CREATE TABLE t3 (a INT, b INT, PRIMARY KEY(b));
+INSERT INTO t1 VALUES (1, 1), (2, 1);
+[on slave]
+DELETE FROM t1;
+INSERT INTO t1 VALUES (2, 1), (1, 1);
+[on master]
+INSERT INTO t2 SELECT * FROM t1 ON DUPLICATE KEY UPDATE b = 2;
+REPLACE INTO t3 SELECT * FROM t1;
+CREATE TABLE t4 (a INT, b INT, PRIMARY KEY(b)) IGNORE SELECT * FROM t1;
+CREATE TABLE t5 (a INT, b INT, PRIMARY KEY(b)) REPLACE SELECT * FROM t1;
+SELECT * FROM t2;
+a      b
+1      2
+SELECT * FROM t3;
+a      b
+2      1
+SELECT * FROM t4;
+a      b
+1      1
+SELECT * FROM t5;
+a      b
+2      1
+SELECT * FROM t2;
+a      b
+2      2
+SELECT * FROM t3;
+a      b
+1      1
+SELECT * FROM t4;
+a      b
+2      1
+SELECT * FROM t5;
+a      b
+1      1
[25 Jan 2010 17:12] Sven Sandberg
Addition: UPDATE IGNORE can also cause slave to go out of sync and should be marked unsafe.

This is slightly more complicated to reproduce, since UPDATE IGNORE will only IGNORE rows that are duplicates according to a UNIQUE KEY or PRIMARY KEY. When the key exists, rows will be ordered by the key, so row order is not undefined any more. However, if there is another key, then that may be used instead, and the bug is reproducible.

Test case:

--source include/have_binlog_format_statement.inc
--source include/master-slave.inc

CREATE TABLE t (a INT, b INT, c INT, UNIQUE KEY(a), UNIQUE KEY(b));
INSERT INTO t VALUES (NULL, 1, 1);
INSERT INTO t VALUES (NULL, 2, 2);

--sync_slave_with_master

DELETE FROM t;
INSERT INTO t VALUES (NULL, 2, 2);
INSERT INTO t VALUES (NULL, 1, 1);

connection master;
UPDATE IGNORE t SET b=3;

SELECT * FROM t;
--sync_slave_with_master
SELECT * FROM t;
[15 Jul 2010 15:08] Sven Sandberg
Given the recent concerns about unnecessary warnings for unsafe statements, we should avoid marking statements unsafe when there is an ORDER BY primary_key clause. This will be easier to implement this after BUG#42415. So I will not fix this bug until BUG#42415 is ready.
[22 Jul 2010 13:58] Sven Sandberg
See also BUG#55481
[5 Oct 2010 12:54] Sven Sandberg
Same problem with INSERT IGNORE...SELECT:

--source include/have_binlog_format_mixed.inc
--source include/master-slave.inc

CREATE TABLE t1 (a INT, b INT);
CREATE TABLE t2 (a INT, b INT, PRIMARY KEY(b));

INSERT INTO t1 VALUES (1, 1), (2, 1);

--echo [on slave]
--sync_slave_with_master
DELETE FROM t1;
INSERT INTO t1 VALUES (2, 1), (1, 1);

--echo [on master]
--connection master

INSERT IGNORE INTO t2 SELECT * FROM t1;
SELECT * FROM t2;
--sync_slave_with_master
SELECT * FROM t2;

--exit

So, summary of affected commands:

INSERT IGNORE...SELECT
INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
REPLACE ... SELECT
UPDATE IGNORE
CREATE ... IGNORE SELECT
CREATE ... REPLACE SELECT
[30 Sep 2011 18:19] Jon Stephens
Documented in the 5.5.18 and 5.6.4 changelogs as follows:

        The statements in the following list are now marked as unsafe for
        statement-based replication. This is due to the fact that each of these
        statements depends on the results of a SELECT statement whose order 
        cannot always be determined. When using STATEMENT logging mode, a
        warning is issued in the binary log for any of these statements; when
        using MIXED logging mode, the statement is logged using the row-based
        format.
      
            INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
            
            REPLACE ... SELECT
            
            CREATE TABLE ... IGNORE SELECT
            
            CREATE TABLE ... REPLACE SELECT
        
        When upgrading, you should note the use of these statements in your
        applications, keeping in mind that a statement that inserts or replaces
        rows obtained from a SELECT can take up many times as much space in the
        binary log when logged using row-based format than when only the
        statement itself is logged. Depending on the number and size of the rows
        selected and inserted (or replaced) by any such statements, the
        difference in size of the binary log after the logging of these
        statements is switched from statement-based to row-based can potentially
        be several orders of magnitude. 
        
        See http://dev.mysql.com/doc/refman/5.5/en/replication-sbr-rbr.html.
      
Also noted the behaviour change in the descriptions of the affected statements in the 5.5./5.6 Manuals.
[26 Oct 2011 10:45] Jon Stephens
Updated changelog entry and docs to include moention of INSERT IGNORE ... SELECT and UPDATE IGNORE statements per Rohit's email 20110930.