Bug #58637 Mark INSERT...ON DUPLICATE KEY UPDATE unsafe when there is more than one key
Submitted: 1 Dec 2010 19:44 Modified: 20 Apr 2012 12:37
Reporter: Sven Sandberg Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.1+, 5.0, 5.6.1 OS:Any
Assigned to: Sven Sandberg CPU Architecture:Any
Tags: insert on duplicate key update, replication, unsafe

[1 Dec 2010 19:44] Sven Sandberg
Description:
When mysql executes INSERT ON DUPLICATE KEY INSERT, the storage engine checks if the inserted row would generate a duplicate key error. If yes, it returns the existing row to mysql, mysql updates it and sends it back to the storage engine.

When the table has more than one unique or primary key, this statement is sensitive to the order in which the storage engines checks the keys. Depending on this order, the storage engine may determine different rows to mysql, and hence mysql can update different rows.

The order that the storage engine checks keys is not deterministic. For example, InnoDB checks keys in an order that depends on the order in which indexes were added to the table. The first added index is checked first. So if master and slave have added indexes in different orders, then slave may go out of sync.

There are realistic scenarios when this can happen. For example, maybe the slave has not started when master creates a table without indexes, then adds indexes.  Then the slave is bootstrapped using mysqldump. The slave will then add all indexes at once, using a single CREATE TABLE statement.

This bug does not affect REPLACE, since REPLACE removes *all* existing duplicate rows, one by one, until the new row can be inserted without error.

How to repeat:
--source include/have_innodb.inc
--source include/master-slave.inc

CREATE TABLE t1 (a INT, b INT UNIQUE KEY) ENGINE = InnoDB;
ALTER TABLE t1 ADD UNIQUE KEY(a);
--sync_slave_with_master
# Same table definition, only given in one statement instead of two
DROP TABLE t1;
CREATE TABLE t1 (a INT UNIQUE KEY, b INT UNIQUE KEY) ENGINE = InnoDB;
--connection master

INSERT INTO t1 VALUES (1, 1);
INSERT INTO t1 VALUES (2, 2);
INSERT INTO t1 VALUES (1, 2)
       ON DUPLICATE KEY UPDATE a=VALUES(a)+10, b=VALUES(b)+10;
SELECT * FROM t1;
--sync_slave_with_master
SELECT * FROM t1;

Suggested fix:
Mark INSERT ... ON DUPLICATE KEY UPDATE unsafe when there are more than one indexes.
[1 Dec 2010 19:47] Sven Sandberg
See also BUG#50439.
[1 Dec 2010 20:53] Sveta Smirnova
Thank you for the report.

Verified as described.
[20 Apr 2012 12:37] Jon Stephens
Fixed in 5.5+. Documented in the 5.5.24 and 5.6. changelogs as follows:

        INSERT ON DUPLICATE KEY UPDATE is now marked as unsafe for
        statement-based replication if the target table has more than
        one primary or unique key. For more information, see
        <xref linkend="replication-rbr-safe-unsafe"/>.

Updated info in SQL Syntax chapter and "Determination of Safe and Unsafe
Statements" in the 5.5 and 5.6 versions of the Manual.

Closed.
[7 Jan 2013 13:26] Victor Efimov
This fix is kind of inconvenient.

At least if there are two indexes and one of the indexes is auto_increment, there is no bug with replication and statement is safe.
[30 Apr 2013 2:31] Jervin R
Based on my test, on tables with a PRIMARY KEY and UNIQUE KEY where the INSERT statement touches both, the PRIMARY KEY is always evaluated first regardless if it was added last:

mysql [localhost] {msandbox} (test) > CREATE TABLE x ( `x` int(10) unsigned NOT NULL,   `c` char(2) DEFAULT NULL,   UNIQUE KEY `x` (`x`) ) ENGINE=InnoDB; 
Query OK, 0 rows affected (0.11 sec)

mysql [localhost] {msandbox} (test) > insert into x (x, c) values (1, 'ab'), (2, 'cd'); 
Query OK, 2 rows affected (0.05 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql [localhost] {msandbox} (test) > alter table x add column id int unsigned  not null auto_increment primary key first;   
Query OK, 2 rows affected (0.25 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql [localhost] {msandbox} (test) > select * from x;
+----+---+------+
| id | x | c    |
+----+---+------+
|  1 | 1 | ab   |
|  2 | 2 | cd   |
+----+---+------+
2 rows in set (0.00 sec)

mysql [localhost] {msandbox} (test) > INSERT INTO x (id, x, c) VALUES (1, 2, 'ac') ON DUPLICATE KEY UPDATE c = 'ax'; 
Query OK, 2 rows affected (0.07 sec)

mysql [localhost] {msandbox} (test) > select * from x;
+----+---+------+
| id | x | c    |
+----+---+------+
|  1 | 1 | ax   |
|  2 | 2 | cd   |
+----+---+------+
2 rows in set (0.00 sec)

I think this should also be documented or added to exception.
[29 May 2013 11:02] George Kaz
Although I appreciate the reason for the warning being added, in the case of only specifying a single key within the insert values, this warning is unnecessary. In my case, it's bloating my error logs.

Example of when this implements wrongly (in this case a wordpress table but the problem is generic)

CREATE TABLE `wp_options` (
  `option_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `option_name` varchar(64) NOT NULL DEFAULT '',
  `option_value` longtext NOT NULL,
  `autoload` varchar(20) NOT NULL DEFAULT 'yes',
  PRIMARY KEY (`option_id`),
  UNIQUE KEY `option_name` (`option_name`)
) ENGINE=MyISAM AUTO_INCREMENT=45036 DEFAULT CHARSET=utf8

INSERT INTO wp_options(option_name, option_value, autoload) Values ('myuniquename', 'blahblahblah', 'autoload') ON DUPLICATE KEY UPDATE

The above produces an unsafe statement warning when it's not unsafe because the only key being referenced in the insert is option_name.

Any chance of a fix for this?
[9 Jun 2014 17:14] Jeff Marr
I ran into the same issue as George and opened a new ticket here:
http://bugs.mysql.com/bug.php?id=72921
[25 Jun 2014 14:34] Koumpilai Tsil Kara
Are we saying here that if the tables contain PRIMARY KEY we are safe from this? Thanks