MySQL Bugs: #70202: REPLACE working differently for a table having PK or UK only versus both PK+UK

Bug #70202	REPLACE working differently for a table having PK or UK only versus both PK+UK
Submitted:	30 Aug 2013 15:36	Modified:	2 Sep 2013 11:51
Reporter:	Przemyslaw Malkowski	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: DML	Severity:	S3 (Non-critical)
Version:	5.5, 5.6	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	innodb, REPLACE

Description:
Seems like InnoDB engine has an optimization that allows using update logic when replacing existing unique values for a row instead of delete+insert. Unfortunately this optimization does not work for table having both primary and unique keys. It also does not work for MyISAM tables. 
As a result you may expect significant performance regression for large LOAD DATA INFILE ... REPLACE INTO TABLE.., statements if you add second unique key to a table.

How to repeat:
Note the "x rows affected" for REPLACE statement in each of following examples:

CREATE TABLE `uk_pk2` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (test) > insert into uk_pk2 values (1),(2),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql [localhost] {msandbox} (test) > select * from uk_pk2;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

mysql [localhost] {msandbox} (test) > replace into uk_pk2 values (1);
Query OK, 2 rows affected (0.00 sec)

------------------------------------------------------------------------

mysql [localhost] {msandbox} (test) > CREATE TABLE pk2 (id int primary key);
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {msandbox} (test) > insert into pk2 values (1),(2),(3);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql [localhost] {msandbox} (test) > select * from uk_pk2;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

mysql [localhost] {msandbox} (test) > replace into pk2 values (1);
Query OK, 1 row affected (0.00 sec)

------------------------------------------------------------------------

mysql [localhost] {msandbox} (test) > CREATE TABLE uk2 (id int, unique key(id));
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {msandbox} (test) > insert into uk2 values (1),(2),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql [localhost] {msandbox} (test) > replace into uk2 values (1);
Query OK, 1 row affected (0.01 sec)

Suggested fix:
If it's possible - make this optimization work also for tables having two unique keys. 
Otherwise, the documentation needs clarifying. In http://dev.mysql.com/doc/refman/5.6/en/replace.html we can see:

"It is possible that in the case of a duplicate-key error, a storage engine may perform the REPLACE as an update rather than a delete plus insert, but the semantics are the same. There are no user-visible effects other than a possible difference in how the storage engine increments Handler_xxx status variables."

While this is not true - there is significant performance difference! I get much worse results for LOAD DATA INFILE ... REPLACE when two unique keys are present.

Hello Przemyslaw,

Thank you for the bug report and the test case. 
Verified as described.

Thanks,
Umesh