Bug #57880 Difference between MyISAM and InnoDB when using pk + UPDATE
Submitted: 1 Nov 2010 8:30 Modified: 25 Nov 2010 22:09
Reporter: Roel Van de Paar Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S2 (Serious)
Version:5.1.49, 5.1.52, 5.5.6rc OS:Any
Assigned to: CPU Architecture:Any

[1 Nov 2010 8:30] Roel Van de Paar
Description:
mysql> UPDATE `ti` SET `pk` = 3 WHERE `col_blob` < 'a';
ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY'

mysql> UPDATE `tm` SET `pk` = 3 WHERE `col_blob` < 'a';
ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY'

mysql> SELECT * FROM ti ORDER BY pk;
+----+----------+
| pk | col_blob |
+----+----------+
|  1 | 1        |
|  2 | 2        |
+----+----------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM tm ORDER BY pk;
+----+----------+
| pk | col_blob |
+----+----------+
|  2 | 2        |
|  3 | 1        |
+----+----------+
2 rows in set (0.00 sec)

How to repeat:
DROP TABLE IF EXISTS `ti`,`tm`;

CREATE TABLE `ti` (`pk` int(11) NOT NULL PRIMARY KEY,`col_blob` blob) ENGINE=InnoDB;
CREATE TABLE `tm` (`pk` int(11) NOT NULL PRIMARY KEY,`col_blob` blob) ENGINE=MyISAM;

INSERT INTO `ti` VALUES (1,'1'),(2,'2');
INSERT INTO `tm` VALUES (1,'1'),(2,'2');

UPDATE `ti` SET `pk` = 3 WHERE `col_blob` < 'a';
UPDATE `tm` SET `pk` = 3 WHERE `col_blob` < 'a';

SELECT * FROM ti ORDER BY pk;
SELECT * FROM tm ORDER BY pk;
[1 Nov 2010 9:43] MySQL Verification Team
i think myisam doesn't support atomic operations. so the update is partially done.. bug 51193 has some details.
[1 Nov 2010 9:43] Roel Van de Paar
Cause: "A duplicate-key error rolls back the SQL statement, if you have not specified the IGNORE option in your statement."
http://dev.mysql.com/doc/refman/5.1/en/innodb-error-handling.html

Question: Should all storage engine's (transactional or not) react the same given the same input, especially if only a single statement is involved?

Related page: http://dev.mysql.com/doc/refman/5.1/en/ansi-diff-transactions.html (Transactions and Atomic Operations)
[1 Nov 2010 19:31] Roel Van de Paar
Hmmm. It looks like the issue is really with blobs. Same testcase, but using an integer instead of blob works:

mysql> SELECT * FROM ti ORDER BY pk;
+----+---------+
| pk | col_int |
+----+---------+
|  1 |       1 |
|  2 |       2 |
+----+---------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM tm ORDER BY pk;
+----+---------+
| pk | col_int |
+----+---------+
|  1 |       1 |
|  2 |       2 |
+----+---------+
2 rows in set (0.00 sec)

--------------
DROP TABLE IF EXISTS `ti`,`tm`;

CREATE TABLE `ti` (`pk` int(11) NOT NULL PRIMARY KEY,`col_int` int) ENGINE=InnoDB;
CREATE TABLE `tm` (`pk` int(11) NOT NULL PRIMARY KEY,`col_int` int) ENGINE=MyISAM;

INSERT INTO `ti` VALUES (1,'1'),(2,'2');
INSERT INTO `tm` VALUES (1,'1'),(2,'2');

UPDATE `ti` SET `pk` = 3 WHERE `col_int` < 'a';
UPDATE `tm` SET `pk` = 3 WHERE `col_int` < 'a';

SELECT * FROM ti ORDER BY pk;
SELECT * FROM tm ORDER BY pk;
--------------
[2 Nov 2010 6:49] Roel Van de Paar
ORDER BY makes no difference;

UPDATE `ti` SET `pk` = 3 WHERE `col_blob` < 'a' ORDER by pk;
UPDATE `tm` SET `pk` = 3 WHERE `col_blob` < 'a' ORDER by pk;

mysql> SELECT * FROM tm ORDER BY pk;
+----+----------+
| pk | col_blob |
+----+----------+
|  2 | 2        |
|  3 | 1        |
+----+----------+
2 rows in set (0.00 sec)
[2 Nov 2010 9:50] Roel Van de Paar
As Shane pointed out, int testcase is flawed as no rows are touched. Not related to blobs, same result with text/char/varchar column.
[2 Nov 2010 9:51] Roel Van de Paar
So, in short, this questions remains: "Should all storage engine's (transactional or not) react the same given the same input, especially if only a single statement is involved?"
[2 Nov 2010 9:52] MySQL Verification Team
so you're saying that you want atomic updates for myisam, so that it either fails or succeeds.  currently, a partial update will leave table in semi-changed state.

drop table if exists t;
create table t(a int primary key)engine=myisam;
insert into t values (1),(2);
update t set a=3;
select * from t;
[25 Nov 2010 22:09] Roel Van de Paar
Decided to close this bug/FR after a chat with Matthias. He got me to see that if we made MyISAM atomic, it wouldn't be MyISAM anymore. 

*) dev would spend quite a bit of time/money developing this, but to what effect? (and InnoDB already being there, customers can easily swap)

*) Why add additional overhead to a storage engine that works well in given use cases?