Bug #38046 INSERT SELECT ON DUPLICATE KEY UPDATE, LOAD DATA REPLACE wrong lock
Submitted: 11 Jul 2008 14:22 Modified: 3 Aug 2012 15:34
Reporter: Sergei Golubchik Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.0, 5.1, 6.0 bzr OS:Any
Assigned to: CPU Architecture:Any

[11 Jul 2008 14:22] Sergei Golubchik
Description:
INSERT SELECT ON DUPLICATE KEY UPDATE and LOAD DATA CONCURRENT REPLACE
use TL_WRITE_CONCURRENT_INSERT though they may update/delete a row.

How to repeat:
-- Create a big table,for example:
-- many columns, mainly, to slow the table down
CREATE TABLE `t1` (
  `a1` varchar(50) default NULL,
  `a2` varchar(50) default NULL,
  `a3` varchar(50) default NULL,
  `a4` varchar(50) default NULL,
  `a5` varchar(50) default NULL,
  `b1` int(11) default NULL,
  `b2` int(11) default NULL,
  `b3` int(11) default NULL,
  `b4` int(11) default NULL,
  `b5` int(11) default NULL,
  UNIQUE KEY `b5` (`b5`),
  KEY `a1` (`a1`,`b1`,`b2`),
  KEY `a2` (`a2`,`a3`,`b3`),
  KEY `a4` (`a4`,`b5`,`a5`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

-- populate it
insert t1 values (repeat(substr(rand(), 3), 20),
                  repeat(substr(rand(), 3), 20),
                  repeat(substr(rand(), 3), 20),
                  repeat(substr(rand(), 3), 20),
                  repeat(substr(rand(), 3), 20),
                  rand()*1e7, rand()*1e7, rand()*1e7,
                  rand()*1e7, rand()*1e7);
insert t1 select repeat(substr(rand(), 3), 20),
                 repeat(substr(rand(), 3), 20),
                 repeat(substr(rand(), 3), 20),
                 repeat(substr(rand(), 3), 20),
                 repeat(substr(rand(), 3), 20),
                 rand()*1e7, rand()*1e7, rand()*1e7,
                 rand()*1e7, rand()*1e7 from t1;
-- repeat the last command until satisfaction

-- prepare a second table
create table t2 like t1;
alter table t2 add unique (b1);
insert ignore t2 select * from t1;

-- in a second window type this query but don't hit enter just yet:
select sum(b2) from t2 union select sum(b2) from t2 union select sum(b2) from t2;

-- now do the following quickly
insert t2 select * from t1 on duplicate key update set t1=NULL, t2=0;
-- and while it's running start the previous select .. union query.

As a result I get:

+---------------+
| sum(b2)       |
+---------------+
| 1402363065784 | 
| 1370330448842 | 
| 1339675797080 | 
+---------------+

A similar test could be done with load data:

load data concurrent infile 't.csv' replace into table t2;
load data concurrent infile 't.csv' replace into table t2;

while the second load data is running to the above select.

Suggested fix:
use TL_WRITE
[11 Jul 2008 17:02] Sveta Smirnova
Thank you for the report.

Verified as described with INSER query ` insert t2 select * from t1 on duplicate key update b5=NULL, b1=null;`
[11 Jul 2008 17:05] Sveta Smirnova
One should have some luck to repeat the bug.
[3 Aug 2012 15:34] Paul DuBois
Noted in 5.7.0 changelog.

INSERT INTO ... SELECT ... ON DUPLICATE KEY UPDATE and LOAD DATA
CONCURRENT REPLACE took too weak a lock, leading to the possibility
of concurrent SELECT statements returning inconsistent results.