Bug #71507 | innodb is excessively slow to replace duplicate values in a table | ||
---|---|---|---|
Submitted: | 29 Jan 2014 12:51 | ||
Reporter: | Shane Bester (Platinum Quality Contributor) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S5 (Performance) |
Version: | 5.7 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[29 Jan 2014 12:51]
Shane Bester
[29 Jan 2014 13:01]
MySQL Verification Team
5.7.4 overview of hotspots.
Attachment: bug71507_vtune_hotpots_overview.png (image/png, text), 93.73 KiB.
[16 Jun 2014 13:54]
MySQL Verification Team
current mysql-trunk speed (datadir on ramdisk): mysql> mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (57.01 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> replace into `t2`(`a`) select `a` from `seed`; #fast Query OK, 10015 rows affected (0.04 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> mysql> select version(); +-----------+ | version() | +-----------+ | 5.7.5-m15 | +-----------+ 1 row in set (0.00 sec)
[29 Sep 2014 11:30]
Marko Mäkelä
Posted by developer: The duplicate key error handling in InnoDB relies on the undo log and statement rollback. The test case is updating the PRIMARY KEY to a constant value 7. InnoDB will map UPDATE of a key to a DELETE and INSERT. DELETE will be like an update that updates the delete-mark bit. Actual deletion would be performed by the purge subsystem. I believe that we are doing the following: 1. Lock the record. 2. Write update_undo log for delete-marking the record. 3. Write insert_undo log for inserting the record. 4. Attempt to insert the new record. 5. Notice the duplicate. 6. Roll back the undo log written in steps 3 and 2. We could probably do a shortcut, while keeping the same file format: 1. Lock the record. 2. Check if the new record can be inserted. 3. Write update_undo log for delete-marking the record. 4. Delete-mark the old record. 5. Write insert_undo log for inserting the record. 6. Insert the record. The duplicate would be detected already at step 2. This would avoid any rollback in this case.
[3 May 2016 13:36]
MySQL Verification Team
current mysql-trunk (5.8.0) takes: mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 23.15 sec) Records: 5008 Duplicates: 5007 Warnings: 0
[27 Aug 2016 17:24]
MySQL Verification Team
Current situation: ------------------- Version: '8.0.1-dmr' socket: '' port: 3306 (Built on 2016/08/18) mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 26.11 sec) Records: 5008 Duplicates: 5007 Warnings: 0
[17 Sep 2016 18:07]
Mark Callaghan
I have been curious about a possibly related case. Performance for insert ... on duplicate key update when most rows will get the on dup key error. Hopefully we will have perf results to share soon.
[12 Oct 2016 13:33]
Justin Swanhart
@Mark - a simple demonstration of this is trying to create a list of unique URL from the wikipedia logs. You pretty much can't do it. I've always had construct a bloom filter in front of it to avoid inserting values already in the table, and just taking the hit when there is a false negative.
[16 Jan 2018 11:09]
MySQL Verification Team
5.7.21 results .............. mysql> mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 34.75 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> replace into `t2`(`a`) select `a` from `seed`; #fast Query OK, 10015 rows affected (0.07 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> mysql> select version(); +-----------+ | version() | +-----------+ | 5.7.21 | +-----------+ 1 row in set (0.00 sec)
[14 Nov 2019 20:47]
MySQL Verification Team
On a different disk (SSD) on linux, some new results: 5.7.28: -------- mysql> mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 10.16 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> replace into `t2`(`a`) select `a` from `seed`; #fast Query OK, 10015 rows affected (0.03 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> mysql> select version(); +-----------+ | version() | +-----------+ | 5.7.28 | +-----------+ 1 row in set (0.00 sec) -------------------- 8.0.18: -------------- mysql> mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 34.85 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> replace into `t2`(`a`) select `a` from `seed`; #fast Query OK, 10015 rows affected (0.05 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> mysql> select version(); +-----------+ | version() | +-----------+ | 8.0.18 | +-----------+ 1 row in set (0.00 sec) ------------ Both servers run with: ./bin/mysqld --no-defaults --basedir=. --datadir=./data \ --innodb-flush-log-at-trx-commit=0 --innodb-buffer-pool-size=4G \ --innodb-buffer-pool-instances=3 --core-file --innodb-lock-wait-timeout=1 --lock-wait-timeout=1 --innodb_flush_method=O_DIRECT_NO_FSYNC \ --innodb-purge-threads=4 --innodb_flush_neighbors=0 --skip-log-bin
[21 Jul 2020 7:31]
MySQL Verification Team
8.0.21: ---------- mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 22.34 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> replace into `t2`(`a`) select `a` from `seed`; #fast Query OK, 10015 rows affected (0.03 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> mysql> select version(); +-------------------+ | version() | +-------------------+ | 8.0.21-commercial | +-------------------+ 1 row in set (0.00 sec) ================= I also tried: alter instance disable INNODB REDO_LOG; not much difference: mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 19.40 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> replace into `t2`(`a`) select `a` from `seed`; #fast Query OK, 10015 rows affected (0.03 sec) Records: 5008 Duplicates: 5007 Warnings: 0
[30 May 2024 12:20]
MySQL Verification Team
Time to test the recent 8.4.0 release: Server on Windows run with: --innodb-buffer-pool-size=4G --skip-log-bin --innodb-flush-log-at-trx-commit=0 mysql> create table `t1`(`id` int auto_increment primary key,`a` tinyint,unique key(`a`))engine=innodb; Query OK, 0 rows affected (0.02 sec) mysql> create table `t2`(`id` int auto_increment primary key,`a` tinyint,unique key(`a`))engine=myisam; Query OK, 0 rows affected (0.01 sec) mysql> create table `seed` (id int auto_increment primary key,a tinyint)engine=myisam; Query OK, 0 rows affected (0.01 sec) mysql> mysql> insert into `seed`(`a`) values (7),(7),(7),(7),(7),(7),(7),(7); Query OK, 8 rows affected (0.01 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql> insert into `seed`(`a`) select `seed`.`a` from `seed`, `seed` `t2`, `seed` `t3`,`seed` `t4`,`seed` `t5` limit 5000; Query OK, 5000 rows affected (0.12 sec) Records: 5000 Duplicates: 0 Warnings: 0 mysql> mysql> replace into `t1`(`a`) select `a` from `seed`; #slow Query OK, 10015 rows affected (1 min 39.50 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> replace into `t2`(`a`) select `a` from `seed`; #fast Query OK, 10015 rows affected (0.10 sec) Records: 5008 Duplicates: 5007 Warnings: 0 mysql> mysql> select version(); +------------------+ | version() | +------------------+ | 8.4.0-commercial | +------------------+ 1 row in set (0.00 sec)