Bug #98600 Optimize table fails with duplicate entry on UNIQUE KEY
Submitted: 14 Feb 2020 11:48 Modified: 14 Feb 2020 12:23
Reporter: Jericho Rivera Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.7,8.0, 5.7.29, 8.0.19 OS:Any
Assigned to: CPU Architecture:Any

[14 Feb 2020 11:48] Jericho Rivera
Description:
An optimize table or its equivalent ALTER TABLE can fail with duplicate entry on a UNIQUE KEY.
+--------------+----------+----------+-------------------------------------------------------------------+
| Table        | Op       | Msg_type | Msg_text                                                          |
+--------------+----------+----------+-------------------------------------------------------------------+
| test.testddl | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.testddl | optimize | error    | Duplicate entry '100' for key 'testddl.name'                      |
| test.testddl | optimize | status   | Operation failed                                                  |
+--------------+----------+----------+-------------------------------------------------------------------+

This is easily reproducible following the how-to-repeat procedure. Note that the dowhile() procedure is ongoing or can be ran for a short while then execute the insert on T2 even if it is continuously failing with Duplicate entry. If T2 loop is cancelled the optimize table (T3) will not fail in succeeding loops.

Failures can happen more with 5.7 but still can be reproduced in 8.0 latest as well.

We're aware of online DDL limitations but the ease of reproducing the optimize table failure looks like a bug.

How to repeat:
Terminal 1:
CREATE TABLE `testddl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) DEFAULT NULL,
`empid` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
Unique key(name)
) ENGINE=InnoDB;

delimiter //
CREATE PROCEDURE dowhile()
BEGIN
DECLARE i INT DEFAULT 1;
WHILE (i <= 2000000) DO
INSERT INTO `testddl` (id, name, empid) values (i, i, i);
SET i = i+1;
END WHILE;
END
//
delimiter ;

call dowhile();

Terminal 2:
while true; do mysql -uroot -e "insert into test.testddl (name,empid) values (100,100)"; sleep 1; done;

Terminal 3:
while true; do mysql -uroot -e "optimize table test.testddl"; sleep 1; done
[14 Feb 2020 12:23] MySQL Verification Team
Hello Jericho,

Thank you for the report and test case.
Verified as described with 8.0.19 build.

regards,
Umesh
[2 Mar 2020 21:01] Sveta Smirnova
See also analysis at https://jira.percona.com/browse/PS-6858?focusedCommentId=250975&page=com.atlassian.jira.pl...
[16 May 2021 18:03] Kim Miesse
This affects us also. Is the only work around to drop the unique key during the OPTIMIZE?
[12 Aug 2021 22:14] Jean-François Gagné
This is probably related to Bug#76895, which was closed as not a bug, but which I do not agree is not a bug.

From [1], we can read:

> For InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE ... FORCE, which rebuilds the table to update index statistics and free unused space in the clustered index.

[1]: https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html

And from [2], we can read:

> Running ALTER TABLE tbl_name ENGINE=INNODB on an existing InnoDB table performs a “null” ALTER TABLE operation, which can be used to defragment an InnoDB table, as described in Section 15.11.4, “Defragmenting a Table”. Running ALTER TABLE tbl_name FORCE on an InnoDB table performs the same function.

> ALTER TABLE tbl_name ENGINE=INNODB and ALTER TABLE tbl_name FORCE use online DDL. 

[2]: https://dev.mysql.com/doc/refman/8.0/en/alter-table.html

And from [3], we can read:

> When running an in-place online DDL operation, the thread that runs the ALTER TABLE statement applies an online log of DML operations that were run concurrently on the same table from other connection threads. When the DML operations are applied, it is possible to encounter a duplicate key entry error (ERROR 1062 (23000): Duplicate entry), even if the duplicate entry is only temporary and would be reverted by a later entry in the online log. This is similar to the idea of a foreign key constraint check in InnoDB in which constraints must hold during a transaction.

[3]: https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-limitations.html

A more simple reproduction test-case is below.  Using "ALTER TABLE t ENGINE=InnoDB" instead of "OPTIMIZE TABLE t" also reproduces the problem.

A workaround can be to run "SET old_alter_table = ON" before the "OPTIMIZE TABLE", but then the OPTIMIZE TABLE will BE BLOCKING.  For a non-blocking solution, you can use online schema change from the Percona Toolkit (pt-osc) or GitHub's online schema migration (Gh'ost).

#########################
# Reproduction test-case.

# Create a sandbox:
dbdeployer deploy single mysql_8.0.26

# Initialize the test environment:
./use <<< "
  CREATE DATABASE test_jfg;
  CREATE TABLE test_jfg.t(
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    v INT, v2 INT,
    UNIQUE KEY(v))"
 
# Make rows big to be able to quickly generate a big table (each row is close to 8 KB):
seq -f "ALTER TABLE t add column c%02.0f CHAR(255) DEFAULT '';" 1 30 | ./use test_jfg

# Make commit fast so loading is not too long:
./use <<< "set global sync_binlog = 0, global innodb_flush_log_at_trx_commit = 0"

# Fill the table with about 1 GB of data:
seq -f "INSERT INTO t (v) VALUES (%0.0f);" 1 $((1024/8*1024)) | ./use test_jfg

( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)"; echo "INSERT $(date)"; ) &   ./use test_jfg <<< "OPTIMIZE TABLE t"; echo "ALTER $(date)"
[1] 6841
ERROR 1062 (23000) at line 1: Duplicate entry '1' for key 't.v'
INSERT Thu Aug 12 21:58:27 UTC 2021
Table   Op      Msg_type        Msg_text
test_jfg.t      optimize        note    Table does not support optimize, doing recreate + analyze instead
test_jfg.t      optimize        error   Duplicate entry '1' for key 't.v'
test_jfg.t      optimize        status  Operation failed
[1]+  Done                    ( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)"; echo "INSERT $(date)" )
ALTER Thu Aug 12 22:00:00 UTC 2021
[12 Aug 2021 22:32] Jean-François Gagné
Related: Bug#104608.
[15 Aug 2021 17:57] Jean-François Gagné
related: Bug#104626.
[4 Jul 22:36] MySQL Verification Team
related: https://bugs.mysql.com/bug.php?id=115511