Bug #60886 ALTER IGNORE TABLE ADD PRIMARY KEY fails on duplicate key values
Submitted: 15 Apr 2011 20:56 Modified: 16 Apr 2011 19:35
Reporter: tom worster Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.5.11 OS:MacOS (10.7.0)
Assigned to: CPU Architecture:Any

[15 Apr 2011 20:56] tom worster
Description:
With InnoDB, ALTER IGNORE TABLE ADD PRIMARY KEY (c) fails if column c contains a duplicate value and the column is defined as NOT NULL.

According to documentation, rows with duplicate values should be deleted when IGNORE is used and no error should be issued. 

How to repeat:
CREATE TABLE t (x INT NOT NULL, y INT) ENGINE=INNODB;
INSERT INTO t VALUES (1,1),(2,2),(2,3),(4,4);
ALTER IGNORE TABLE t ADD PRIMARY KEY (x);

Expected response:
4 row(s) affected
Records: 4  Duplicates: 1  Warnings: 0

Actual response:
Error Code: 1062
Duplicate entry '2' for key 'PRIMARY'

Notes: I get expected response if I remove 'NOT NULL' from column x definition. I also get expected response with ENGINE=MyISAM.

Suggested fix:
I don't know what to suggest. Sorry.
[16 Apr 2011 2:39] MySQL Verification Team
this looks like bug #40344 to me?
[16 Apr 2011 19:35] tom worster
Hi Shane Bester,

The underlying bug may well be the same but the differences in the test cases may be worth incorporating into 40344. I tried 4 tests:

This works as expected:
  DROP TABLE IF EXISTS t;
  CREATE TABLE t (x INT, y INT) ENGINE=INNODB;
  INSERT INTO t VALUES (1,1),(2,2),(2,3),(4,4);
  ALTER IGNORE TABLE t ADD PRIMARY KEY (x);

But all of the following 3 fail
  DROP TABLE IF EXISTS t;
  CREATE TABLE t (x INT NOT NULL, y INT) ENGINE=INNODB;
  INSERT INTO t VALUES (1,1),(2,2),(2,3),(4,4);
  ALTER IGNORE TABLE t ADD PRIMARY KEY (x);

  DROP TABLE IF EXISTS t;
  CREATE TABLE t (x INT, y INT) ENGINE=INNODB;
  INSERT INTO t VALUES (1,1),(2,2),(2,3),(4,4);
  ALTER IGNORE TABLE t ADD UNIQUE KEY (x);

  DROP TABLE IF EXISTS t;
  CREATE TABLE t (x INT NOT NULL, y INT) ENGINE=INNODB;
  INSERT INTO t VALUES (1,1),(2,2),(2,3),(4,4);
  ALTER IGNORE TABLE t ADD UNIQUE KEY (x);
[21 Sep 2011 22:38] Trey Raymond
Interesting behavior.  Note in test two, when I add a condition that innodb won't like, it somehow sends a warning and then processes the ignore, deleting rows.

mysql> show create table base;
+-------+--------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                 
           |
+-------+--------------------------------------------------------------------------------------------------------------------------+
| base  | CREATE TABLE `base` (
  `R` varchar(255) NOT NULL,
  `datetime` datetime NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select * from base;
+------+---------------------+
| R    | datetime            |
+------+---------------------+
| asdf | 2011-09-06 12:27:50 |
| asdf | 2011-09-13 12:28:00 |
+------+---------------------+
2 rows in set (0.00 sec)

Test 1: basic alter ignore.  Not working here due to duplicate key error.  This is likely a mysql bug, I can't think of
any other reason.

mysql> create table tmp like base;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into tmp select * from base;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> alter ignore table tmp
    ->  add primary key (R);
ERROR 1062 (23000): Duplicate entry 'asdf' for key 'PRIMARY'
mysql> select * from tmp;
+------+---------------------+
| R    | datetime            |
+------+---------------------+
| asdf | 2011-09-06 12:27:50 |
| asdf | 2011-09-13 12:28:00 |
+------+---------------------+
2 rows in set (0.00 sec)

mysql> drop table tmp;
Query OK, 0 rows affected (0.64 sec)

Test 2: combine adding pk with order by statement.  This ignored the order by as it happened after the row filter for
the add primary key due to the order of statements, and therefore didn't achieve desired outcome.

mysql> create table tmp like base;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into tmp select * from base;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> alter ignore table tmp
    ->  add primary key (R),
    ->  order by datetime desc;
Query OK, 2 rows affected, 1 warning (0.64 sec)
Records: 2  Duplicates: 1  Warnings: 0

mysql> show warnings;
+---------+------+--------------------------------------------------------------------------------+
| Level   | Code | Message                                                                        |
+---------+------+--------------------------------------------------------------------------------+
| Warning | 1105 | ORDER BY ignored as there is a user-defined clustered index in the table 'tmp' |
+---------+------+--------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> select * from tmp;
+------+---------------------+
| R    | datetime            |
+------+---------------------+
| asdf | 2011-09-06 12:27:50 |
+------+---------------------+
1 row in set (0.00 sec)

mysql> drop table tmp;
Query OK, 0 rows affected (0.63 sec)

Test 3: The reverse of test 2, doing the order by first before putting them into the pk, so the filter keeps things in
the desired order.  This failed with a syntax error, I have no idea why.

mysql> create table tmp like base;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into tmp select * from base;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> alter ignore table tmp
    ->  order by datetime desc,
    ->  add primary key (R);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server
version for the right syntax to use near 'add primary key (R)' at line 3
mysql> select * from tmp;
+------+---------------------+
| R    | datetime            |
+------+---------------------+
| asdf | 2011-09-06 12:27:50 |
| asdf | 2011-09-13 12:28:00 |
+------+---------------------+
2 rows in set (0.00 sec)

mysql> drop table tmp;
Query OK, 0 rows affected (0.64 sec)

Test 4: This is doing the order by first and then adding the key.  This is undesirable as it is in two statements, but
if done at night on this not heavily written table we should be okay.  However, it encounters the same bug as test 1,
where for some reason it ignores my ignore and produces a duplicate key error.

mysql> create table tmp like base;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into tmp select * from base;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> alter ignore table tmp
    ->  order by datetime desc;
Query OK, 2 rows affected (0.64 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> alter ignore table tmp
    ->  add primary key (R);
ERROR 1062 (23000): Duplicate entry 'asdf' for key 'PRIMARY'
mysql> select * from tmp;
+------+---------------------+
| R    | datetime            |
+------+---------------------+
| asdf | 2011-09-13 12:28:00 |
| asdf | 2011-09-06 12:27:50 |
+------+---------------------+
2 rows in set (0.00 sec)

mysql> drop table tmp;
Query OK, 0 rows affected (0.64 sec)