Bug #25544 Multiple column unique key allows multiple rows?
Submitted: 11 Jan 2007 14:25 Modified: 24 Feb 2014 9:37
Reporter: Andre Timmer Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.18 OS:Any
Assigned to: Sveta Smirnova CPU Architecture:Any
Tags: unique key

[11 Jan 2007 14:25] Andre Timmer
Description:
Multiple column unique key allows multiple rows?

If result in 'How to repeat' is alright this means that unique key constraints on more than one column are dangerous.

How to repeat:
CREATE TABLE `test` (
  `id` int(11) NOT NULL auto_increment,
  `email` varchar(200) collate utf8_bin default NULL,
  `url` varchar(200) collate utf8_bin default NULL,
  `url_method` enum('GET','POST') collate utf8_bin default NULL,
  `converter` varchar(255) collate utf8_bin default NULL,
  `description` varchar(80) collate utf8_bin default NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `service_uk` (`email`,`url_method`,`url`,`converter`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

insert into test (email, url_method, url, converter) values  ('test@bkwi.net', NULL  , NULL, NULL);
insert into test (email, url_method, url, converter) values  ('test@bkwi.net', NULL  , NULL, 'email.xslt');
insert into test (email, url_method, url, converter) values  (NULL           , 'POST', 'http://www.dummy.az', NULL);

-- duplicate row, violatesi unique key service_uk
insert into test (email, url_method, url, converter) values  (NULL           , 'POST', 'http://www.dummy.az', NULL);

commit;

select `email`,`url_method`,`url`,`converter`, count(*) 
from test group by 1,2,3,4
having count(*) > 1;

+-------+------------+---------------------+-----------+----------+
| email | url_method | url                 | converter | count(*) |
+-------+------------+---------------------+-----------+----------+
| NULL  | POST       | http://www.dummy.az | NULL      |        2 |
+-------+------------+---------------------+-----------+----------+

Suggested fix:
Fix.
[11 Jan 2007 14:42] Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

According to http://dev.mysql.com/doc/refman/5.0/en/create-index.html:

A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does not apply to NULL  values except for the BDB storage engine. For other engines, a UNIQUE index allows multiple NULL values for columns that can contain NULL.
[11 Jan 2007 15:03] Andre Timmer
You write 'This constraint does not apply to NULL  values except.'

The 'duplicate' row mentioned also contains NOT null values!
Are you sure this is not a bug?
[11 Jan 2007 15:27] Andre Timmer
See last comment.
[11 Jan 2007 15:31] Alexander Keremidarski
Hello,

As Sveta explained and the manual states UNIQUE indexes allow multiple NULL values. This means all NULLs ar considered distinct "values".

Since both 

A. NULL = NULL -> False
B. NULL != NULL -> False

it is possible to interpret NULLs in an index two ways. 

In PRIMARY KEY NULLs are not allowed so we can say that for PRIMARY KEY MySQL treats NULLs as equal while in UNIQUE keys NULLs are considered distinct.
 
Such behaviour is conscious design decision and therefore not bug. This way MySQL allows one to choose between PRIMARY KEY and UNIQUE KEY depending on application needs. 

It doesn't worth debating whether it is theoretically correct or wrong because the RDBMS theory is already violated by allowing secondary unique keys.

Consider following example:

mysql> CREATE TABLE u (id INT NULL, UNIQUE KEY(id));
Query OK, 0 rows affected (0.03 sec)

mysql> INSERT INTO u VALUES (NULL), (NULL), (NULL);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

# Note there is no error nor warning above

mysql> SELECT DISTINCT id FROM u;
+------+
| id   |
+------+
| NULL |
| NULL |
| NULL |
+------+
3 rows in set (0.00 sec)
[11 Jan 2007 15:33] Alexander Keremidarski
By the way thak you a lot for the excellent bug report although we may not find consensus whether it is a bug or not.
[11 Jan 2007 16:01] Andre Timmer
Hopefully you don't find me anonying yet.

For a unique key on one column this is normal behaviour.

But if null values are treated in a distinct way in a unique key then the original example which uses MULTIPLE columns this should not be allowed??
[11 Jan 2007 16:21] Sveta Smirnova
>But if null values are treated in a distinct way in a unique key then the
>original example which uses MULTIPLE columns this should not be allowed??

Why do you think so?
[11 Jan 2007 17:17] Heikki Tuuri
Andre,

http://dev.mysql.com/doc/refman/5.0/en/create-table.html

"
A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if you try to add a new row with a key value that matches an existing row. This constraint does not apply to NULL values except for the BDB storage engine. For other engines, a UNIQUE index allows multiple NULL values for columns that can contain NULL. 
"

This is as specified in the manual.

Regards,

Heikki
[12 Jan 2007 10:17] Andre Timmer
Answer to question 'Why do you think so?'

=== Tested against ===
SQL*Plus: Release 8.1.7.0.0 - Production on Fri Jan 12 10:58:20 2007

(c) Copyright 2000 Oracle Corporation.  All rights reserved.

=== Test I ===
create table test (col1 varchar2(10) null);
alter table test add constraint test_uk unique (col1);

insert into test (col1) values (null);
insert into test (col1) values (null);
insert into test (col1) values ('Amsterdam');
commit;

--> OK, MySQL & Oracle have the same behaviour

=== Test II ===
create table test2 (col1 varchar2(10) null, col2 varchar2(10) null);
alter table test2 add constraint test2_uk unique (col1, col2);

insert into test2 (col1, col2) values (null, null);
insert into test2 (col1, col2) values (null, null);
insert into test2 (col1, col2) values ('Amsterdam', null);
insert into test2 (col1, col2) values ('New York' , null);
commit;

SQL> select * from test2;

COL1       COL2
---------- ----------

Amsterdam
New York

insert into test2 (col1, col2) values ('New York' , null);

SQL> insert into test2 (col1, col2) values ('New York' , null);
insert into test2 (col1, col2) values ('New York' , null)
*
ERROR at line 1:
ORA-00001: unique constraint (SA.TEST2_UK) violated

--> NOT OK / difference, MySQL & Oracle have a different behaviour

So either Oracle or MySQL doesn't behave correctly in this last test or this is a grey area in the SQL specs and both situations are allowed.
[15 Jan 2007 11:12] Sveta Smirnova
Below is quote from the draft of the INTERNATIONAL STANDARD of SQL 2003:

4.6.6.3 Table constraints 
A table constraint is an integrity constraint associated with a single base table. 
A table constraint is either a unique constraint, a primary key constraint, a referential constraint, or a check constraint. 
A unique constraint specifies one or more columns of the table as unique columns. A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns. 

As you see, MySQL documented behaviour is correct.
[15 Jan 2007 11:42] Andre Timmer
That's just the point MySQL doesn't meet "A unique constraint is satisfied if and only if no two rows in a table
have the same non-null values in the unique columns."

See the beginning of the call with the piece of output below:
+-------+------------+---------------------+-----------+----------+
| email | url_method | url                 | converter | count(*) |
+-------+------------+---------------------+-----------+----------+
| NULL  | POST       | http://www.dummy.az | NULL      |        2 |
+-------+------------+---------------------+-----------+----------+

Please read what's written not what you think i've written!!!!!!!!!!
(this call is about a unique key constraint with MORE THAN 1 column)
[25 Jan 2007 19:55] Sveta Smirnova
I have again checked the ANSI/ISO standard document and didn't find any rule to change NULL handling if such values are in UNIQUE column.
[7 Aug 2009 12:15] Jan Bols
Hi, 
regardless of what the exact interpretation of the SQL specs is, the use of a unique constraint spanning multiple columns is to enforce unique values.

So keeping that in mind, it would be incorrect to allow two rows like the following (consider there's a unique constraint on col1 and col2).

+----+------+------+
| id | col1 | col2 |
+----+------+------+
| 1  | NULL | val1 |
| 2  | NULL | val1 |
+----+------+------+

Both rows have the same non-null values (val1) and still they are considered as different in mysql.

In practice this makes a unique constraint on multiple columns completely useless once any of those columns is allowed to have null values. Nevertheless there are still a lot of valid use cases for these kind of situations.

That's why I would still consider this a bug.
[10 Aug 2009 3:37] Jonathan Wilson
I agree with Jan. 
This is effectively making the unique constraint useless for many applications. Namely an application where a column represents an optional (nullable) value.
The only workarounds I can think of are:
 1. Put a dummy value in the columns instead of null (and make it non nullable). Or reference a dummy "empty" entry if column is a foreign key to another table. 
 2. Manual checks prior to insertion.
Neither are great options.

I can see it depends a bit how the SQL spec is interpreted.
This issue has caught out quite a few people I know - so its behaviour is definately not intuitive anyway.
[22 Oct 2009 21:58] Oldřich Jedlička
I have use case for the implemented behaviour. I have two tables (page, page_history), the page_history has some columns (page_id refers to page table, revision is an integer). I want to create the page_history first for the first revision and then the page content. This means I can have more rows with page_id=NULL and revision=1 at the same time (different uncreated pages), but still when I create the real "page" and set the page_id, the combination page_id-revision has to be unique.

This is expected behaviour to me, because NULL values are not equal (even in unique constraints).
[7 Dec 2009 19:22] Peter Gulutzan
See also Bug#17825.
[16 Mar 2010 19:31] Kyle Pinette
This is absolutely ridiculous.  There should at least be a configuration option for this problem.  I understand that the spec says a unique constraint should work in this way, but it doesn't make sense in most cases.  I have duplicate rows in my database because of this problem.  I cannot use any value other than NULL due to the fact that I am using Hiberante.  If I use a value that is non-null that doesn't reference an actual key, it throws an exception.

Also, if this is the official stance of MySQL, your documentation contradicts what you've said.

"When using DISTINCT, GROUP BY, or ORDER BY, all NULL values are regarded as equal."

Taken from http://dev.mysql.com/doc/refman/5.0/en/problems-with-null.html.

So, in one place MySQL regards NULL values as equal, in another they are regarded as distinct...does that make sense?
[26 Apr 2011 13:22] Karsten Wutzke
+1 for the comments made, especially the previous one. It's really inconsistent to the bone. PLEASE FIX.
[28 Jun 2011 11:09] [ name withheld ]
+1 

I also think this behaviour is quite ridiculous and limits the possibilities of working with UNIQUE indexes.

A fix or at least a configurable setting would be very appreciated.
[28 Jun 2011 11:43] Andre Timmer
On "[12 Jan 2007 11:17]" i commented that the interpretation of Oracle database is different (read: as expected). This didn't make an impression.

Anyone knows how SQL Server, PostgreSQL or e.g. SQLite handles this?
[30 Jun 2011 7:59] Bernie van't Hof
I've recently fallen foul of the issue too and wasted quite some time coming to grips with the current implementation.

The spec is said to say "no two rows in a table have the same non-null values in the unique columns".

For mine this clearly means the logic deciding if a row is unique, or not, shall examine only the fields containing NON-NULL values. I can see no way it could be interpreted differently. And imho the spec makes perfect sense.

A unique constraint is a different animal to a primary key, and they have different reasons for existing.

+1 to get this addressed and fixed ASAP. Workaround logic is unnecessarily ugly.

One consideration will be that changing the status quo will impact many users, but is that true? How many users would be negatively impacted by a fix, compared to the vast majority that would benefit?
[22 May 2013 17:00] David Horvath
MySQL's behavior is correct in most cases. For example, if you have a unique key with two columns, the first column can be interpreted as a "namespace". The second column must be unique in any namespaces. Multiple NULL values in a namespace is allowed, remember back to a single column rule. This is valid:

namespace | name
----------+------
ns1       | NULL
ns1       | NULL
ns1       | test
ns2       | test

However... Rule of incomparable NULL-values also means: no comparison can be applied with rows with at least one NULL value, so rows with NULL value(s) are always allowed. This is not very good. I think, this would be incorrect:

namespace | name
----------+------
NULL      | NULL
NULL      | test
NULL      | test
ns1       | test
ns2       | test

In this table are two non-namespaced element with the same name.

My idea: a left-hand rule for comparison. NULL values at the right side in the key part of a row make the row unique in all cases, but if the last key field is non-NULL, row would be comparable.

Please neglect my imperfect English.
[21 Feb 2014 18:57] Yuriy Rudenko
In case anyone is still looking for a workaround, this is what I did:
Say you have a table my_tbl with columns val1,val2,val3 that should be unique but can be NULL.
First you need to create a unique ID column (PRIMARY) if you don't already have one (which you should anyway).
Then create a BEFORE INSERT trigger on the table like so:

CREATE TRIGGER `my_tbl_BINS` BEFORE INSERT ON `my_tbl` FOR EACH ROW
BEGIN
SET NEW.id=(SELECT id FROM my_tbl WHERE val1<=>NEW.val1 AND val2<=>NEW.val2 AND val3<=>NEW.val3);
END

Example using a UNIQUE KEY:

mysql> CREATE TABLE my_tbl (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,val1 INT,val2 INT, val3 INT); Query OK, 0 rows affected (0.06 sec)
mysql> ALTER TABLE my_tbl ADD CONSTRAINT my_key UNIQUE (val1,val2,val3); Query OK, 0 rows affected (0.05 sec) Records: 0  Duplicates: 0  Warnings: 0
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (NULL,NULL,NULL); Query OK, 1 row affected (0.02 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,NULL,NULL);    Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,NULL);       Query OK, 1 row affected (0.02 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,3);          Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,3);          ERROR 1062 (23000): Duplicate entry '1-2-3' for key 'my_key'
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,NULL);       Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,NULL,NULL);    Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (NULL,NULL,NULL); Query OK, 1 row affected (0.01 sec)
mysql> SELECT * FROM my_tbl ORDER BY id ASC;
+----+------+------+------+
| id | val1 | val2 | val3 |
+----+------+------+------+
|  1 | NULL | NULL | NULL |
|  2 |    1 | NULL | NULL |
|  3 |    1 |    2 | NULL |
|  4 |    1 |    2 |    3 |
|  6 |    1 |    2 | NULL |
|  7 |    1 | NULL | NULL |
|  8 | NULL | NULL | NULL |
+----+------+------+------+
7 rows in set (0.00 sec)

Now with workaround trigger instead of unique key:

CREATE TABLE my_tbl (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,val1 INT,val2 INT, val3 INT); Query OK, 0 rows affected (0.03 sec)
mysql> DELIMITER $$
mysql> CREATE TRIGGER `my_tbl_BINS` BEFORE INSERT ON `my_tbl` FOR EACH ROW
    -> BEGIN
    -> SET NEW.id=(SELECT id FROM my_tbl WHERE val1<=>NEW.val1 AND val2<=>NEW.val2 AND val3<=>NEW.val3);
    -> END$$
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER ;
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (NULL,NULL,NULL); Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,NULL,NULL);    Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,NULL);       Query OK, 1 row affected (0.02 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,3);          Query OK, 1 row affected (0.02 sec)
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,3);          ERROR 1062 (23000): Duplicate entry '4' for key 'PRIMARY'
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,2,NULL);       ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY'
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (1,NULL,NULL);    ERROR 1062 (23000): Duplicate entry '2' for key 'PRIMARY'
mysql> INSERT INTO my_tbl (val1,val2,val3) VALUES (NULL,NULL,NULL); ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'
mysql> SELECT * FROM my_tbl; 
+----+------+------+------+
| id | val1 | val2 | val3 |
+----+------+------+------+
|  1 | NULL | NULL | NULL |
|  2 |    1 | NULL | NULL |
|  3 |    1 |    2 | NULL |
|  4 |    1 |    2 |    3 |
+----+------+------+------+
4 rows in set (0.00 sec)
[24 Feb 2014 9:37] Andre Timmer
Clever workaround/solution. You also made me aware of the <=> operator, didn't now it existed!
[24 Feb 2014 17:28] Yuriy Rudenko
I should have mentioned that this workaround is only possible due to the <=> operator, which I also found only recently. This is a null-safe comparison operator that treats NULLs the same, which directly contradicts MySQLs' stance on NULL handling in unique keys.

There should at least be an option to have a NULL-safe unique key, as others have mentioned in various bug reports.

Other bug reports for this issue:
http://bugs.mysql.com/bug.php?id=4140
http://bugs.mysql.com/bug.php?id=5685
http://bugs.mysql.com/bug.php?id=8173
http://bugs.mysql.com/bug.php?id=27019