Bug #17825 Request: UNIQUE index which treats NULL as a unique value
Submitted: 1 Mar 2006 14:15 Modified: 17 Dec 2010 16:58
Reporter: Erland Lewin Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DDL Severity:S4 (Feature request)
Version: OS:Any
Assigned to:
Triage: Needs Triage: D5 (Feature request)

[1 Mar 2006 14:15] Erland Lewin
Description:
Currently, a UNIQUE index with columns which allow NULL values will allow any number of rows with NULL values.

See bug #8173, and the user comment on "13.1.4 CREATE INDEX" syntax in the manual.

There are work-arounds, such as adding an extra field which is set when the value is NULL, but that is ugly and requires code to maintain.

I would like to request that an index type or attribute which does not allow multiple identical rows with NULL values be added. 

I'm sure wanting this behaviour is not uncommon.

How to repeat:
See bug #8173
[1 Mar 2006 14:43] Valerii Kravchuk
Thank you for a feature request. I am not sure that this feature should be impelemented, though. All NULL values should be treated as different ones (we don't know for sure). Need a quote from SQL 2003 Standard?
[26 May 2006 10:34] Tony Mountifield
How about a quote from the long-standing SQL-92 standard?

"A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns. In addition, if the unique constraint was defined with PRIMARY KEY, then it requires that none of the values in the specified column or columns be the null value."

That suggests that the current MySQL behaviour is wrong and needs to be fixed (In MySQL 4 too).
[26 May 2006 10:46] Tony Mountifield
So you should be able to have any number of rows where all the columns in a
unique index are null, but only one row with a given combination of non-null
columns.

create table tb (
  a int,
  b int,
  c int,
  unique index (a,b,c)
);

insert into tb(a,b,c) values (null,null,null); -- ok
insert into tb(a,b,c) values (null,null,null); -- still ok
insert into tb(a,b,c) values (null,null,null); -- still ok
insert into tb(a,b,c) values (1,null,null); -- ok
insert into tb(a,b,c) values (1,2,null); -- ok
insert into tb(a,b,c) values (1,2,3); -- ok
insert into tb(a,b,c) values (1,null,null); -- SHOULD FAIL, BUT DOESN'T
insert into tb(a,b,c) values (1,2,null); -- SHOULD FAIL, BUT DOESN'T
insert into tb(a,b,c) values (1,2,3); -- fails correctly
[22 Jun 2006 9:12] Sergei Golubchik
see bug#8173 for the answer
[5 Dec 2006 17:00] Darren Powers
I have also had this issue and think it would be great if there was at least the option to create a unique key that would cater for this.
[7 May 2008 11:53] Abhilesh Khatri
May be there is a typo error, that is being misunderstood by different people. 

How I perceive this issue is like MySQL is standing out as an exception. Where as all other top vendors like Oracle, SQL Server, IBM etc... do what we expect i.e. Give back an error if a specific index columns has the null value.
[7 Jul 2009 20:34] Timothy Burrington
I ran into this behavior the other day and would also like to express my interest in this being made into at least a configuration option.
  I've checked the same behavior against Sybase ASE 12.5/15.0, IBM DB2 v9, and Oracle 11g and all of them throw errors under the conditions described above.  This does set MySQL apart from the rest in a negative way (you tend to get 'bit' by this behavior).  With a configuration option at least you could say that it meets the standard but can be changed to behave as everybody expects it to.
[7 Dec 2009 19:19] Peter Gulutzan
We have three bug reports, marked "not a bug".
Bug#8173  unique index allows duplicates with null values
Bug#16219 Unique index not unique when containing NULL columns
Bug#25544 Multiple column unique key allows multiple rows?

We have one feature request, marked "won't fix".
Bug#17825 Request: UNIQUE index which treats NULL as a unique value

They're not exact duplicates, but could be tracked together.
[8 Dec 2009 8:26] Valter Kungla
OK I see from that bug (http://bugs.mysql.com/bug.php?id=25544) that because in MySQL NULL != NULL (SELECT NULL = NULL   return NULL)

But is this in SQL standard or is there any other reason why null isn't null. IN programming langages null == null gives me true. Why is this behavior different in database?
[21 Dec 2009 10:49] Yann-Gael GAUTHERON
I agree with everybody knows other databases, it's a shame for MySQL.
It's the first negative point ever. "Won't fix" is very bad, i can't understand such answer on a problem like that.

Going to check if somebody already fixed that with a diff source. If not, i'll fix it by myself and recompile my gentoo builds and share it, instead of MySQL AB.

Thanks you for all.
[25 Jan 2010 21:46] Peter Gulutzan
> But is this in SQL standard or is there any other reason
> why null isn't null. IN programming langages null == null gives me true.
> Why is this behavior different in database?

Although the standard doesn't specify for indexes, it does specify for
uniqueness, and MySQL is following the standard.

In other languages, NULL means something totally different,
don't be confused by the name.
[27 Jan 2010 16:30] Brandon Kearby
When a unique key constraint contains a nullable column, the current implementation makes the unique key useless. It would be nice if MySQL implemented it like the rest of the guys.
[15 Dec 2010 19:12] Mike Mosiewicz
I have just found the discussion on this bug. I just wanted to support people stating it's a bug. The interpretation of SQL 2003 standard is rudiculous. It strictly says that non-null column must be different. 

Where is this voting button?
[17 Dec 2010 16:51] Markus Bäurle
Vote +1
[21 Feb 2011 0:24] Christian Fruth
Vote +1
[21 Feb 2011 7:43] Valter Kungla
Vote +1

Have been waiting for this over a year.
Please add this feature - it makes develpers life much easyer
[21 Feb 2011 17:06] chris Lininger
I also believe this must be a bug.  My interpretation of "no two rows in a table have the same non-null values in the unique columns" is different from the reading proposed earlier.

If I have a row that has *any* non-null value in the index, that non-null value must be checked for uniqueness, even is all the other index columns are null.

So if I have an existing record indexed as (1, null, null) in a table, then to insert the same (1, null, null) row must fail.  The index column which is non-null (i.e. the first column) collides with an existing row, and there is no further data to distinguish it.

However, if I attempt to insert (1, 2, null), this should succeed because I now have an additional non-null column which makes this record unique.

The only combination which must allow multiple entries is (null, null, null).
[12 Jun 2011 6:02] Lisias Toledo
Vote +1

Being the only guy in the block doing different is not a wise move. Makes migration to MySql a bit less straightfoward.

And my life as a data modeler a bit more bitter.
[21 Jun 2012 12:54] Abdullah Battal
This feature request is 6 years old??? wow. I don't believe mysql ever fixes this. But still... vote +1
[22 Jun 2012 8:29] Valter Kungla
PLEASE, CAN SOMEONE NOTICE US! WE REALLY NEED IT!
[30 Jul 2012 14:14] Florian Bantner
Yea right. Everyone else who is making an unique index an unique index is holding ... aeh ... doing it wrong. +1
[28 Aug 2012 14:30] Roman Panov
Vote +1
[9 Oct 2012 21:40] Zilvinas Bartkus
Vote +1
[5 Feb 2013 16:28] Dan Straw
Vote +1
[24 Oct 2013 12:17] Umesh Shastry
Bug #70718 marked as duplicate of this one.
[21 Feb 2014 18:07] Yuriy Rudenko
Vote +1
[20 Feb 2015 10:31] Rolf Neuberger
The absence of a "UNIQUE_SERIOSLY" index type throws a wrench into self-referential tree tables.
Consider a table structure like this:

 CREATE TABLE `Node` (
  `nodeId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `parentNodeId` INT UNSIGNED DEFAULT NULL, # NULL means no parent <=> root node
  `name` varchar(63) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`nodeId`),
  UNIQUE INDEX (`parentNodeId`, `name`),    # <= this right here (explanation below)
  CONSTRAINT `fk_Node_Node_parentNodeId_nodeId` FOREIGN KEY (`parentNodeId`) REFERENCES `Node` (`nodeId`) ON DELETE CASCADE ON UPDATE CASCADE,
) ENGINE=InnoDB;

What we have here is a working single-table tree storage where the foreign key (recursively!) ensures that deleting a parent node will also
delete all children. This part is working perfectly fine in MySQL 5.6 at least (self-referential FKs might be a relatively recent addition, not sure).

The rule we're trying to enforce with the compound unique index is that, while many nodes in the tree may have the same name, no two children of the same parent can have
the same name.

Making the parentNodeId non-NULLable and storing a "magic value" (like 0) breaks the foreign key definition.
Unique index behavior enforces the desired "name unique within scope of given parent" rule fine, except for root nodes.
MySQL, with the current behaviour of unique indexes, allows inserting infinite root nodes (`parentNodeId` IS NULL) with the same name.

There is no clean workaround for this constraint limitation. Using a separate pivot table only for parent-child relations, with two foreign
key constraints bouncing both ways and no nullable columns, still does not enable enforcement of the desired uniqueness constraint on the
database level. The "right" solution is to store unique parentNodeId + name redundantly in another table, except you translate NULL to 0 on insert,
and abort the insert to the real table if that fails. Keeping those two in sync is a manual process because there will be no help from foreign
keys. It's horrifying even before you consider additional limitations of common database abstraction layers in mainstream application frameworks.

So basically, +1, as a new index type. Whoever seriously wants the current behavior should not be affected at all.
[20 Feb 2015 12:04] Anthony NOT_FOUND
Rolf, in that particular case, you can add a dummy column which you set to 0 for root nodes and to the parent node ID for child nodes.
Then you add a unique index across the name and the dummy column:

CREATE TABLE `Node` (
  `nodeId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `parentNodeId` INT UNSIGNED DEFAULT NULL, # NULL means no parent <=> root node
  `dummy` INT,
  `name` varchar(63) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`nodeId`),
  UNIQUE INDEX (`parentNodeId`, `name`),    # <= this right here (explanation below)
  UNIQUE INDEX (`dummy`, `name`),
  CONSTRAINT `fk_Node_Node_parentNodeId_nodeId` FOREIGN KEY (`parentNodeId`) REFERENCES `Node` (`nodeId`) ON DELETE CASCADE ON UPDATE CASCADE,
) ENGINE=InnoDB;

You could also have a trigger that updated `dummy` according to the setting of `parentNodeId`. Then there would be no change needed to application code:

CREATE TRIGGER `Node_ins`
BEFORE INSERT ON `Node`
FOR EACH ROW SET `dummy`=COALESCE(`parentNodeId`,0);

CREATE TRIGGER `Node_upd`
BEFORE UPDATE ON `Node`
FOR EACH ROW SET `dummy`=COALESCE(`parentNodeId`,0);
[20 Feb 2015 12:38] Rolf Neuberger
That's a nice solution Anthony, thanks for that!
[20 Feb 2015 12:48] Tony Mountifield
Glad to help.

One slight correction: I should have declared `dummy` as INT UNSIGNED NOT NULL.
[21 Jul 2015 20:56] Eric McAfee
Vote +1