MySQL Bugs: #17825: Request: UNIQUE index which treats NULL as a unique value

Bug #17825	Request: UNIQUE index which treats NULL as a unique value
Submitted:	1 Mar 2006 14:15	Modified:	17 Dec 2010 16:58
Reporter:	Erland Lewin	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: DDL	Severity:	S4 (Feature request)
Version:		OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
Currently, a UNIQUE index with columns which allow NULL values will allow any number of rows with NULL values.

See bug #8173, and the user comment on "13.1.4 CREATE INDEX" syntax in the manual.

There are work-arounds, such as adding an extra field which is set when the value is NULL, but that is ugly and requires code to maintain.

I would like to request that an index type or attribute which does not allow multiple identical rows with NULL values be added. 

I'm sure wanting this behaviour is not uncommon.

How to repeat:
See bug #8173

Thank you for a feature request. I am not sure that this feature should be impelemented, though. All NULL values should be treated as different ones (we don't know for sure). Need a quote from SQL 2003 Standard?

How about a quote from the long-standing SQL-92 standard?

"A unique constraint is satisfied if and only if no two rows in a table have the same non-null values in the unique columns. In addition, if the unique constraint was defined with PRIMARY KEY, then it requires that none of the values in the specified column or columns be the null value."

That suggests that the current MySQL behaviour is wrong and needs to be fixed (In MySQL 4 too).

So you should be able to have any number of rows where all the columns in a
unique index are null, but only one row with a given combination of non-null
columns.

create table tb (
  a int,
  b int,
  c int,
  unique index (a,b,c)
);

insert into tb(a,b,c) values (null,null,null); -- ok
insert into tb(a,b,c) values (null,null,null); -- still ok
insert into tb(a,b,c) values (null,null,null); -- still ok
insert into tb(a,b,c) values (1,null,null); -- ok
insert into tb(a,b,c) values (1,2,null); -- ok
insert into tb(a,b,c) values (1,2,3); -- ok
insert into tb(a,b,c) values (1,null,null); -- SHOULD FAIL, BUT DOESN'T
insert into tb(a,b,c) values (1,2,null); -- SHOULD FAIL, BUT DOESN'T
insert into tb(a,b,c) values (1,2,3); -- fails correctly

see bug#8173 for the answer

I have also had this issue and think it would be great if there was at least the option to create a unique key that would cater for this.

May be there is a typo error, that is being misunderstood by different people. 

How I perceive this issue is like MySQL is standing out as an exception. Where as all other top vendors like Oracle, SQL Server, IBM etc... do what we expect i.e. Give back an error if a specific index columns has the null value.

I ran into this behavior the other day and would also like to express my interest in this being made into at least a configuration option.
  I've checked the same behavior against Sybase ASE 12.5/15.0, IBM DB2 v9, and Oracle 11g and all of them throw errors under the conditions described above.  This does set MySQL apart from the rest in a negative way (you tend to get 'bit' by this behavior).  With a configuration option at least you could say that it meets the standard but can be changed to behave as everybody expects it to.

We have three bug reports, marked "not a bug".
Bug#8173  unique index allows duplicates with null values
Bug#16219 Unique index not unique when containing NULL columns
Bug#25544 Multiple column unique key allows multiple rows?

We have one feature request, marked "won't fix".
Bug#17825 Request: UNIQUE index which treats NULL as a unique value

They're not exact duplicates, but could be tracked together.

OK I see from that bug (http://bugs.mysql.com/bug.php?id=25544) that because in MySQL NULL != NULL (SELECT NULL = NULL   return NULL)

But is this in SQL standard or is there any other reason why null isn't null. IN programming langages null == null gives me true. Why is this behavior different in database?

I agree with everybody knows other databases, it's a shame for MySQL.
It's the first negative point ever. "Won't fix" is very bad, i can't understand such answer on a problem like that.

Going to check if somebody already fixed that with a diff source. If not, i'll fix it by myself and recompile my gentoo builds and share it, instead of MySQL AB.

Thanks you for all.

> But is this in SQL standard or is there any other reason
> why null isn't null. IN programming langages null == null gives me true.
> Why is this behavior different in database?

Although the standard doesn't specify for indexes, it does specify for
uniqueness, and MySQL is following the standard.

In other languages, NULL means something totally different,
don't be confused by the name.

When a unique key constraint contains a nullable column, the current implementation makes the unique key useless. It would be nice if MySQL implemented it like the rest of the guys.

I have just found the discussion on this bug. I just wanted to support people stating it's a bug. The interpretation of SQL 2003 standard is rudiculous. It strictly says that non-null column must be different. 

Where is this voting button?

Vote +1

Vote +1

Vote +1

Have been waiting for this over a year.
Please add this feature - it makes develpers life much easyer

I also believe this must be a bug.  My interpretation of "no two rows in a table have the same non-null values in the unique columns" is different from the reading proposed earlier.

If I have a row that has *any* non-null value in the index, that non-null value must be checked for uniqueness, even is all the other index columns are null.

So if I have an existing record indexed as (1, null, null) in a table, then to insert the same (1, null, null) row must fail.  The index column which is non-null (i.e. the first column) collides with an existing row, and there is no further data to distinguish it.

However, if I attempt to insert (1, 2, null), this should succeed because I now have an additional non-null column which makes this record unique.

The only combination which must allow multiple entries is (null, null, null).

Vote +1

Being the only guy in the block doing different is not a wise move. Makes migration to MySql a bit less straightfoward.

And my life as a data modeler a bit more bitter.

This feature request is 6 years old??? wow. I don't believe mysql ever fixes this. But still... vote +1

PLEASE, CAN SOMEONE NOTICE US! WE REALLY NEED IT!

Yea right. Everyone else who is making an unique index an unique index is holding ... aeh ... doing it wrong. +1

Vote +1

Vote +1

Vote +1

Bug #70718 marked as duplicate of this one.

Vote +1

The absence of a "UNIQUE_SERIOSLY" index type throws a wrench into self-referential tree tables.
Consider a table structure like this:

CREATE TABLE `Node` (
`nodeId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`parentNodeId` INT UNSIGNED DEFAULT NULL, # NULL means no parent <=> root node
`name` varchar(63) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`nodeId`),
UNIQUE INDEX (`parentNodeId`, `name`), # <= this right here (explanation below)
CONSTRAINT `fk_Node_Node_parentNodeId_nodeId` FOREIGN KEY (`parentNodeId`) REFERENCES `Node` (`nodeId`) ON DELETE CASCADE ON UPDATE CASCADE,
) ENGINE=InnoDB;

What we have here is a working single-table tree storage where the foreign key (recursively!) ensures that deleting a parent node will also
delete all children. This part is working perfectly fine in MySQL 5.6 at least (self-referential FKs might be a relatively recent addition, not sure).

The rule we're trying to enforce with the compound unique index is that, while many nodes in the tree may have the same name, no two children of the same parent can have
the same name.

Making the parentNodeId non-NULLable and storing a "magic value" (like 0) breaks the foreign key definition.
Unique index behavior enforces the desired "name unique within scope of given parent" rule fine, except for root nodes.
MySQL, with the current behaviour of unique indexes, allows inserting infinite root nodes (`parentNodeId` IS NULL) with the same name.

There is no clean workaround for this constraint limitation. Using a separate pivot table only for parent-child relations, with two foreign
key constraints bouncing both ways and no nullable columns, still does not enable enforcement of the desired uniqueness constraint on the
database level. The "right" solution is to store unique parentNodeId + name redundantly in another table, except you translate NULL to 0 on insert,
and abort the insert to the real table if that fails. Keeping those two in sync is a manual process because there will be no help from foreign
keys. It's horrifying even before you consider additional limitations of common database abstraction layers in mainstream application frameworks.

So basically, +1, as a new index type. Whoever seriously wants the current behavior should not be affected at all.

Rolf, in that particular case, you can add a dummy column which you set to 0 for root nodes and to the parent node ID for child nodes.
Then you add a unique index across the name and the dummy column:

CREATE TABLE `Node` (
  `nodeId` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `parentNodeId` INT UNSIGNED DEFAULT NULL, # NULL means no parent <=> root node
  `dummy` INT,
  `name` varchar(63) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`nodeId`),
  UNIQUE INDEX (`parentNodeId`, `name`),    # <= this right here (explanation below)
  UNIQUE INDEX (`dummy`, `name`),
  CONSTRAINT `fk_Node_Node_parentNodeId_nodeId` FOREIGN KEY (`parentNodeId`) REFERENCES `Node` (`nodeId`) ON DELETE CASCADE ON UPDATE CASCADE,
) ENGINE=InnoDB;

You could also have a trigger that updated `dummy` according to the setting of `parentNodeId`. Then there would be no change needed to application code:

CREATE TRIGGER `Node_ins`
BEFORE INSERT ON `Node`
FOR EACH ROW SET `dummy`=COALESCE(`parentNodeId`,0);

CREATE TRIGGER `Node_upd`
BEFORE UPDATE ON `Node`
FOR EACH ROW SET `dummy`=COALESCE(`parentNodeId`,0);

That's a nice solution Anthony, thanks for that!

Glad to help.

One slight correction: I should have declared `dummy` as INT UNSIGNED NOT NULL.

Vote +1