Bug #33604 duplicate key entry error for value that ends with 0x20
Submitted: 31 Dec 2007 22:26 Modified: 2 Jan 2008 20:54
Reporter: cary sweet Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.0.45 OS:Windows (vista)
Assigned to: Heikki Tuuri CPU Architecture:Any

[31 Dec 2007 22:26] cary sweet
Description:
This may be similar to Bug#30282
I have a table with a unique index on a varchar column
During a migration of a database to MySql I get the failure that it will not copy the value "Gifts " when "Gifts" is already in the table. (Duplicate entry 'Gifts ' for key 1).  The error message states that MySql is not truncating the space before insertion and therefore should be unique.

The only workaround for this is to remove the index from the database and force the software to do all the validation.  I think that this should be a critical bug, since I cannot bring the database into production without software changes, but a workaround external to MySql does exist.

How to repeat:
create table name (value varchar(100));
create unique index unq_name_01 on name (value);
insert into name values ('Gifts'),('Gifts ');

results in Duplicate entry 'Gifts ' for key 1
[2 Jan 2008 13:01] Heikki Tuuri
This is the ANSI SQL behavior. Strings are compared with space-padding at the end.

You can use the VARBINARY type if you want the comparison to be not padded.
[2 Jan 2008 20:54] cary sweet
What you are telling me is that Oracle and SQL Server are both not ANSI SQL compliant when comparing strings?  Both of these databases allow for 'Gifts ' and 'Gifts' to be unique.  Also does VARBINARY support character sets and String comparisons within the sql query?
[2 Jan 2008 22:27] Sergei Golubchik
Comparison rules are defined by the collation. The standard only says
"
  The comparison of two character string expressions depends on the collation
  used for the comparison (see Subclause 9.13, "Collation determination").
  When values of unequal length are compared, if the collation for the
  comparison has the NO PAD characteristic and the shorter value is equal to
  some prefix of the longer value, then the shorter value is considered less
  than the longer value. If the collation for the comparison has the PAD SPACE
  characteristic, for the purposes of the comparison, the shorter value is
  effectively extended to the length of the longer by concatenation of
  <space>s on the right.
"

And according to the MySQL manual (the section about CHAR and VARCHAR http://dev.mysql.com/doc/refman/5.0/en/char.html)
"
  All MySQL collations are of type PAD SPACE. 
"
It means 'Gifts' = 'Gifts'
[3 Jan 2008 14:20] Heikki Tuuri
In VARCHAR2, Oracle uses a non-padded comparison:

http://download-uk.oracle.com/docs/cd/B14117_01/server.101/b10759/sql_elements002.htm

In CHAR, it uses space-padded.