Bug #64004 MySQL sees two different Unicode characters as the same
Submitted: 12 Jan 2012 4:12
Modified: 12 Jan 2012 15:54
Reporter: Herbert Lie
Status: Not a Bug
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version:
OS: Any
Assigned to:
CPU Architecture: Any

[12 Jan 2012 4:12] Herbert Lie
Description:
I put a unique constraint on a name field, and I want to insert two different names:
simone perele
simone pérèle

It gives me Error Code: 1062 - Duplicate entry 'simone pérèle' for key 'uniqu'

How to repeat:
CREATE TABLE `test_table` (
  `id` int(11) NOT NULL,
  `name` varchar(45) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uniqu` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into test_table(id,name) values (1,'simone perele');
insert into test_table(id,name) values (2,'simone pérèle');
[12 Jan 2012 4:14] Herbert Lie
The title is misleading. What I meant is UTF-8.
[12 Jan 2012 8:51] Peter Laursen
You should understand collations - http://dev.mysql.com/doc/refman/5.1/en/charset.html

see this:

SET NAMES UTF8;
SELECT 'e' = 'è' COLLATE utf8_general_ci; -- returns '1'
SELECT 'e' = 'è' COLLATE utf8_unicode_ci; -- returns '1'
SELECT 'e' = 'è' COLLATE utf8_bin; -- returns '0'

I don't know whether any utf8 collations other than utf8_bin distinguish e and è. If not, you will have to either omit the unique constraint on that column or define the column with "COLLATE utf8_bin".
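
The second option might look roughly like this. It is only a sketch, assuming the test_table from the report above and that changing the collation of the existing column is acceptable:

```sql
-- Sketch only: assumes the test_table defined in the original report.
-- Redefine the column with a binary collation so that 'e' and 'è'
-- compare as different values in the unique key.
ALTER TABLE test_table
  MODIFY `name` varchar(45) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;

-- With utf8_bin on the column, both rows should be accepted.
INSERT INTO test_table (id, name) VALUES (1, 'simone perele');
INSERT INTO test_table (id, name) VALUES (2, 'simone pérèle');
```

Note that utf8_bin compares by code point, so it also makes the column case-sensitive: 'Simone' and 'simone' would be treated as different values too.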

Peter
(not a MySQL person)
[12 Jan 2012 15:54] Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on how to report a bug at http://bugs.mysql.com/how-to-report.php

Read Peter's comment and use the table at http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html