Bug #37448 C with cedilla and tilde duplicate key unique entry event up
Submitted: 17 Jun 2008 14:43 Modified: 19 Jun 2008 18:21
Reporter: Alex Marquez Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.51 OS:Any
Assigned to: CPU Architecture:Any
Tags: cedilla tilde duplicate key

[17 Jun 2008 14:43] Alex Marquez
Description:
Tested in:

1)RH5.1
Linux Box: Linux prehost2 2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:19 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
Package: MySQL-server-community-5.0.51a-0.rhel5.x86_64.rpm
and package: MySQL-server-community-5.1.25-0.rhel5.x86_64.rpm

2)Debian 4.0 etc
Linux Box: Linux debian 2.6.18-4-686 #1 SMP Mon Mar 26 17:17:36 UTC 2007 i686 GNU/Linux
Package: Version: 5.0.32-7etch5

C with cedilla and tilde unique index key. Duplicate key event up.
(works fine with ñ)
Tested with latin1 and utf8 to avoid http://bugs.mysql.com/bug.php?id=13145 that says

"C-cedilla is 0xC387 in UTF-8.  Thus trailing 0x87 is removed
from the query when it is sent to the server, because 0x87 is
considered to be a control character in latin1."

How to repeat:
I will attach a file after bug will be opened due to size.

Suggested fix:
*
[17 Jun 2008 14:44] Alex Marquez
How to repeat

Attachment: bug.txt.txt (text/plain), 12.60 KiB.

[17 Jun 2008 15:00] Alex Marquez
To avoid conversion gzipped

Attachment: bug.txt.gz (application/x-gzip, text), 1.67 KiB.

[19 Jun 2008 16:26] MySQL Verification Team
Thank you for the bug report. Please verify your correct client encoding to use utf8:

Your MySQL connection id is 2
Server version: 5.0.64-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show variables like "%char%";
+--------------------------+--------------------------------------------+
| Variable_name            | Value                                      |
+--------------------------+--------------------------------------------+
| character_set_client     | latin1                                     | 
| character_set_connection | latin1                                     | 
| character_set_database   | latin1                                     | 
| character_set_filesystem | binary                                     | 
| character_set_results    | latin1                                     | 
| character_set_server     | latin1                                     | 
| character_set_system     | utf8                                       | 
| character_sets_dir       | /home/miguel/dbs/5.0/share/mysql/charsets/ | 
+--------------------------+--------------------------------------------+
8 rows in set (0.00 sec)

mysql> select 'c' = 'ç';
+-----------+
| 'c' = 'ç' |
+-----------+
|         1 | 
+-----------+
1 row in set (0.03 sec)

mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> select 'c' = 'ç';
+-----------+
| 'c' = 'ç' |
+-----------+
|         0 | 
+-----------+
1 row in set (0.00 sec)

mysql>
[19 Jun 2008 16:46] Alex Marquez
Th for reply.
The problem is only in a unique key.
The are no problem in other field type or un-indexed.

Try to insert in a unique key, cat and çat to easy reproduction.

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from documents;
+----+----------+-----------+---------------------+-----------------+---------------------------+
| id | group_id | group_id2 | date_added          | title           | content                   |
+----+----------+-----------+---------------------+-----------------+---------------------------+
|  1 |        6 |         6 | 2008-06-05 10:24:13 | ivantheterrible | spanish char test ññññíóú |
|  2 |        6 |         6 | 2008-06-05 10:24:13 |                 | spanish char test         |
|  3 |        6 |         6 | 2008-06-05 10:24:13 | cat             | spanish char test         |
+----+----------+-----------+---------------------+-----------------+---------------------------+
3 rows in set (0.00 sec)

mysql> insert into documents values (4,6,6,'2008-06-05 10:24:13', 'çat', 'spanish char test ññññíóú');
ERROR 1062 (23000): Duplicate entry '' for key 2
[19 Jun 2008 16:50] Alex Marquez
Th for reply.
The problem is only in a unique key.
The are no problem in other field type or un-indexed.

Try to insert in a unique key, cat and ç to easy reproduction.

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from documents;
+----+----------+-----------+---------------------+-----------------+---------------------------+
| id | group_id | group_id2 | date_added          | title           | content                   |
+----+----------+-----------+---------------------+-----------------+---------------------------+
|  1 |        6 |         6 | 2008-06-05 10:24:13 | ivantheterrible | spanish char test ñí |
|  2 |        6 |         6 | 2008-06-05 10:24:13 |                 | spanish char test         |
|  3 |        6 |         6 | 2008-06-05 10:24:13 | cat             | spanish char test         |
+----+----------+-----------+---------------------+-----------------+---------------------------+
3 rows in set (0.00 sec)

mysql> insert into documents values (4,6,6,'2008-06-05 10:24:13', 'ç', 'spanish char test ñí');
ERROR 1062 (23000): Duplicate entry '' for key 2
[19 Jun 2008 18:21] Sveta Smirnova
Thank you for the report.

You use collation latin1_spanish_ci for the problem column.

But according to http://www.collation-charts.org/mysql60/mysql604.latin1_spanish_ci.html C and Ç, as well as I and Ì are considered equal.

So I close the report as "Not a Bug". See also http://dev.mysql.com/doc/refman/5.0/en/charset-we-sets.html