Bug #18879 Characters with or without diacritical marks are treated equal.
Submitted: 7 Apr 2006 8:41
Modified: 10 Apr 2006 1:19
Reporter: Grace Coronado
Status: Not a Bug
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.0.18
OS: Windows (MS-XP)
Assigned to:
CPU Architecture: Any

[7 Apr 2006 8:41] Grace Coronado
Description:
Characters that differ only by a diacritical mark are treated as equal in MySQL, for example: ñ = n, ä = a, ö = o, é = e, ê = e, ü = u.

How to repeat:
Our current settings:
    Windows XP 5.1
    MS Access 2002
    MySQL Server 5.0.18
    MyODBC 3.51.12 
    MS Jet Engine 4.0

Using our data as an example: the fish species synonyms Acipenser guldenstadti and Acipenser güldenstädti should be treated as different entries. Because they compare as equal, we cannot use SYNGENUS and SYNSPECIES together as the primary key. See the list below for more examples.

+----------------+---------------+
| syngenus       | synspecies    |
+----------------+---------------+
| Acipenser      | guldenstadti  |
| Acipenser      | güldenstädti  |
| Acipenser      | guldenstaedti |
| Acipenser      | güldenstaedti |
| Blicca         | bjorkna       |
| Blicca         | björkna       |
| Brycon         | stubelii      |
| Brycon         | stübelii      |
| Diaphus        | lutkeni       |
| Diaphus        | lütkeni       |
| Myletes        | tiete         |
| Myletes        | tieté         |
| Pseudocetopsis | baudoensis    |
| Pseudocetopsis | baudoênsis    |
+----------------+---------------+
[7 Apr 2006 10:53] Hartmut Holzgraefe
This depends on the collation (the language-specific sorting and comparison rules for a character set)
of the field. See http://dev.mysql.com/doc/refman/4.1/en/charset.html for more details.
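
As a rough illustration of how the field's collation changes the comparison, here is a minimal sketch that gives both key columns a binary collation. The table name `synonyms` and the VARCHAR(40) lengths are assumptions made for this example; they are not taken from the report. It also assumes the client character set is declared correctly, so that the accented literals reach the server intact.

```sql
-- Hypothetical table: the name `synonyms` and the column lengths are assumptions.
-- utf8_bin compares strings by their binary code point values, so 'u' and 'ü' differ.
CREATE TABLE synonyms (
    syngenus   VARCHAR(40) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    synspecies VARCHAR(40) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    PRIMARY KEY (syngenus, synspecies)
);

-- With utf8_bin on both columns these rows have distinct keys.
-- With utf8_general_ci the second INSERT would fail with a duplicate-key error.
INSERT INTO synonyms VALUES ('Acipenser', 'guldenstadti');
INSERT INTO synonyms VALUES ('Acipenser', 'güldenstädti');

-- For a table that already exists, the collation can be changed per column.
-- (On the table just created above this statement changes nothing.)
ALTER TABLE synonyms
    MODIFY synspecies VARCHAR(40) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;

-- The Collation column of the output shows what each field actually uses.
SHOW FULL COLUMNS FROM synonyms;
```

Note that utf8_bin is also case-sensitive, so 'Acipenser' and 'acipenser' would count as different values. Whether that is acceptable depends on the data.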
[10 Apr 2006 1:19] Grace Coronado
Here are the current character settings that we have:

mysql> show variables like "%char%";
+--------------------------+---------------------------------------------------------+
| Variable_name            | Value                                                   |
+--------------------------+---------------------------------------------------------+
| character_set_client     | latin1                                                  |
| character_set_connection | utf8                                                    |
| character_set_database   | utf8                                                    |
| character_set_results    | latin1                                                  |
| character_set_server     | latin1                                                  |
| character_set_system     | utf8                                                    |
| character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.0\share\charsets\ |
+--------------------------+---------------------------------------------------------+
7 rows in set
 
mysql> show variables like "%collation%";
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | utf8_general_ci   |
| collation_database   | utf8_general_ci   |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set

What character set and collation settings am I supposed to use?