Bug #33452 Primary difference between capital and small letters U and O
Submitted: 21 Dec 2007 9:23
Modified: 6 Feb 2008 17:59
Reporter: Alexander Barkov
Status: Closed
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version: 6.0
OS: Any
Assigned to: Alexander Barkov
CPU Architecture: Any

[21 Dec 2007 9:23] Alexander Barkov
Description:
"O" and "o" have different weights on the primary level.
The same problem happens with "U" and "u".

This is clearly visible in this chart:

http://www.collation-charts.org/mysql60/mysql604.latin2_czech_cs.html

Capital letters O and U appear on separate lines from their
lower case counterparts, which indicates a primary-level difference.

How to repeat:
Correct result (no primary difference):
=======================================

mysql> select hex(weight_string(_latin2'a' collate latin2_czech_cs level 1));
+----------------------------------------------------------------+
| hex(weight_string(_latin2'a' collate latin2_czech_cs level 1)) |
+----------------------------------------------------------------+
| 8201                                                           |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select hex(weight_string(_latin2'A' collate latin2_czech_cs level 1));
+----------------------------------------------------------------+
| hex(weight_string(_latin2'A' collate latin2_czech_cs level 1)) |
+----------------------------------------------------------------+
| 8201                                                           |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

Wrong results ("O" is primary greater than "o", the same for "U" and "u")
=========================================================================

mysql> select hex(weight_string(_latin2'O' collate latin2_czech_cs level 1));
+----------------------------------------------------------------+
| hex(weight_string(_latin2'O' collate latin2_czech_cs level 1)) |
+----------------------------------------------------------------+
| 9301                                                           |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select hex(weight_string(_latin2'o' collate latin2_czech_cs level 1));
+----------------------------------------------------------------+
| hex(weight_string(_latin2'o' collate latin2_czech_cs level 1)) |
+----------------------------------------------------------------+
| 9201                                                           |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select hex(weight_string(_latin2'u' collate latin2_czech_cs level 1));
+----------------------------------------------------------------+
| hex(weight_string(_latin2'u' collate latin2_czech_cs level 1)) |
+----------------------------------------------------------------+
| 9B01                                                           |
+----------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> select hex(weight_string(_latin2'U' collate latin2_czech_cs level 1));
+----------------------------------------------------------------+
| hex(weight_string(_latin2'U' collate latin2_czech_cs level 1)) |
+----------------------------------------------------------------+
| 9C01                                                           |
+----------------------------------------------------------------+
1 row in set (0.00 sec)
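
The comparison above can also be done mechanically. The following is a minimal Python sketch, not part of the original report: it takes the level-1 weight strings exactly as printed by the queries above and lists the case pairs whose weights differ. The names `observed` and `primary_case_mismatches` are hypothetical, chosen only for illustration.

```python
# Level-1 weight strings (hex) copied from the WEIGHT_STRING() output above.
observed = {
    "a": "8201", "A": "8201",
    "o": "9201", "O": "9301",
    "u": "9B01", "U": "9C01",
}

def primary_case_mismatches(weights):
    """Return (lower, upper) pairs whose level-1 weight strings differ."""
    mismatches = []
    for ch in sorted(weights):
        upper = ch.upper()
        # Only start from lower case letters that have an upper case entry.
        if ch.islower() and upper in weights:
            if weights[ch] != weights[upper]:
                mismatches.append((ch, upper))
    return mismatches

print(primary_case_mismatches(observed))
# prints [('o', 'O'), ('u', 'U')]
```

Only the pairs o/O and u/U are reported; a/A compares equal, matching the "correct result" case.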


Suggested fix:
Make "O" primary equal to "o",
make "U" primary equal to "u".
[21 Dec 2007 11:35] MySQL Verification Team
Thank you for the bug report.
[10 Jan 2008 11:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40836

ChangeSet@1.2706, 2008-01-10 15:51:02+04:00, bar@mysql.com +3 -0
  Bug#33452 Primary difference between capital and small letters U and O
  Problem: wrong primary weight for all variants of capital letters "U" and "O".
  Fix: changing primary weights for the letters "U" and "O"
  to be equal to their small counterparts.
[10 Jan 2008 14:17] Sergey Vojtovich
Ok to push.
[11 Jan 2008 12:26] Alexander Barkov
Pushed into rpl team tree, currently marked as 6.0.4.
[5 Feb 2008 13:08] Bugs System
Pushed into 6.0.5-alpha
[6 Feb 2008 17:59] Paul DuBois
Noted in 6.0.5 changelog.

For the latin2_czech_cs collation, the primary weights for all
variants of capital letters U and O were incorrect (they were not
equal to those of the corresponding small letters).
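
The intended effect of the fix can be illustrated with a small Python sketch. This is not the actual patch, which changes the collation's weight data in the server source. It assumes the pre-fix level-1 weight strings reported in this bug for the base letters only (the report gives no values for the accented variants), and the helper name `equalize_primary` is hypothetical.

```python
# Pre-fix level-1 weight strings (hex) as reported in this bug.
before = {"o": "9201", "O": "9301", "u": "9B01", "U": "9C01"}

def equalize_primary(weights, small_letters):
    """Return a copy in which each capital letter takes the
    primary weight of its small counterpart."""
    fixed = dict(weights)
    for small in small_letters:
        fixed[small.upper()] = fixed[small]
    return fixed

after = equalize_primary(before, ["o", "u"])

# After the change, each capital compares primary-equal to its small letter.
assert after["O"] == after["o"]
assert after["U"] == after["u"]
```

Case differences would then have to be resolved on a level other than the primary one, as is already the case for "a" and "A".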