Description:
I think there is an issue in the collation map used for cp1250. It concerns these letters:
[Ž] 142 08/14 216 0x8e CAPITAL LETTER Z WITH CARON
[ž] 158 09/14 236 0x9e SMALL LETTER Z WITH CARON
[Č] 200 12/08 310 0xc8 CAPITAL LETTER C WITH CARON
[č] 232 14/08 350 0xe8 SMALL LETTER C WITH CARON
[Š] 138 08/10 212 0x8a CAPITAL LETTER S WITH CARON
[š] 154 09/10 232 0x9a SMALL LETTER S WITH CARON
I think they are very similar to some other letters in cp1250 and should compare as equal to them. This would help most Central European developers.
THAT MEANS:
0x8e ~ 0x8f ~ 0x9e ~ 0x9f ~ 0xaf ~ 0xbf ~ 0x5a ~ 0x7a
0xc8 ~ 0xe8 ~ 0x43 ~ 0x63
0x8a ~ 0x9a ~ 0x53 ~ 0x73
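To make the byte values in the groups above easier to read, here is a small Python sketch that decodes each one with Python's built-in cp1250 codec and prints its Unicode name. The names GROUPS and describe are only illustrative and are not part of MySQL; the byte values are copied from the list above.

```python
import unicodedata

# Proposed equivalence groups, copied from the report (cp1250 byte values).
GROUPS = [
    [0x8E, 0x8F, 0x9E, 0x9F, 0xAF, 0xBF, 0x5A, 0x7A],
    [0xC8, 0xE8, 0x43, 0x63],
    [0x8A, 0x9A, 0x53, 0x73],
]


def describe(byte_value):
    """Return the Unicode name of the character a cp1250 byte encodes."""
    char = bytes([byte_value]).decode("cp1250")
    return unicodedata.name(char)


if __name__ == "__main__":
    for group in GROUPS:
        for byte_value in group:
            print(f"0x{byte_value:02x}  {describe(byte_value)}")
        print()
```

Each group should list only variants of one base letter (Z, C or S), which is a quick way to confirm that the hex values were typed correctly.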
May I ask whether there is a way to compile my own, correct CE (Central European) collation? Thank you.
How to repeat:
FOR EXAMPLE:
mysql> SELECT "á" LIKE "%a%";
+----------------+
| "á" LIKE "%a%" |
+----------------+
|              1 |
+----------------+
1 row in set (0.00 sec)
(IT'S OK)
mysql> SELECT "ž" LIKE "%ź%";
+----------------+
| "ž" LIKE "%ź%" |
+----------------+
|              1 |
+----------------+
1 row in set (0.00 sec)
(OK TOO)
----------------------------------------------------------------------------
AND EXAMPLES WITH THE PROBLEM LETTERS:
mysql> SELECT "ž" LIKE "%z%";
+----------------+
| "ž" LIKE "%z%" |
+----------------+
|              0 |
+----------------+
1 row in set (0.00 sec)
mysql> SELECT "š" LIKE "%s%";
+----------------+
| "š" LIKE "%s%" |
+----------------+
|              0 |
+----------------+
1 row in set (0.00 sec)
mysql> SELECT "č" LIKE "%c%";
+----------------+
| "č" LIKE "%c%" |
+----------------+
|              0 |
+----------------+
1 row in set (0.00 sec)
It is the same for the uppercase letters (Š, Č, Ž), because the collation defines no relation between these characters and their base letters.
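Here is a minimal Python sketch of why the first two queries match and the last three do not. It assumes the comparison works on the per-byte weights from the cp1250_general_ci map quoted under "Suggested fix". This is a simplified model for illustration only, not MySQL's implementation, and OLD_WEIGHT, weights and like_contains are hypothetical names.

```python
# Weights copied from the current cp1250_general_ci map
# (key: cp1250 byte value, value: sort weight).
# Only the bytes used in the example queries are listed.
OLD_WEIGHT = {
    0x61: 0x41,  # a
    0xE1: 0x41,  # a with acute
    0x7A: 0x61,  # z
    0x9E: 0x62,  # z with caron
    0x9F: 0x62,  # z with acute
    0x73: 0x59,  # s
    0x9A: 0x5A,  # s with caron
    0x63: 0x43,  # c
    0xE8: 0x44,  # c with caron
}


def weights(text):
    """Encode text as cp1250 and map every byte to its sort weight."""
    return [OLD_WEIGHT[b] for b in text.encode("cp1250")]


def like_contains(text, needle):
    """Simplified stand-in for: text LIKE '%needle%'.

    Two characters match when their weights are equal,
    not when their bytes are equal.
    """
    t, n = weights(text), weights(needle)
    return any(t[i:i + len(n)] == n for i in range(len(t) - len(n) + 1))


if __name__ == "__main__":
    for text, needle in [("á", "a"), ("ž", "ź"), ("ž", "z"),
                         ("š", "s"), ("č", "c")]:
        t, n = text.encode("cp1250"), needle.encode("cp1250")
        print(f"0x{t.hex()} LIKE %0x{n.hex()}% ->",
              int(like_contains(text, needle)))
```

In this model "ž" and "ź" match only because both bytes happen to share weight 62, while "z" has weight 61. The same off-by-one gap separates "š" (5A) from "s" (59) and "č" (44) from "c" (43).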
Suggested fix:
I think the collation tables need some changes. I found the XML map in cp1250.xml.
THE OLD MAP FOR cp1250_general_ci LOOKS LIKE THIS:
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 63 64 65 66 67
68 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 5A 8B 5A 5B 62 62
90 91 92 93 94 95 96 97 98 99 5A 9B 5A 5B 62 62
20 A1 A2 50 A4 41 A6 59 A8 A9 59 AB AC AD AE 62
B0 B1 B2 50 B4 B5 B6 B7 B8 41 59 BB 50 BD 50 62
58 41 41 41 41 50 45 43 44 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 D7 58 5C 5C 5C 5C 60 5B 59
58 41 41 41 41 50 45 43 44 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 F7 58 5C 5C 5C 5C 60 5B FF
----------------------------------------------------------------------------
AND I THINK THAT IT SHOULD LOOK LIKE THIS:
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 63 64 65 66 67
68 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 *59 8B 5A 5B *61 *61
90 91 92 93 94 95 96 97 98 99 *59 9B 5A 5B *61 *61
20 A1 A2 50 A4 41 A6 59 A8 A9 59 AB AC AD AE *61
B0 B1 B2 50 B4 B5 B6 B7 B8 41 59 BB 50 BD 50 *61
58 41 41 41 41 50 45 43 *43 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 D7 58 5C 5C 5C 5C 60 5B 59
58 41 41 41 41 50 45 43 *43 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 F7 58 5C 5C 5C 5C 60 5B FF
(Changes are marked with an asterisk, *.)
The same problem also exists in the cp1250_czech_cs collation.
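The two cp1250_general_ci maps above can also be compared mechanically. The Python sketch below parses both tables as copied from this report, lists every byte whose weight differs, and checks that the asterisks mark exactly those bytes. The names OLD_MAP, NEW_MAP and parse are illustrative only. The sketch does not read cp1250.xml and makes no assumption about how MySQL loads the map.

```python
OLD_MAP = """
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 63 64 65 66 67
68 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 5A 8B 5A 5B 62 62
90 91 92 93 94 95 96 97 98 99 5A 9B 5A 5B 62 62
20 A1 A2 50 A4 41 A6 59 A8 A9 59 AB AC AD AE 62
B0 B1 B2 50 B4 B5 B6 B7 B8 41 59 BB 50 BD 50 62
58 41 41 41 41 50 45 43 44 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 D7 58 5C 5C 5C 5C 60 5B 59
58 41 41 41 41 50 45 43 44 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 F7 58 5C 5C 5C 5C 60 5B FF
"""

NEW_MAP = """
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F
30 31 32 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F
40 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 63 64 65 66 67
68 41 42 43 46 49 4A 4B 4C 4D 4E 4F 50 52 53 55
56 57 58 59 5B 5C 5D 5E 5F 60 61 7B 7C 7D 7E 7F
80 81 82 83 84 85 86 87 88 89 *59 8B 5A 5B *61 *61
90 91 92 93 94 95 96 97 98 99 *59 9B 5A 5B *61 *61
20 A1 A2 50 A4 41 A6 59 A8 A9 59 AB AC AD AE *61
B0 B1 B2 50 B4 B5 B6 B7 B8 41 59 BB 50 BD 50 *61
58 41 41 41 41 50 45 43 *43 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 D7 58 5C 5C 5C 5C 60 5B 59
58 41 41 41 41 50 45 43 *43 49 49 49 49 4D 4D 46
47 53 53 55 55 55 55 F7 58 5C 5C 5C 5C 60 5B FF
"""


def parse(table):
    """Return (weights, starred): 256 weights and the indexes marked '*'."""
    weights, starred = [], []
    for token in table.split():
        if token.startswith("*"):
            starred.append(len(weights))
            token = token[1:]
        weights.append(int(token, 16))
    if len(weights) != 256:
        raise ValueError("a single-byte map needs exactly 256 entries")
    return weights, starred


old, _ = parse(OLD_MAP)
new, starred = parse(NEW_MAP)

# Byte values whose weight differs between the two maps.
changed = [i for i in range(256) if old[i] != new[i]]

if __name__ == "__main__":
    for i in changed:
        print(f"0x{i:02X}: {old[i]:02X} -> {new[i]:02X}")
    print("asterisks match the real differences:", changed == starred)
```

With the proposed map, every byte in each of the three groups listed in the description ends up with the same weight as its base letter (61 for Z, 43 for C, 59 for S).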