MySQL Bugs: #8608: The ucs2_unicode_ci collation fails with several Cyrillic alphabets

Bug #8608	The ucs2_unicode_ci collation fails with several Cyrillic alphabets
Submitted:	18 Feb 2005 20:59	Modified:	26 Feb 2005 18:57
Reporter:	Peter Gulutzan
Status:	Closed
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	5.0.3-alpha-debug	OS:	Linux (SUSE 9.2)
Assigned to:	Paul DuBois	CPU Architecture:	Any

Description:
According to the MySQL Reference Manual,
"utf8_general_ci does not support expansions ...  But in other respects, [utf8_general_ci]
tries to reproduce utf8_unicode_ci as much as possible."
But ucs2_unicode_ci (which works the same as utf8_unicode_ci) sorts Bulgarian,
Serbian, and (most) Ukrainian characters correctly, while ucs2_general_ci does not.
 

How to repeat:
mysql> create table cy (s1 char character set ucs2);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into cy values (0x0452) /* Serbian small letter dje */;
Query OK, 1 row affected (0.00 sec)

mysql> insert into cy values (0x0406) /* Ukrainian capital letter I */;
Query OK, 1 row affected (0.00 sec)

mysql> insert into cy values (0x0430) /* Cyrillic small letter a */;
Query OK, 1 row affected (0.00 sec)

mysql> insert into cy values (0x044f) /* Cyrillic small letter ia */;
Query OK, 1 row affected (0.00 sec)

mysql> select hex(s1) from cy order by s1 collate ucs2_unicode_ci;
+---------+
| hex(s1) |
+---------+
| 0430    |
| 0452    |
| 0406    |
| 044F    |
+---------+
4 rows in set (0.00 sec)

mysql> select hex(s1) from cy order by s1 collate ucs2_general_ci;
+---------+
| hex(s1) |
+---------+
| 0452    |
| 0406    |
| 0430    |
| 044F    |
+---------+
4 rows in set (0.01 sec)
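
One possible reading of the second result, offered as an assumption and not as
a statement about the server's implementation: the ucs2_general_ci order above
is what you get if each character is weighted by the code point of its
uppercase form, with no language-aware reordering. The Python sketch below
checks that reading against the four values inserted above. The helper name
general_ci_like_key is hypothetical and does not correspond to anything in
MySQL.

```python
# Sketch only: approximates a "general" case-insensitive ordering by using the
# uppercase code point as the sort weight. This is an assumption about why the
# ucs2_general_ci result looks the way it does, not MySQL's actual code.

# The four characters from the report, in insertion order.
rows = ["\u0452", "\u0406", "\u0430", "\u044f"]


def general_ci_like_key(ch: str) -> int:
    """Hypothetical weight: code point of the uppercase form of ch."""
    return ord(ch.upper())


def hex4(ch: str) -> str:
    """Format a character the way hex(s1) shows it for a ucs2 column."""
    return format(ord(ch), "04X")


ordered = [hex4(ch) for ch in sorted(rows, key=general_ci_like_key)]
print(ordered)  # ['0452', '0406', '0430', '044F']
```

Under this assumption, 0x0452 and 0x0406 sort first because their uppercase
forms (0x0402 and 0x0406) sit in the 0x0400 to 0x040F block, ahead of the basic
alphabet that starts at 0x0410.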

Thank you for your bug report. This issue has been addressed in the
documentation. The updated documentation will appear on our website
shortly, and will be included in the next release of the relevant
product(s).