Bug #4584 utf8_swedish_ci doesn't treat 'v' and 'w' equally
Submitted: 17 Jul 2004 14:08 Modified: 19 Jul 2004 14:01
Reporter: Bengan B
Status: Closed
Category: MySQL Server Severity: S3 (Non-critical)
Version: 4.1.3-beta OS: Woody
Assigned to: Alexander Barkov CPU Architecture:Any

[17 Jul 2004 14:08] Bengan B
Description:
I wanted to try the new language-specific Unicode collations, but I found a minor flaw in the rules.
I am not an expert, but I believe I am correct about this; correct me if I'm wrong.

How to repeat:
(
  SELECT CONVERT( 'vb' USING utf8 ) AS a
)
UNION (
  SELECT CONVERT( 'wa' USING utf8 ) AS a
)
ORDER BY a COLLATE utf8_swedish_ci ASC;
#
# This places 'vb' before 'wa', although 'v' and 'w' are supposed to be equal when sorting in Swedish.

Suggested fix:
The letters 'v' and 'w' should be treated as equal when sorting with utf8_swedish_ci.
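
To make the two orderings concrete, here is a minimal Python sketch. It is not MySQL's collation code: the function names are hypothetical, and the keys only lower-case the string and optionally fold 'w' into 'v', ignoring å, ä, ö and every other Swedish rule.

```python
# Illustrative only: plain Python sort keys, not MySQL's collation implementation.
# Both function names are hypothetical and exist only for this example.

def key_v_before_w(s: str) -> str:
    """Case-insensitive key that keeps 'v' and 'w' as distinct letters."""
    return s.lower()


def key_v_equals_w(s: str) -> str:
    """Case-insensitive key that folds 'w' into 'v' before comparing."""
    return s.lower().replace("w", "v")


words = ["wa", "vb"]

# With distinct letters, 'v' < 'w' decides the order, so 'vb' comes first.
print(sorted(words, key=key_v_before_w))

# With 'w' folded into 'v', the second letter decides, so 'wa' comes first.
print(sorted(words, key=key_v_equals_w))
```

The first ordering is what the query above reports; the second is what the suggested fix asks for.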
[17 Jul 2004 14:09] Bengan B
Altered the synopsis.
[19 Jul 2004 10:21] Heikki Tuuri
Hi!

Finnish also uses latin1_swedish_ci, and in most cases v is NOT sorted as equal to w.

The only case I know of where they have an equal sort position is the telephone directory of Helsinki.

Thus, best to keep v < w.

Regards,

Heikki
[19 Jul 2004 14:01] Sergei Golubchik
I'll close the issue, because it was not a bug but a deliberate decision.
But you are welcome to add to MySQL the collation you need: utf8_swedishphonebook_ci (or call it utf8_swedish2_ci).

It is really trivial; see, for example, how the Roman collation (for the Latin language) was added: http://lists.mysql.com/internals/15023. See also utf8_swedish_ci in ctype-uca.c.