Bug #4745 Add Vietnamese collation for the ucs2 and utf8 Unicode character sets
Submitted: 25 Jul 2004 12:54 Modified: 19 Mar 14:01
Reporter: Quan Nguyen
Status: Verified
Category:Server: Charsets Severity:S4 (Feature request)
Version:any OS:Any (All)
Assigned to: Alexander Barkov Target Version:
Triage: D5 (Feature request)

[25 Jul 2004 12:54] Quan Nguyen
Description:
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
Pertinent information can be found at:

http://vietunicode.sourceforge.net/charset/vietalphabet.html
http://oss.software.ibm.com/cgi-bin/icu/lx/?d_=en_US&_=vi

How to repeat:
The Vietnamese collation is currently not supported.
[26 Jul 2004 14:14] Sergei Golubchik
It is in our todo list.
Unfortunately, MySQL now supports only primary weights in sorting and as far as we
understand secondary/tertiary weights are essential for correct Vietnamese collation.

Of course, the support for secondary/tertiary weights is planned, but as it would require
all our users to rebuild all their tables, it cannot be done in 4.1, but only in 5.0 or
5.1.

If a collation based only on primary weights would be enough for you, we can do it
trivially and very quickly (such a collation could be used either for comparison or for
sorting, but not for both).
[27 Jul 2004 19:27] Quan Nguyen
Your consideration is much appreciated as many Vietnamese programmers in VN are using
MySQL.

Your understanding of the Vietnamese collation is correct. A collation based on primary
weights would be good enough for now for both 4.1 and 5.0; however, full support is
desired in the release version of 5.0. Thank you very much.
[30 Jul 2004 12:04] Alexander Barkov
Can you please take a look into collation rules provided by Mimer:

http://developer.mimer.com/features/unicode/tailorings.htm#Vietnamese

Mimer claims Vietnamese has CH, GI, KH, NG, NH, PH, TH, TR letter contractions.
Other resources don't mention contractions.

Can you please clarify? Thanks.
[30 Jul 2004 19:43] Quan Nguyen
What Mimer has (http://developer.mimer.com/collations/charts/vietnamese.htm) are older
Vietnamese collation rules. The current one is simpler and listed at IBM's ICU site I
mentioned. I had prior contact with a Mimer developer, who helped generate the chart at
http://vietunicode.sourceforge.net/charset/v3.htm. Mimer will soon update to reflect the
modern rules.
[29 Aug 2004 9:53] Quan Nguyen
Mimer has just updated their pages to reflect the current Vietnamese collation rules.
[29 Jul 2006 8:47] Jennifer Mueller
Has anyone gone any further with this feature request? I am currently working on a db
which would need the Vietnamese language pack installed in order to display properly.
This original thread was started in July 04, so I am curious if there's been any progress
on this since then. Perhaps in version 5.1? I can't seem to find it, so my guess is no.
Can anyone answer this for me?
[23 Sep 2006 10:16] Dinh Pham
I think this feature is in enough demand to make it happen in MySQL 5.0+.
[30 Mar 2007 11:37] Trung-Kien Dao
I'm a Vietnamese website developer and administrator using MySQL. I have been waiting for
this bug to be fixed for several years, but nothing has changed so far, and my website
still has many problems because of it, mostly in searching and sorting data. I have not
found any workaround.

That is my personal case. However, I think many other Vietnamese developers are having
trouble with this shortcoming, as MySQL and PHP have become the preferred and most
suitable tools for developing new websites in Vietnam.

I hope this feature will be supported in an upcoming version of MySQL.
Many thanks to the MySQL developer team.
[30 Mar 2007 11:50] Trung-Kien Dao
Hi Jennifer Mueller,
I've just looked through the MySQL 5.0 documentation; here is the answer about the current
state (in short, it is not supported yet):

http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html
(Currently, the utf8_unicode_ci collation has only partial support for the Unicode
Collation Algorithm. Some characters are not supported yet. Also, combining marks are not
fully supported. This affects primarily Vietnamese and some minority languages in Russia
such as Udmurt, Tatar, Bashkir, and Mari.)

Although the modern Vietnamese writing system uses characters very similar to Western
ones, accented characters are very widespread and are present in more than 80% of
Vietnamese words. That's why this is so important and leads to so many problems.
[31 Mar 2007 17:24] nhi ha le
Hello, I'm just a user and don't work directly with MySQL databases, but I can see the
error discussed above appearing very frequently on Vietnamese websites.
[1 Apr 2007 5:08] chick prete
Please add Vietnamese collation in your next version
[1 Apr 2007 9:05] Thang Vu Quang
Hello,

It is unbelievable that MySQL does not support the Vietnamese language. There are nearly
90 million Vietnamese in the world, and we will be a big market in the near future. If more
Vietnamese use MySQL, I think there will be more contributions to the open-source community.

Thanks
Meo
[4 Apr 2007 7:08] Quang Vo
Hello MySQL Developer Team,

In the past, I was a librarian at a university. In my free time, I like using my PC to
program with PHP & MySQL. My scripts were used by students for book management. I found
it difficult to code a search module for the Vietnamese language. The search results are
hard to believe: very, very chaotic.

I think the MySQL developer team should support Vietnamese Unicode (UTF-8) characters soon.
That will be wonderful.

Thank you very much.

Best regards,
[4 Apr 2007 9:46] Tran Duc Hoang
Please add Vietnamese collation in your next version!
[4 Apr 2007 23:01] Sebastian Simon
Hi!
I am happy to see this thread is really active again. I came here a few months ago, as I
was developing a German/Vietnamese dictionary for a language course with Ruby and MySQL
and ran into the same problems with sorting Vietnamese characters.

Is there any plan to patch this problem?
[5 Apr 2007 9:12] [ name withheld ]
Dear sirs/madams,

I'm from Vietnam. I'm learning and using MySQL. I can write Vietnamese letters using
phpMyAdmin with Vietnamese input software, but I can't search for my letters or words in
MySQL. And when I used web applications to input letters/words, I couldn't see the words
in MySQL as I typed them.

So, could you tell me how to search, input, and output Vietnamese?

Nowadays, more and more Vietnamese are using MySQL. Please help us.
I'm expecting your reply soon.
Thank you.

Dung Tran,
dungtranck94@yahoo.com
[27 Apr 2007 15:32] Trần Văn Hùng
Please add the Vietnamese language to the new version!
We all need it!
Thank you!
[29 Apr 2007 22:45] tran dinh tuyen
I love MySQL :d
Please add Vietnamese collation for the ucs2 and utf8 Unicode character sets
[7 May 2007 16:59] Duong Vu Hoang
I have some problems with the Vietnamese language.
[11 May 2007 11:25] tung nguyen
I can't search for or compare words in MySQL because it does not support UTF-8 properly.
I hope you can fix it.
Thank you very much.
[15 May 2007 8:27] Nham Ngoc Tan
Please add Vietnamese collation in the next version. I love MySQL.
[18 May 2007 15:34] Giang Nguyen
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character
sets.
[21 May 2007 16:18] Thai Cao Phong
MySQL AB should fix these bugs early.

Thank
[27 May 2007 17:34] a a
- I can't sort a database in the Vietnamese language. Please help me!
[29 May 2007 10:23] Tran Thai Loc
Please add the Vietnamese language to the new version!
We all need it!
Thank you!
[29 May 2007 21:17] Kenny Nguyen
Please add Vietnamese unicode support to MySQL... I [we] really need it.

Thanks.
[4 Jun 2007 6:09] nguyen hoang Vu
I have some hard problems when coding PHP & MySQL with my language, Vietnamese.

Please add some support for Vietnamese in the next version.
We all need it!
Thank you! Best wishes!
[8 Jun 2007 5:02] Huy NguyenQuang
My name is Nguyen Quang Huy, from Vietnam. I'm a Joomla, Mambo, Sugar and Vtiger user. I
have now opened an open-source project on Joomla and need full support for Vietnamese.

Please add Vietnamese language collation for the utf8 Unicode character set.

Thank you!

huy010579@gmail.com
[8 Jun 2007 14:47] yu yu
I hope MySQL will fix this bug
[9 Jun 2007 3:17] new comer11
Hi to all members, I think we should support Vietnamese in MySQL because this is the most
popular DB system in Viet Nam, a country with more than 80 million people.
Thanks for reading.
[9 Jun 2007 5:12] tran tuan son
Can you fix the Vietnamese language error?
[10 Jun 2007 20:19] duong loc
Dear MySQL developers!
I have been studying programming with PHP and the MySQL database. I'm Vietnamese. Of
course, I ran into this bug, so I hope the MySQL developers will fix it in the next
version to make MySQL perfect, so that it becomes more and more developed in Viet Nam.
Thanks!

duongtanloc@gmail.com
[13 Jun 2007 7:54] Ha Thang
I'm using MySQL and open source for my website. I see faults with Vietnamese in MySQL.
You should support Vietnamese in MySQL because many Vietnamese web developers use it.
I hope it will come. Thank you very much
[29 Jun 2007 8:32] Nguyen Duc Phu
I'm a PHP & MySQL web developer. When I work with the Vietnamese language in MySQL, I get
some errors: fonts do not display correctly, and it is difficult to sort text
alphabetically, because MySQL does not support Vietnamese collation.

Please update and support it in new version!

Thanks very much
[29 Jun 2007 11:40] Bach Huy
Please fix the error when I sort Vietnamese (Unicode) text: MySQL does not list it in
natural order. Please check and fix this error.
Thank you very much!
[2 Jul 2007 5:54] Tran vinh
I'm from Vietnam. I'm a developer and I usually use MySQL as my preferred database. My
projects usually use MySQL and the data I store in MySQL is Vietnamese. It's fairly good,
but it would be better if you "Add Vietnamese collation for the ucs2 and utf8 Unicode
character sets".
Thank you so much!
[5 Jul 2007 2:32] Peter Gulutzan
Instructions for adding a new Unicode collation

Attachment: vietnamese.txt (text/plain), 19.87 KiB.

[5 Jul 2007 2:46] Peter Gulutzan
Dear Vietnamese users:

MySQL is open source, and we want to encourage community participation.
So we are giving you instructions how to make your own Unicode collation.
These are detailed instructions with illustrations for Vietnamese, that
work with MySQL 4.1, 5.0, or 5.1.

The instructions are in a file attached to this bug report.
Click vietnamese.txt on the previous comment, or go to
http://bugs.mysql.com/file.php?id=6814
or click the 'Files' tab. Eventually there may be more than
one version of the instructions, so make sure to read later comments.

Actually the instructions will probably become an article that we might
want to put in our newsletter or elsewhere. So please regard this as a
draft, and give us your feedback if you try them out.

Thank you,
Peter Gulutzan and Alexander Barkov
MySQL AB
[5 Jul 2007 10:00] Hang Nguyen
Please add the Vietnamese language!
[6 Jul 2007 11:43] anh the
I like the Vietnamese language
[8 Jul 2007 7:28] Minh Hoang Pham
Please fix this bug! Searching has problems on my forum
[29 Jul 2007 14:17] truong tuan
Please add Vietnamese collation in your next version
[2 Aug 2007 19:18] Peter Gulutzan
Dear Vietnamese users:

We are still waiting for feedback regarding our proposal
of July 5. Participating, we think, will have more effect
than repeating the same request. Any volunteers?

Peter Gulutzan and Alexander Barkov
MySQL AB
[3 Aug 2007 5:35] Quan Nguyen
Index.xml containing complete Vietnamese collation

Attachment: Index.xml (text/xml), 25.97 KiB.

[3 Aug 2007 5:36] Quan Nguyen
Sample table with Vietnamese characters in correct order

Attachment: vi_collate.sql (application/octet-stream, text), 4.17 KiB.

[3 Aug 2007 5:39] Quan Nguyen
Incorrect results from experimental vi collation

Attachment: vi_abc.csv (application/vnd.ms-excel, text), 1.76 KiB.

[3 Aug 2007 5:52] Quan Nguyen
Hi Peter and Alex,

We thank you for the given instructions for creating new collations for Vietnamese
language. I just experimented using the second method, testing it on MySQL Community
version 5.0.45; the results, however, are still incorrect, as can be seen in 'vi_abc.csv'
attachment.

Moreover, the server does not like the utf8 collation, raising an error message as
follows:

Error while executing query.

ALTER TABLE `collation`.`letters` CHARACTER SET utf8 COLLATE utf8_vietnamese_ci;

MySQL Error Number 1273
Unknown collation: 'utf8_vietnamese_ci'

That's my feedback for now. Other developers will continue to help with more testing.

Thanks.

Quan
[3 Aug 2007 8:40] Alexander Barkov
"SELECT id, letter FROM letters ORDER BY letter" - MySQL-5.0.46

Attachment: vi-5.0.46.csv (application/octet-stream, text), 1.21 KiB.

[3 Aug 2007 8:41] Alexander Barkov
"SELECT GROUP_CONCAT(letter) FROM letters GROUP BY letter"  - MySQL 5.0.46

Attachment: vi-gconcat-5.0.46.txt (text/plain), 617 bytes.

[3 Aug 2007 8:43] Alexander Barkov
Hi Quan,

We're sorry for a mistake in the article.
The second method "ADDING A NEW COLLATION BY CHANGING THE MARKUP FILE"
works only starting from 5.0.46. The patch was delayed for some reasons.

5.0.46 will be available soon.

Meanwhile, I'm attaching the result of these two queries
generated by 5.0.46:

SELECT id, letter FROM letters ORDER BY letter;

SELECT GROUP_CONCAT(letter) FROM letters;

Please check whether the results are good enough.

Thanks!
[7 Aug 2007 4:16] Quan Nguyen
Hi Alex,

From what I see in your results, they are still not correct. The id column can help in
determining the correct sort order, which is specified in
http://vietunicode.sourceforge.net/charset/vietalphabet.html or
http://demo.icu-project.org/icu-bin/locexp?d_=en&_=vi.

The SELECT GROUP_CONCAT statement should produce an ordered list similar to that depicted
in http://vietunicode.sourceforge.net/charset/v3.htm.

Thanks.
[7 Aug 2007 4:50] hieuhoc mr
Please add the Vietnamese language to the new version!
[7 Aug 2007 7:57] Alexander Barkov
Quan,

How many letters should this query return in Vietnamese:

SELECT letter FROM letters WHERE letter='a';

Should it return only two records 'a' and 'A',

or should it return the whole bunch of letters
listed on the first row in this chart:
http://vietunicode.sourceforge.net/charset/v3.htm
i.e. :

à U+00E0
À U+00C0
ả U+1EA3
Ả U+1EA2
ã U+00E3
à U+00C3
á U+00E1
Á U+00C1
ạ U+1EA1
Ạ U+1EA0

Thanks!
[7 Aug 2007 8:24] Alexander Barkov
A fixed version of the "GCONCAT" query

Attachment: vi-gconcat2.txt (text/plain), 610 bytes.

[7 Aug 2007 8:25] Alexander Barkov
Quan, can you please take a look at the new result
of the "GCONCAT" query?

Thanks!
[9 Aug 2007 6:46] sothub1 nguyen
Hello All
I wish MySQL supported Vietnamese collation better. I hope the new version of MySQL will be better.

Rgds
[9 Aug 2007 8:24] tho su
Hi Barkov,

It should return only "a" and "A" (case insensitive).

regards,

Tho Su
[10 Aug 2007 20:21] Quan Nguyen
Yes, it should return only two records 'a' and 'A'.

The results in vi-gconcat2.txt look better but still not right. Can you make it look just
like the one in v3.htm?
[13 Aug 2007 3:36] tho su
Hi fellow Vietnamese,

I think we have done a good job of getting the attention of the MySQL development team,
and they are actively working to resolve the technical issue. We should not add any more
"Please add Vietnamese support" comments to this thread. Please try to contribute rather
than putting on "pressure".

Regards,

TS
[17 Aug 2007 12:48] thu nguyen-van
Dear friends,
I would like to have a DB such as MySQL with a new and good collation sequence for UTF-8
text.
Many thanks to Mr Quan Nguyen and Mr Alexander Barkov.
Engineer Nguyen-van Thu
Bruxelles, Belgium
[17 Aug 2007 15:49] Peter Gulutzan
Of course 'a' <> 'ă', and of course 'a' <> 'â'.
But why do you say that 'a' <> 'à'?
Remember that we are trying to follow the
"Vietnamese Alphabetical System" rule.
It says "A<<\u00E0".
That is, according to our reading of this rule:
U+00E0 LATIN SMALL LETTER A WITH GRAVE
is the same as U+0041 LATIN CAPITAL LETTER A
at the primary level -- it only differs at the
secondary level. You are saying that this rule
is false, or you are saying that we misinterpret.
Are you sure?

But if it is true, then the collation strength should be 2.
I.e. the secondary level should be taken into account not
only for sorting, but for comparison as well.

I don't know if this is good or bad news.

Possibly we can rewrite the tailoring by moving
the accent difference to the primary level instead
of the secondary level:

&A < 00E0 <<< 00C0
    < 1EA3 <<< 1EA2
    < 00E3 <<< 00C3
    < 00E1 <<< 00C1
    < 1EA1 <<< 1EA0

and so on for the other letters.

This will not be the same as
http://vietunicode.sourceforge.net/charset/v3.htm
but maybe it will do as a temporary solution
while we're working on WL#896
"primary, secondary and tertiary levels",
which will be visible soon on forge.mysql.com.
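
To make the difference between the two forms of the tailoring concrete, here is a small
Python sketch. It is a toy reading of the rule syntax quoted above, not MySQL's parser:
the function names are hypothetical, and it assumes only that '<' starts a new primary
weight while '<<' and '<<<' keep the primary weight of the previous character.

```python
# Toy reading of the tailoring syntax quoted in this thread; NOT MySQL's parser.
# Accent difference moved to the primary level, as proposed above.
PRIMARY_RULE = ("&A < 00E0 <<< 00C0 < 1EA3 <<< 1EA2 < 00E3 <<< 00C3 "
                "< 00E1 <<< 00C1 < 1EA1 <<< 1EA0")
# Same letters, with the accent difference kept at the secondary level.
SECONDARY_RULE = PRIMARY_RULE.replace(" < ", " << ")


def primary_weights(rule):
    """Map each character named in the rule to a primary weight.

    '<' starts a new primary weight; '<<' (accent) and '<<<' (case)
    reuse the primary weight of the previous character.
    """
    weights = {}
    level = 0
    operator = None
    for token in rule.split():
        if token.startswith("&"):
            # Reset point: the letter and its other case share one primary weight.
            anchor = token[1:]
            weights[anchor.lower()] = level
            weights[anchor.upper()] = level
        elif set(token) == {"<"}:
            operator = token
        else:
            if operator == "<":
                level += 1
            weights[chr(int(token, 16))] = level
    return weights


def equal_at_primary(weights, letter):
    """Letters that a primary-only comparison would treat as equal to `letter`."""
    return sorted(ch for ch, w in weights.items() if w == weights[letter])
```

Under these assumptions, equal_at_primary(primary_weights(PRIMARY_RULE), "a") contains
only 'A' and 'a', while the same call with SECONDARY_RULE returns all twelve letters,
which is the choice between 'a' < 'à' and 'a' = 'à' discussed in this thread.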
[18 Aug 2007 16:07] Quan Nguyen
Your reading is correct. At primary level, 'a' and 'à' are equal; but at secondary level,
they are not, a << à, as an accent difference.

The same Vietnamese collation can also be expressed in a different way, as follows:

& ̀ << ̉ << ̃ << ́ << ̣
& A < ă <<< Ă < â <<< Â
& D < đ <<< Đ
& E < ê <<< Ê
& O < ô <<< Ô < ơ <<< Ơ
& U < ư <<< Ư

or

& \u0300 << \u0309 << \u0303 << \u0301 << \u0323
& A < \u0103 <<< \u0102 < \u00E2 <<< \u00C2
& D < \u0111 <<< \u0110
& E < \u00EA <<< \u00CA
& O < \u00F4 <<< \u00D4 < \u01A1 <<< \u01A0
& U < \u01B0 <<< \u01AF

If you altered the rule to get the sort correct for the time being, would you change it
back to the correct form when you finish implementing WL#896?

On the other hand, you can put in the correct rule now and things will automatically fall
into proper place with the implementation of WL#896. This way at least we get the correct
sort at the primary level.
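
The compact form of the rule above can be read as a three-level sort key. The Python
sketch below is only an illustration under stated assumptions: the names letter_weights
and sort_key are hypothetical, the alphabet string follows the chart order used in this
thread, lowercase is ranked before uppercase, and characters outside the Vietnamese
alphabet are not handled. It has nothing to do with MySQL's internal implementation.

```python
import unicodedata

# Base letters in alphabet order (assumption: the order of the charts cited here).
ALPHABET = unicodedata.normalize("NFC", "aăâbcdđeêghiklmnoôơpqrstuưvxy")
# Tone marks in rule order: grave << hook above << tilde << acute << dot below.
TONES = ["\u0300", "\u0309", "\u0303", "\u0301", "\u0323"]


def letter_weights(ch):
    """Return (primary, secondary, tertiary) weights for one precomposed letter."""
    decomposed = unicodedata.normalize("NFD", ch.lower())
    tone = 0
    base_parts = []
    for cp in decomposed:
        if cp in TONES:
            tone = TONES.index(cp) + 1      # secondary: which tone mark, 0 = none
        else:
            base_parts.append(cp)           # base letter plus breve/circumflex/horn
    base = unicodedata.normalize("NFC", "".join(base_parts))
    primary = ALPHABET.index(base)          # raises ValueError for other characters
    tertiary = 1 if ch != ch.lower() else 0 # lowercase sorts before uppercase
    return primary, tone, tertiary


def sort_key(word):
    """Compare all primary weights first, then all secondary, then all tertiary."""
    word = unicodedata.normalize("NFC", word)
    weights = [letter_weights(ch) for ch in word]
    return tuple(tuple(w[level] for w in weights) for level in range(3))
```

Passing sort_key as the key to sorted() orders single letters the same way as the first
row of the chart at http://vietunicode.sourceforge.net/charset/v3.htm; truncating the key
to its first element gives the primary-only behaviour discussed in this thread.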
[24 Aug 2007 6:27] nguyễn thành phước
let's be me
[24 Aug 2007 17:09] Peter Gulutzan
Hello,

I asked earlier:
"You [ i.e. Quan Nguyen and tho su ] are saying that this rule
is false, or you are saying that we misinterpret.
Are you sure?"

Please look at Alexander Barkov's question again.

You said that 'à' should not be returned. Are you sure?
[27 Aug 2007 16:45] Duong Thien
Hi please fix this bug
[29 Aug 2007 4:27] Quan Nguyen
Hi Peter & Bar,

"It says 'A<<\u00E0'
That is, according to our reading of this rule:
U+00E0 LATIN SMALL LETTER A WITH GRAVE
is the same as U+0041 LATIN CAPITAL LETTER A
at the primary level -- it only differs at the
secondary level."

Your interpretation of the rule is correct. As such, the query

SELECT letter FROM letters WHERE letter='a';

should return only two records 'a' and 'A'.

I am positive that is the correct result set -- unless MySQL database engine performs the
compare operation (letter='a') at primary level, in which case the others would also be
included.
[5 Sep 2007 14:15] Peter Gulutzan
Yes, of course MySQL will only use the primary
weight for a comparison operation involving WHERE.
So, for 'a' << 'à', you have a choice between 
'a' = 'à' and 'a' < 'à'. You chose 'a' < 'à'.

So you want these characters to be DISTINCT and
IN ORDER, as if they're separate letters of the
Vietnamese alphabet:
a à ả ã á ạ
and so on, for every case where
http://vietunicode.sourceforge.net/charset/vietalphabet.html
says that the difference is "<<" not "<<<".

It's strange, but it's not a problem. Would any
Vietnamese users like to try making the change by
themselves, or is our help still needed?

Incidentally, version 5.1.21 is now available for download.
[14 Sep 2007 4:59] Quan Nguyen
Changing the rules would go against both the official Vietnamese rule and Unicode
Collation Algorithm.

Since MySQL uses primary weight in WHERE clause, the query should return 'a', 'à', 'ả',
'ã', 'á', and 'ạ' (and the corresponding capitals). Returning only 'a' and 'A' would
be achieved by an overriding COLLATE clause (as described in MySQL 5.0 Reference Manual),
if that is possible; or with a second rule specifically made for comparison operations.
[14 Sep 2007 19:11] tue nguyen
This needs to be fixed. Because IT is growing rapidly in Viet Nam, it's inconvenient for
us to use MySQL, which is widely used in our IT community, with this bug. Thanks for
your attention.
[16 Sep 2007 1:16] Hai Phan
Hi, I want to second Quan's suggestions.

The primary weight should be the strictest, where 'a' <> 'á' <> 'à' <> 'ả' <> 'ã' <>
'ạ', because they're practically distinct letters in the Viet alphabet.  The words "con
cú" and "con cu" have very different meanings and searching for one should not return the
other also, the same way searching for "owl" in English should not return "penis."

However, in some search applications, it is convenient for the user not to have to type
the accent mark.  This is where the secondary weight should be used, where
   'a' = 'á' = 'à' = 'ả' = 'ã' = 'ạ' =
   'ă' = 'ắ' = 'ằ' = 'ẳ' = 'ẵ' = 'ặ' =
   'â' = 'ấ' = 'ầ' = 'ẩ' = 'ẫ' = 'ậ'
as is the current behavior of the utf8_general_ci collation rule.  Only a few applications
need this accent-neutral effect, so it should not be the primary.  Also, I want 'd' = 'đ'
in this rule; right now it is not.

Regards,
Hai Phan
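
For applications that want the accent-neutral matching described above without waiting
for a server-side collation, the folding can be sketched in application code. This is a
minimal Python sketch with hypothetical function names; it relies only on Unicode
canonical decomposition and maps đ/Đ explicitly because they have no decomposition.

```python
import unicodedata


def fold_accents(text):
    """Remove Vietnamese diacritics so that accent-neutral matching is possible."""
    # đ/Đ have no canonical decomposition, so map them to d/D explicitly.
    text = text.replace("\u0111", "d").replace("\u0110", "D")
    decomposed = unicodedata.normalize("NFD", text)
    # Drop all nonspacing marks (tone marks, breve, circumflex, horn).
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def accent_neutral_equal(a, b):
    """Case-insensitive, accent-insensitive comparison, with 'd' = 'đ'."""
    return fold_accents(a).lower() == fold_accents(b).lower()
```

With this folding, "con cú" and "con cu" compare equal, so it should be used only where
the accent-neutral behaviour is really wanted; an exact comparison keeps them distinct.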
[20 Sep 2007 14:47] hiep le cong
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
Pertinent information can be found
[21 Sep 2007 18:57] thanh lv
OK!
I hope this problem will be solved in the near future
[26 Sep 2007 10:35] AnhTuan 123
Please add Vietnamese collation in the next version. I love MySQL.
[30 Sep 2007 17:34] Quan Nguyen
Added a second Vietnamese collation for use with overriding COLLATE clause

Attachment: Index.xml (text/xml), 31.47 KiB.

[30 Sep 2007 17:56] Quan Nguyen
With the updated rules, assuming the table or the database is configured with
utf8_vietnamese1_ci collation:

SELECT letter FROM letters WHERE letter='a';

would return 'a', 'à', 'ả', 'ã', 'á', 'ạ', and their capitals.

whereas the query

SELECT letter FROM letters WHERE letter='a' COLLATE utf8_vietnamese2_ci;

would return 'a' and its capital.
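
The difference between the two result sets can be simulated outside the server. The
Python sketch below is an illustration only: levels and select_letters are hypothetical
names, "strength" stands for how many weight levels take part in the comparison, and
case is ignored as in a _ci collation. It does not reproduce MySQL's implementation.

```python
import unicodedata

# The five Vietnamese tone marks: grave, hook above, tilde, acute, dot below.
TONE_MARKS = {"\u0300", "\u0309", "\u0303", "\u0301", "\u0323"}


def levels(ch):
    """Split a letter into (primary, secondary): base letter and tone mark."""
    nfd = unicodedata.normalize("NFD", ch.lower())
    base = "".join(c for c in nfd if c not in TONE_MARKS)
    tone = "".join(c for c in nfd if c in TONE_MARKS)
    return unicodedata.normalize("NFC", base), tone


def select_letters(letters, needle, strength):
    """Return the letters equal to `needle` when only `strength` levels are compared."""
    target = levels(needle)[:strength]
    return [ch for ch in letters if levels(ch)[:strength] == target]
```

With strength 1 the selection for 'a' includes 'à', 'ả', 'ã', 'á', 'ạ' and their
capitals, as with the first query; with strength 2 it contains only 'a' and 'A', as
with the overriding COLLATE clause. Letters such as 'ă' and 'â' are excluded in both
cases because they differ at the primary level.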
[2 Oct 2007 7:55] duc tran
I have the same problems. Please add Vietnamese collation in the next version of MySQL. I
think most web developers in Vietnam use MySQL. Thank you!
[5 Oct 2007 9:39] dung nguyen
I do know that it is very useful
[11 Oct 2007 4:12] Hoang Le Nhan
I run a Vietnamese website. I want MySQL to support the Vietnamese language for this app.
Thanks
[30 Oct 2007 4:01] anh hoang
Hello, I have a problem. I can't add four collations in UTF8. I can add two new
collations, but when I add four, MySQL doesn't show all four of them. It only shows two
of them.
I changed Index.xml as Alexander described in http://bugs.mysql.com/file.php?id=6814 .

The four collation IDs I used: 211, 212, 213, 214
MySQL version: 5.1
[31 Oct 2007 7:08] Thanh Bui
Hi all,
I built mysql-5.1.22-rc with VS 2005 from source code, following the guide in the
reference manual and http://bugs.mysql.com/bug.php?id=31217.
Then I used Quan Nguyen's Index.xml.
But:
 SELECT letter FROM letters WHERE letter='a' COLLATE utf8_vietnamese2_ci;
still returns 'a', 'à', 'ả', 'ã', 'á', 'ạ', and their capitals
(= the result of SELECT letter FROM letters WHERE letter='a';)
Can you tell me why?
[8 Nov 2007 11:21] xh n
Please
fix some errors with Vietnamese fonts.
  Thanks.
[11 Nov 2007 22:06] Khamphouvieng Vongphachanh
Hello, I would like your help with Vietnamese characters in MySQL.
Please add Vietnamese collation for the ucs2 and utf8 Unicode character sets.
It will be easy and convenient for me to write Vietnamese characters.
Thanks a lot
[13 Nov 2007 20:47] Peter Gulutzan
In response to the question from [31 Oct 7:08] Thanh Bui:
This is the expected result, explanations are in earlier comments.
If you have difficulty understanding, try contacting Quan Nguyen.

In response to the other "requests" in the last month or two:
There is no need to ask constantly for a Vietnamese collation.
There is an answer. Please read earlier comments.
[18 Nov 2007 4:27] Quan Nguyen
Bar & Peter,

I moved the accent difference to the primary level for utf8_vietnamese2_ci and
ucs2_vietnamese2_ci collations, as suggested:

&A < 00E0 <<< 00C0
    < 1EA3 <<< 1EA2
    < 00E3 <<< 00C3
    < 00E1 <<< 00C1
    < 1EA1 <<< 1EA0

However, the query still returned the same result set as the vietnamese1 versions. Can you
explain? Thanks.

I'm running mysql-5.1.22 Release Candidate version.
[19 Nov 2007 10:45] Alexander Barkov
Quan, can you please attach your latest version of Index.xml ?
[20 Nov 2007 3:07] Quan Nguyen
Here we go.

Attachment: Index.xml (text/xml), 31.56 KiB.

[20 Nov 2007 6:49] Alexander Barkov
There's a mistake:

<collation name="utf8_vietnamese2_ci" id="212">
<!--Vietnamese experimental collation-->
	<rulep>

It should be <rules> instead.
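
A mistyped element name like this is easy to miss by eye. As a rough pre-check before
restarting the server, the markup can be scanned for unexpected element names. The
Python sketch below is an assumption-laden illustration: the list of known tags and the
sample fragment are hypothetical and may not match every MySQL version, and the check
only reports unknown names, it does not validate the rules themselves.

```python
import xml.etree.ElementTree as ET

# Assumption: element names expected in a collation definition. This list is
# illustrative only and may not match every MySQL version.
KNOWN_TAGS = {"charsets", "charset", "collation", "rules", "reset", "p", "s", "t", "i"}

# Hypothetical fragment reproducing the typo reported above.
SAMPLE = r"""
<charset name="utf8">
  <collation name="utf8_vietnamese2_ci" id="212">
    <!--Vietnamese experimental collation-->
    <rulep>
      <reset>A</reset>
      <p>\u00E0</p>
      <t>\u00C0</t>
    </rulep>
  </collation>
</charset>
"""


def unexpected_tags(xml_text):
    """Return the sorted element names that are not in KNOWN_TAGS."""
    root = ET.fromstring(xml_text.strip())
    return sorted({el.tag for el in root.iter() if el.tag not in KNOWN_TAGS})
```

Calling unexpected_tags(SAMPLE) reports "rulep"; after the name is corrected to
"rules", the list is empty.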
[21 Nov 2007 2:24] Quan Nguyen
It works! Thank you very much. I hope the new collations will make it into the next
releases of MySQL.
[26 Nov 2007 4:10] nguyen phuong
Please, MySQL, add the Vietnamese language!
[12 Dec 2007 1:53] Lương Vân
I need known mysql
[12 Jan 4:25] Nguyễn Ngọc Tuân
I want to say make love . Would you like OK?
[24 Jan 14:19] minh pham
Error:
Vietnamese Collation Chart
a 0061  A 0041  à 00E0  À 00C0  ả 1EA3  Ả 1EA2  ã 00E3  Ã 00C3  á 00E1  Á 00C1  ạ 1EA1  Ạ 1EA0
ă 0103  Ă 0102  ằ 1EB1  Ằ 1EB0  ẳ 1EB3  Ẳ 1EB2  ẵ 1EB5  Ẵ 1EB4  ắ 1EAF  Ắ 1EAE  ặ 1EB7  Ặ 1EB6
â 00E2  Â 00C2  ầ 1EA7  Ầ 1EA6  ẩ 1EA9  Ẩ 1EA8  ẫ 1EAB  Ẫ 1EAA  ấ 1EA5  Ấ 1EA4  ậ 1EAD  Ậ 1EAC
b 0062  B 0042
c 0063  C 0043
d 0064  D 0044
đ 0111  Đ 0110
e 0065  E 0045  è 00E8  È 00C8  ẻ 1EBB  Ẻ 1EBA  ẽ 1EBD  Ẽ 1EBC  é 00E9  É 00C9  ẹ 1EB9  Ẹ 1EB8
ê 00EA  Ê 00CA  ề 1EC1  Ề 1EC0  ể 1EC3  Ể 1EC2  ễ 1EC5  Ễ 1EC4  ế 1EBF  Ế 1EBE  ệ 1EC7  Ệ 1EC6
g 0067  G 0047
h 0068  H 0048
i 0069  I 0049  ì 00EC  Ì 00CC  ỉ 1EC9  Ỉ 1EC8  ĩ 0129  Ĩ 0128  í 00ED  Í 00CD  ị 1ECB  Ị 1ECA
k 006B  K 004B
l 006C  L 004C
m 006D  M 004D
n 006E  N 004E
o 006F  O 004F  ò 00F2  Ò 00D2  ỏ 1ECF  Ỏ 1ECE  õ 00F5  Õ 00D5  ó 00F3  Ó 00D3  ọ 1ECD  Ọ 1ECC
ô 00F4  Ô 00D4  ồ 1ED3  Ồ 1ED2  ổ 1ED5  Ổ 1ED4  ỗ 1ED7  Ỗ 1ED6  ố 1ED1  Ố 1ED0  ộ 1ED9  Ộ 1ED8
ơ 01A1  Ơ 01A0  ờ 1EDD  Ờ 1EDC  ở 1EDF  Ở 1EDE  ỡ 1EE1  Ỡ 1EE0  ớ 1EDB  Ớ 1EDA  ợ 1EE3  Ợ 1EE2
p 0070  P 0050
q 0071  Q 0051
r 0072  R 0052
s 0073  S 0053
t 0074  T 0054
u 0075  U 0055  ù 00F9  Ù 00D9  ủ 1EE7  Ủ 1EE6  ũ 0169  Ũ 0168  ú 00FA  Ú 00DA  ụ 1EE5  Ụ 1EE4
ư 01B0  Ư 01AF  ừ 1EEB  Ừ 1EEA  ử 1EED  Ử 1EEC  ữ 1EEF  Ữ 1EEE  ứ 1EE9  Ứ 1EE8  ự 1EF1  Ự 1EF0
v 0076  V 0056
x 0078  X 0058
y 0079  Y 0059  ỳ 1EF3  Ỳ 1EF2  ỷ 1EF7  Ỷ 1EF6  ỹ 1EF9  Ỹ 1EF8  ý 00FD  Ý 00DD  ỵ 1EF5  Ỵ
[24 Jan 16:27] Vu Nhat Chi Si
I want to become a member of diendantinhoc.net
[1 Feb 5:29] Quan Nguyen
Just downloaded, installed the latest release 5.0.51a, and determined that it still does
not have the Vietnamese collation incorporated.
[18 Feb 6:33] minh le
Please support the collation in Vietnamese unicode. We really need it.

Thank you!
[4 Mar 5:28] thien ngo
Vietnamese collation in MySQL is very important and urgent. It affects many Vietnamese
developers and yourselves. If you don't fix this problem, you will lose your share of
Vietnam's market. Please fix it as soon as possible. I think it's very easy for you to fix
in a short time!!!
[5 Mar 18:19] Tran Vinh
Vietnamese language collation for the ucs2 and utf8 Unicode character sets is really
necessary in building a website without some foolish errors.
Please fix it. Thank you
[8 Mar 12:32] Tuan Mac Duy
Please support the collation in Vietnamese unicode. We really need it.

Thank you!
[11 Mar 4:23] Phạm Hưng
Dear MySQL Team!
I am now working with PHP and MySQL, and I ran into this bug, so I hope the MySQL
developers will fix it in the next version.
Thanks a lot!
[19 Mar 14:01] Susanne Ebrecht
Dear Vietnamese,

unfortunately there are more than 1000 spoken languages on Earth. Vietnamese is only
one of them. It's really hard to implement just the 400 most common languages.

For example, we also have lots of open issues with collation and Western European
languages.

Peter and Alexander have already received information on Vietnamese collation here. We
will implement this as soon as possible, but we can't promise to do it in one of the next
versions.

This doesn't mean that Vietnamese is less important than Western European languages; it
just means we have a lot of work to do on this topic.

Until we fix this issue you can help yourselves by adding your own collation.

Look here for more information:
http://forge.mysql.com/wiki/How_to_Add_a_Collation
http://forge.mysql.com/w/images/b/b7/HowToAddACollation.pdf
[22 Mar 14:17] DINH KHUYEN DO
yes,thank you
[29 Mar 13:42] Kent Willan
I have a few projects with MySQL, so I want MySQL to support Vietnamese collation.
Please help me (and other Vietnamese users).
Thanks.
[6 Apr 2:49] Thanh Sang
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
[15 Apr 17:07] ho dac quyen
I think the MySQL developer team should support Vietnamese Unicode (UTF-8) characters
soon.
That will be wonderful.

Thank you very much.

Best regards,
Ho Dac Quyen
[11 Jun 20:04] Trương Minh Tuyền
please, fix it!
[17 Jun 10:28] Duy Hai Hoang
I'm a Vietnamese website developer and administrator using MySQL. When I use Vietnamese on
my website and want to search for "đỉnh" (not "Đỉnh") or sort in alphabetical order, the
result is not correct: it returns both "đỉnh" and "Đỉnh". Please add Vietnamese language
collation for the ucs2 and utf8 Unicode character sets.
[19 Jun 17:30] le bich vien
hello ! my name is lebichvien
[26 Jun 7:34] van thanh tran
please help us !
i love mysql