Bug #4745 Add Vietnamese collation for the ucs2 and utf8 Unicode character sets
Submitted: 25 Jul 2004 10:54 Modified: 29 Jul 2010 13:39
Reporter: Quan Nguyen
Status: Closed
Category:MySQL Server: Charsets Severity:S4 (Feature request)
Version:any OS:Any (All)
Assigned to: Alexander Barkov
Triage: Needs Triage: D5 (Feature request)

[25 Jul 2004 10:54] Quan Nguyen
Description:
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets. Pertinent information can be found at:

http://vietunicode.sourceforge.net/charset/vietalphabet.html
http://oss.software.ibm.com/cgi-bin/icu/lx/?d_=en_US&_=vi

How to repeat:
The Vietnamese collation is currently not supported.
[26 Jul 2004 12:14] Sergei Golubchik
It is in our todo list.
Unfortunately, MySQL currently supports only primary weights in sorting, and as far as we understand, secondary/tertiary weights are essential for a correct Vietnamese collation.

Of course, the support for secondary/tertiary weights is planned, but as it would require all our users to rebuild all their tables, it cannot be done in 4.1, but only in 5.0 or 5.1.

If a collation based on primary weights only would be enough for you, we can do it trivially and very fast (such a collation could be used either only for comparison or only for sorting, but not for both).
[27 Jul 2004 17:27] Quan Nguyen
Your consideration is much appreciated as many Vietnamese programmers in VN are using MySQL.

Your understanding of the Vietnamese collation is correct. The collation based on primary weights would be good enough for now for both 4.1 and 5.0; however, full support is desired in release version of 5.0. Thank you very much.
[30 Jul 2004 10:04] Alexander Barkov
Can you please take a look into collation rules provided by Mimer:

http://developer.mimer.com/features/unicode/tailorings.htm#Vietnamese

Mimer claims Vietnamese has CH, GI, KH, NG, NH, PH, TH, TR letter contractions.
Other resources don't mention contractions.

Can you please clarify? Thanks.
[30 Jul 2004 17:43] Quan Nguyen
What Mimer has (http://developer.mimer.com/collations/charts/vietnamese.htm) are older Vietnamese collation rules. The current one is simpler and listed at IBM's ICU site I mentioned. I had prior contact with a Mimer developer, who helped generate the chart at http://vietunicode.sourceforge.net/charset/v3.htm. Mimer will soon update to reflect the modern rules.
[29 Aug 2004 7:53] Quan Nguyen
Mimer has just updated their pages to reflect the current Vietnamese collation rules.
[29 Jul 2006 6:47] Jennifer Mueller
Has anyone gone any further with this feature request?  I am currently working on a db which would need the vietnamese language pack installed in order to display properly.  This original thread was started in July 04, so I am curious if there's been any progress on this since then?  Perhaps in version 5.1?  I can't seem to find it, so my guess is no.  Can anyone answer this for me?
[23 Sep 2006 8:16] Dinh Pham
I think this feature is in enough demand to make it happen in MySQL 5.0+
[30 Mar 2007 9:37] Trung-Kien Dao
I'm a Vietnamese website developer and administrator using MySQL. I have been waiting for this bug to be fixed for several years, but nothing has changed so far. My website still has many problems because of it, mostly in searching and sorting data, and I have not found any workaround.

That is my personal case. However, I think many other Vietnamese developers are having trouble with this shortcoming, as MySQL and PHP have become the most preferred and suitable tools for developing new websites in Vietnam.

I hope this feature will be supported in an upcoming version of MySQL.
Lots of thanks to the MySQL developer team.
[30 Mar 2007 9:50] Trung-Kien Dao
Hi Jennifer Mueller,
I've just gone through the MySQL 5.0 documentation. Here is my answer for you about the current state (in short, it is not supported):

http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html
(Currently, the utf8_unicode_ci collation has only partial support for the Unicode Collation Algorithm. Some characters are not supported yet. Also, combining marks are not fully supported. This affects primarily Vietnamese and some minority languages in Russia such as Udmurt, Tatar, Bashkir, and Mari.)

Although the modern Vietnamese writing system uses characters very similar to the Western ones, accented characters are very widespread and are present in more than 80% of Vietnamese words. That is why this is very important and leads to many problems.
[31 Mar 2007 15:24] nhi ha le
Hello, I'm just a user and do not work directly with MySQL databases, but I can see the error discussed above appearing very frequently on Vietnamese websites.
[1 Apr 2007 3:08] chick prete
Please add Vietnamese collation in your next version
[1 Apr 2007 7:05] Thang Vu Quang
Hello,

It is unbelievable that MySQL does not support the Vietnamese language. There are nearly 90 million Vietnamese in the world and we will be a big market in the near future. If more Vietnamese use MySQL, I think there will be more contributions to the open-source community.

Thanks
Meo
[4 Apr 2007 5:08] Quang Vo
Hello MySQL's Developers Team,

In the past, I was a librarian at a university. In my free time, I like using my PC to program with PHP and MySQL. My scripts were used by students for book management. I found that it is difficult to code a search module for the Vietnamese language. The search results are unbelievable, very chaotic.

I think MySQL's Developers Team should support Vietnamese Unicode (UTF-8) characters soon. That will be wonderful.

Thank you very much.

Best regards,
[4 Apr 2007 7:46] Tran Duc Hoang
Please add Vietnamese collation in your next version!
[4 Apr 2007 21:01] Sebastian Simon
Hi!
I am happy to see this thread is really active again. I came here a few months ago, as I was developing a German/Vietnamese dictionary for a language course with Ruby and MySQL and ran into the same problems with sorting Vietnamese characters.

Is there any plan to patch this problem?
[5 Apr 2007 7:12] [ name withheld ]
Dear sirs/ madams,

I'm from Vietnam. I'm learning and using MySQL. I can write Vietnamese letters using phpMyAdmin with Vietnamese input software, but I can't search for my letters or words in MySQL. And when I used web applications to input letters or words, I couldn't see the words in MySQL as I typed them.

So, could you tell me how to search, input, and output Vietnamese?

Nowadays, more and more Vietnamese are using your MySQL. Please help us. 
I'm expecting your reply soon.
Thank you.

Dung Tran,
dungtranck94@yahoo.com
[27 Apr 2007 13:32] Tran Hung
Please add VietNam Language to new Version!
All We need it!
Thank you!
[29 Apr 2007 20:45] tran dinh tuyen
I love MySQL, :d 
plz Add Vietnamese collation for the ucs2 and utf8 Unicode character sets
[7 May 2007 14:59] Duong Vu Hoang
I have some problems with the Vietnamese language.
[11 May 2007 9:25] tung nguyen
I can't search for or compare a word in MySQL; UTF-8 is not supported. I hope you can fix it.
Thank you very much.
[15 May 2007 6:27] Nham Ngoc Tan
Please add Vietnamese collation in the next version. I love MySQL.
[18 May 2007 13:34] Giang Nguyen
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character
sets.
[21 May 2007 14:18] Thai Cao Phong
MySQL AB should fix these bugs early.

Thanks
[27 May 2007 15:34] a a
- I can't sort Database with language Vietnamese. Please, help me!
[29 May 2007 8:23] Tran Thai Loc
Please add VietNam Language to new Version!
All We need it!
Thank you!
[29 May 2007 19:17] Kenny Nguyen
Please add Vietnamese unicode support to MySQL... I [we] really need it.

Thanks.
[4 Jun 2007 4:09] nguyen hoang Vu
I have some hard problems when coding PHP and MySQL in my language, Vietnamese.

Please add some support for Vietnamese in the next version.
We all need it!
Thank you! Best wishes!
[8 Jun 2007 3:02] Huy NguyenQuang
My name is Nguyen Quang Huy, and I come from Vietnam. I'm a Joomla, Mambo, Sugar and Vtiger user. I have now opened an open-source project on Joomla and need full support for Vietnamese.

Please add Vietnamese language collation for utf8 Unicode character sets.

Thank you!

huy010579@gmail.com
[8 Jun 2007 12:47] yu yu
I hope MySQL will fix this bug
[9 Jun 2007 1:17] new comer11
Hi to all members, I think we should support Vietnamese in MySQL because it is the most popular DB system in Viet Nam, a country with more than 80 million people.
Thanks for reading.
[9 Jun 2007 3:12] tran tuan son
Can you fix the Vietnamese language error?
[10 Jun 2007 18:19] duong loc
Dear MySQL developers!
I have been studying programming with PHP and the MySQL database. I'm Vietnamese. Of course, I ran into this bug, so I hope the MySQL developers will fix it in the next version so that MySQL becomes more complete and is used more and more in Viet Nam.
Thanks!

duongtanloc@gmail.com
[13 Jun 2007 5:54] Ha Thang
I'm using MySQL and open source for my website. I see faults with Vietnamese in MySQL.
You should support Vietnamese in MySQL because many Vietnamese web developers use it.
I hope it will come. Thank you very much.
[29 Jun 2007 6:32] Nguyen Duc Phu
I'm a PHP and MySQL web developer. When I work with the Vietnamese language in MySQL, I get some errors, such as fonts not showing correctly and difficulty sorting text alphabetically, because MySQL does not support Vietnamese collation.

Please update and support it in new version!

Thanks very much
[29 Jun 2007 9:40] Bach Huy
Please fix the error when I sort the Vietnamese language (Unicode): MySQL does not list it in natural order. Please check and fix this error.
Thank you very much!
[2 Jul 2007 3:54] Tran vinh
I'm from Vietnam. I'm a developer and I usually use MySQL as my preferred database. My projects usually use MySQL and the data I store in it is Vietnamese. It's fairly good, but it would be better if you "Add Vietnamese collation for the ucs2 and utf8 Unicode character sets"
Thank you so much!
[5 Jul 2007 0:32] Peter Gulutzan
Instructions for adding a new Unicode collation

Attachment: vietnamese.txt (text/plain), 19.87 KiB.

[5 Jul 2007 0:46] Peter Gulutzan
Dear Vietnamese users:

MySQL is open source, and we want to encourage community participation.
So we are giving you instructions how to make your own Unicode collation.
These are detailed instructions with illustrations for Vietnamese, that
work with MySQL 4.1, 5.0, or 5.1.

The instructions are in a file attached for this bug report.
Click vietnamese.txt on the previous comment, or go to
http://bugs.mysql.com/file.php?id=6814
or click the 'Files' tab. Eventually there may be more than
one version of the instructions, so make sure to read later comments.

Actually the instructions will probably become an article that we might
want to put in our newsletter or elsewhere. So please regard this as a
draft, and give us your feedback if you try them out.

Thank you,
Peter Gulutzan and Alexander Barkov
MySQL AB
[5 Jul 2007 8:00] Hang Nguyen
Please add vietnamese language!
[6 Jul 2007 9:43] anh the
I like language vietnamese
[8 Jul 2007 5:28] Minh Hoang Pham
Please fix this bug! Searching has problems on my forum.
[29 Jul 2007 12:17] truong tuan
Please add Vietnamese collation in your next version
[2 Aug 2007 17:18] Peter Gulutzan
Dear Vietnamese users:

We are still waiting for feedback regarding our proposal
of July 5. Participating, we think, will have more effect
than repeating the same request. Any volunteers?

Peter Gulutzan and Alexander Barkov
MySQL AB
[3 Aug 2007 3:35] Quan Nguyen
Index.xml containing complete Vietnamese collation

Attachment: Index.xml (text/xml), 25.97 KiB.

[3 Aug 2007 3:36] Quan Nguyen
Sample table with Vietnamese characters in correct order

Attachment: vi_collate.sql (application/octet-stream, text), 4.17 KiB.

[3 Aug 2007 3:39] Quan Nguyen
Incorrect results from experimental vi collation

Attachment: vi_abc.csv (application/vnd.ms-excel, text), 1.76 KiB.

[3 Aug 2007 3:52] Quan Nguyen
Hi Peter and Alex,

We thank you for the given instructions for creating new collations for Vietnamese language. I just experimented using the second method, testing it on MySQL Community version 5.0.45; the results, however, are still incorrect, as can be seen in 'vi_abc.csv' attachment.

Moreover, the server does not like the utf8 collation, raising an error message as follows:

Error while executing query.

ALTER TABLE `collation`.`letters` CHARACTER SET utf8 COLLATE utf8_vietnamese_ci;

MySQL Error Number 1273
Unknown collation: 'utf8_vietnamese_ci'

That's my feedback for now. Other developers will continue to help with more testing.

Thanks.

Quan
[3 Aug 2007 6:40] Alexander Barkov
"SELECT id,letter ORDER BY letter" - MySQL-5.0.46

Attachment: vi-5.0.46.csv (application/octet-stream, text), 1.21 KiB.

[3 Aug 2007 6:41] Alexander Barkov
"SELECT GROUP_CONCAT(letter) FROM letters GROUP BY letter"  - MySQL 5.0.46

Attachment: vi-gconcat-5.0.46.txt (text/plain), 617 bytes.

[3 Aug 2007 6:43] Alexander Barkov
Hi Quan,

We're sorry for a mistake in the article.
The second method "ADDING A NEW COLLATION BY CHANGING THE MARKUP FILE"
works only starting from 5.0.46. The patch was delayed for some reasons.

5.0.46 will be available soon.

Meanwhile, I'm attaching the result of these two queries
generated by 5.0.46:

SELECT id, letter FROM letters ORDER BY letter;

SELECT GROUP_CONCAT(letter) FROM letters;

Please check whether the results are good enough.

Thanks!
[7 Aug 2007 2:16] Quan Nguyen
Hi Alex,

From what I see in your results, they are still not correct. The id column can help in determining the correct sort order, which is specified in http://vietunicode.sourceforge.net/charset/vietalphabet.html or http://demo.icu-project.org/icu-bin/locexp?d_=en&_=vi.

The SELECT GROUP_CONCAT statement should produce an ordered list similar to that depicted in http://vietunicode.sourceforge.net/charset/v3.htm.

Thanks.
[7 Aug 2007 2:50] hieuhoc mr
Please add VietNam Language to new Version!
[7 Aug 2007 5:57] Alexander Barkov
Quan,

How many letters should this query return in Vietnamese:

SELECT letter FROM letters WHERE letter='a';

Should it return only two records 'a' and 'A',

or should it return the whole bunch of letters
listed on the first row in this chart:
http://vietunicode.sourceforge.net/charset/v3.htm
i.e. :

à U+00E0
À U+00C0
ả U+1EA3
Ả U+1EA2
ã U+00E3
à U+00C3
á U+00E1
Á U+00C1
ạ U+1EA1
Ạ U+1EA0

Thanks!
[7 Aug 2007 6:24] Alexander Barkov
A fixed version of the "GCONCAT" query

Attachment: vi-gconcat2.txt (text/plain), 610 bytes.

[7 Aug 2007 6:25] Alexander Barkov
Quan, can you please take a look at the new result
of the "GCONCAT" query?

Thanks!
[9 Aug 2007 4:46] sothub1 nguyen
Hello All
I wish you would support Vietnamese collation in MySQL better. I hope the new version of MySQL will be better.

Rgds
[9 Aug 2007 6:24] tho su
Hi Barkov,

It should return only "a" and "A" (case insensitive).

regards,

Tho Su
[10 Aug 2007 18:21] Quan Nguyen
Yes, it should return only two records 'a' and 'A'.

The results in vi-gconcat2.txt look better but still not right. Can you make it look just like the one in v3.htm?
[13 Aug 2007 1:36] tho su
Hi fellow Vietnamese,

I think we have done a good job of getting the attention of the MySQL development team, and they are actively working to resolve the technical issue. We should not add any more "Please add Vietnamese support" comments to this thread. Please try to contribute rather than applying "pressure".

Regards,

TS
[17 Aug 2007 10:48] thu nguyen-van
Dear friends,
I would like to have a DB such as MySQL with a new and good collation sequence for UTF-8 fonts.
Many thanks to Mr Quan Nguyen and Mr Alexander Barkov.
Engineer Nguyen-van Thu
Bruxelles, Belgium
[17 Aug 2007 13:49] Peter Gulutzan
Of course 'a' <> 'ă', and of course 'a' <> 'â'.
But why do you say that 'a' <> 'à'?
Remember that we are trying to follow the
"Vietnamese Alphabetical System" rule.
It says "A<<\u00E0".
That is, according to our reading of this rule:
U+00E0 LATIN SMALL LETTER A WITH GRAVE
is the same as U+0041 LATIN CAPITAL LETTER A
at the primary level -- it only differs at the
secondary level. You are saying that this rule
is false, or you are saying that we misinterpret.
Are you sure?

But if it is true, then the collation strength should be 2.
I.e. the secondary level should be taken into account not
only for sorting, but for comparison as well.

I don't know whether this is good or bad news.

Possibly we can rewrite the tailoring by moving
the accent difference to the primary level instead
of the secondary level:

&A < 00E0 <<< 00C0
    < 1EA3 <<< 1EA2
    < 00E3 <<< 00C3
    < 00E1 <<< 00C1
    < 1EA1 <<< 1EA0

and so on for the other letters.

This will be not the same as
http://vietunicode.sourceforge.net/charset/v3.htm
but maybe it will do as a temporary solution
while we're working on WL#896
"primary, secondary and tertiary levels",
which will be visible soon on forge.mysql.com.
[18 Aug 2007 14:07] Quan Nguyen
Your reading is correct. At primary level, 'a' and 'à' are equal; but at secondary level, they are not, a << à, as an accent difference.

The same Vietnamese collation can also be expressed in a different way, as follows:

& ̀ << ̉ << ̃ << ́ << ̣
& A < ă <<< Ă < â <<< Â
& D < đ <<< Đ
& E < ê <<< Ê
& O < ô <<< Ô < ơ <<< Ơ
& U < ư <<< Ư

or

& \u0300 << \u0309 << \u0303 << \u0301 << \u0323
& A < \u0103 <<< \u0102 < \u00E2 <<< \u00C2
& D < \u0111 <<< \u0110
& E < \u00EA <<< \u00CA
& O < \u00F4 <<< \u00D4 < \u01A1 <<< \u01A0
& U < \u01B0 <<< \u01AF

If you altered the rule to get the sort correct for the time being, would you change it back to the correct form when you finish implementing WL#896?

On the other hand, you can put in the correct rule now and things will automatically fall into proper place with the implementation of WL#896. This way at least we get the correct sort at the primary level.
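To make the three levels in the rule above concrete, here is a minimal Python sketch of a multi-level sort key. It is only an illustration of the idea, not how MySQL implements collations. The names `TONE_RANK`, `LETTER_MARK_RANK`, `char_weights` and `sort_key` are hypothetical, and the sketch assumes that breve, circumflex, horn and the stroke in đ make separate letters (primary level), that the five tone marks differ at the secondary level, and that case differs at the tertiary level with lowercase first.

```python
import unicodedata

# Secondary weights: the five tone marks, in the order given by the rule above.
TONE_RANK = {"\u0300": 1, "\u0309": 2, "\u0303": 3, "\u0301": 4, "\u0323": 5}
# Marks that create separate alphabet letters: breve, circumflex, horn.
LETTER_MARK_RANK = {"\u0306": 1, "\u0302": 2, "\u031b": 3}


def char_weights(ch):
    """Return (primary, secondary, tertiary) weights for one character."""
    tertiary = 1 if ch.isupper() else 0
    lower = ch.lower()
    if lower == "\u0111":  # đ has no canonical decomposition, handle it by hand
        return ("d", 1), 0, tertiary
    decomposed = unicodedata.normalize("NFD", lower)
    base, marks = decomposed[0], decomposed[1:]
    letter_mark = max((LETTER_MARK_RANK.get(m, 0) for m in marks), default=0)
    tone = max((TONE_RANK.get(m, 0) for m in marks), default=0)
    return (base, letter_mark), tone, tertiary


def sort_key(text):
    """Compare all primary weights first, then all secondary, then all tertiary."""
    weights = [char_weights(ch) for ch in unicodedata.normalize("NFC", text)]
    return tuple([w[level] for w in weights] for level in range(3))


letters = ["ạ", "À", "â", "a", "á", "ă", "A", "ả", "ã", "à"]
print(sorted(letters, key=sort_key))
print(sorted(["cú", "cư", "cu", "cù"], key=sort_key))
```

With this key the single letters come out as a, A, à, À, ả, ã, á, ạ, ă, â, which is the order of the first rows of the v3.htm chart. A comparison that looks only at the first element of the key treats 'a' and 'à' as equal, which is the primary-level behaviour discussed in this thread.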
[24 Aug 2007 4:27] nguyễn thành phước
let's be me
[24 Aug 2007 15:09] Peter Gulutzan
Hello,

I asked earlier:
"You [ i.e. Quan Nguyen and tho su ] are saying that this rule
is false, or you are saying that we misinterpret.
Are you sure?"

Please look at Alexander Barkov's question again.

You said that 'à' should not be returned. Are you sure?
[27 Aug 2007 14:45] Duong Thien
Hi please fix this bug
[29 Aug 2007 2:27] Quan Nguyen
Hi Peter & Bar,

"It says 'A<<\u00E0'
That is, according to our reading of this rule:
U+00E0 LATIN SMALL LETTER A WITH GRAVE
is the same as U+0041 LATIN CAPITAL LETTER A
at the primary level -- it only differs at the
secondary level."

Your interpretation of the rule is correct. As such, the query

SELECT letter FROM letters WHERE letter='a';

should return only two records 'a' and 'A'.

I am positive that is the correct result set -- unless MySQL database engine performs the compare operation (letter='a') at primary level, in which case the others would also be included.
[5 Sep 2007 12:15] Peter Gulutzan
Yes, of course MySQL will only use the primary
weight for a comparison operation involving WHERE.
So, for 'a' << 'à', you have a choice between 
'a' = 'à' and 'a' < 'à'. You chose 'a' < 'à'.

So you want these characters to be DISTINCT and
IN ORDER, as if they're separate letters of the
Vietnamese alphabet:
a à ả ã á ạ
and so on, for every case where
http://vietunicode.sourceforge.net/charset/vietalphabet.html
says that the difference is "<<" not "<<<".

It's strange, but it's not a problem. Would any
Vietnamese users like to try making the change by
themselves, or is our help still needed?

Incidentally, version 5.1.21 is now available for download.
[14 Sep 2007 2:59] Quan Nguyen
Changing the rules would go against both the official Vietnamese rule and Unicode Collation Algorithm.

Since MySQL uses primary weight in WHERE clause, the query should return 'a', 'à', 'ả', 'ã', 'á', and 'ạ' (and the corresponding capitals). Returning only 'a' and 'A' would be achieved by an overriding COLLATE clause (as described in MySQL 5.0 Reference Manual), if that is possible; or with a second rule specifically made for comparison operations.
[14 Sep 2007 17:11] tue nguyen
This needs to be fixed. IT is growing rapidly in Viet Nam, and with this bug it is inconvenient for us to use MySQL, which is popular in our IT community. Thanks for your attention.
[15 Sep 2007 23:16] Hai Phan
Hi, I want to second Quan's suggestions.

The primary weight should be the strictest, where 'a' <> 'á' <> 'à' <> 'ả' <> 'ã' <> 'ạ', because they're practically distinct letters in the Viet alphabet.  The words "con cú" and "con cu" have very different meanings and searching for one should not return the other also, the same way searching for "owl" in English should not return "penis."

However, in some search applications, it is convenient for the user not having to type in the accent mark.  This is where the secondary weight should be used, where
   'a' = 'á' = 'à' = 'ả' = 'ã' = 'ạ' =
   'ă' = 'ắ' = 'ằ' = 'ẳ' = 'ẵ' = 'ặ' =
   'â' = 'ấ' = 'ầ' = 'ẩ' = 'ẫ' = 'ậ'
as is the current behavior of the utf8_general_ci collation rule.  Only few applications need this accent-neutral effect, so it should not be the primary.  Also I want 'd' = 'đ' in this rule, right now it is not.

Regards,
Hai Phan
[20 Sep 2007 12:47] hiep le cong
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
Pertinent information can be found
[21 Sep 2007 16:57] thanh lv
OK !
hope for this problem will be solved in the near future
[26 Sep 2007 8:35] AnhTuan 123
Please add Vietnamese collation in the next version. I love MySQL.
[30 Sep 2007 15:34] Quan Nguyen
Added a second Vietnamese collation for use with overriding COLLATE clause

Attachment: Index.xml (text/xml), 31.47 KiB.

[30 Sep 2007 15:56] Quan Nguyen
With the updated rules, assuming the table or the database is configured with utf8_vietnamese1_ci collation:

SELECT letter FROM letters WHERE letter='a';

would return 'a', 'à', 'ả', 'ã', 'á', 'ạ', and their capitals.

whereas the query

SELECT letter FROM letters WHERE letter='a' COLLATE utf8_vietnamese2_ci;

would return 'a' and its capital.
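The difference between the two result sets can be modelled outside the server with a short Python sketch. This is not MySQL code: `strip_tones`, `equal_primary` and `equal_tone_sensitive` are hypothetical helpers that only imitate the behaviour described above, assuming that the first collation ignores tone marks and case, while the second ignores only case.

```python
import unicodedata

# The five Vietnamese tone marks: grave, hook above, tilde, acute, dot below.
TONE_MARKS = {"\u0300", "\u0309", "\u0303", "\u0301", "\u0323"}


def strip_tones(text):
    """Remove tone marks but keep breve, circumflex and horn (separate letters)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(c for c in decomposed if c not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def equal_primary(a, b):
    # Imitates the utf8_vietnamese1_ci result: tones and case are ignored.
    return strip_tones(a).lower() == strip_tones(b).lower()


def equal_tone_sensitive(a, b):
    # Imitates the utf8_vietnamese2_ci result: tones matter, case does not.
    return (unicodedata.normalize("NFC", a).lower()
            == unicodedata.normalize("NFC", b).lower())


letters = ["a", "A", "à", "À", "ả", "ã", "á", "ạ", "ă", "â", "b"]
loose = [x for x in letters if equal_primary(x, "a")]
strict = [x for x in letters if equal_tone_sensitive(x, "a")]
print(loose)
print(strict)
```

Here `loose` holds a, à, ả, ã, á, ạ and their capitals that appear in the list, while `strict` holds only a and A. In both cases ă and â are excluded, because they are separate letters of the alphabet.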
[2 Oct 2007 5:55] duc tran
I have the same problems. Please add Vietnamese collation in the next version of MySQL. I think most web developers in Vietnam use MySQL. Thank you!
[5 Oct 2007 7:39] dung nguyen
I do know that it is very useful
[11 Oct 2007 2:12] Hoang Le Nhan
I run a Vietnamese website. I want MySQL to support the Vietnamese language for this app.
thanks
[24 Oct 2007 2:34] nguyen van dai
It is in our todo list.
Unfortunately, MySQL currently supports only primary weights in sorting, and as far as we
understand, secondary/tertiary weights are essential for a correct Vietnamese collation.

Of course, the support for secondary/tertiary weights is planned, but as it would require
all our users to rebuild all their tables, it cannot be done in 4.1, but only in 5.0 or
5.1.

If the collation based on primary weights only would be enough for you we can do it
trivially and very fast (such a collation could be used either only for comparison or for
sorting, but not for both)
[30 Oct 2007 3:01] anh hoang
Hello, I have a problem. I can't add four collations in UTF8. I can add two new collations, but when I add four, MySQL doesn't show all four of them. It only shows two.
I changed Index.xml as Alexander described in http://bugs.mysql.com/file.php?id=6814 .

Four ID collation i used: 211, 212,213,214
Mysql version: 5.1
[31 Oct 2007 6:08] Thanh Bui
Hi all,
I built mysql-5.1.22-rc with VS 2005 from source code, following the guide in the reference manual and http://bugs.mysql.com/bug.php?id=31217.
Then I use Quan Nguyen's Index.xml.
But:
 SELECT letter FROM letters WHERE letter='a' COLLATE utf8_vietnamese2_ci;
still returns 'a', 'à', 'ả', 'ã', 'á', 'ạ', and their capitals
(= result from SELECT letter FROM letters WHERE letter='a';)
Can you tell me why?
[8 Nov 2007 10:21] xh n
Please, 
fix some errors vietnamese font.
  Thanks.
[11 Nov 2007 21:06] Khamphouvieng Vongphachanh
Hello, I want you to help with Vietnamese character in MySQL
Please add Vietnamese collation for the ucs2 and utf8 Unicode character sets
It will be easy and convenient for me to write Vietnamese characters
Thanks alot
[13 Nov 2007 19:47] Peter Gulutzan
In response to the question from [31 Oct 7:08] Thanh Bui:
This is the expected result, explanations are in earlier comments.
If you have difficulty understanding, try contacting Quan Nguyen.

In response to the other "requests" in the last month or two:
There is no need to ask constantly for a Vietnamese collation.
There is an answer. Please read earlier comments.
[18 Nov 2007 3:27] Quan Nguyen
Bar & Peter,

I moved the accent difference to the primary level for utf8_vietnamese2_ci and ucs2_vietnamese2_ci collations, as suggested:

&A < 00E0 <<< 00C0
    < 1EA3 <<< 1EA2
    < 00E3 <<< 00C3
    < 00E1 <<< 00C1
    < 1EA1 <<< 1EA0

However, the query still returned the same result set as the vietnamese1 versions. Can you explain? Thanks.

I'm running mysql-5.1.22 Release Candidate version.
[19 Nov 2007 9:45] Alexander Barkov
Quan, can you please attach your latest version of Index.xml ?
[20 Nov 2007 2:07] Quan Nguyen
Here we go.

Attachment: Index.xml (text/xml), 31.56 KiB.

[20 Nov 2007 5:49] Alexander Barkov
There's a mistake:

<collation name="utf8_vietnamese2_ci" id="212">
<!--Vietnamese experimental collation-->
	<rulep>

It should be <rules> instead.
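A one-letter slip like this is easy to miss by eye. As a rough aid, the Python sketch below scans an Index.xml-style document for child elements of `<collation>` whose name looks like a misspelling of `rules`. The function name `suspicious_collation_tags`, the 0.7 similarity cutoff and the sample document (including id 211 for the first collation) are assumptions made for illustration; the server itself does not run any such check.

```python
import difflib
import xml.etree.ElementTree as ET


def suspicious_collation_tags(xml_text, expected="rules", cutoff=0.7):
    """List (collation name, tag) pairs whose tag resembles `expected` but is not it."""
    root = ET.fromstring(xml_text)
    findings = []
    for collation in root.iter("collation"):
        for child in collation:
            tag = child.tag
            close = difflib.get_close_matches(tag, [expected], n=1, cutoff=cutoff)
            if tag != expected and close:
                findings.append((collation.get("name"), tag))
    return findings


# Hypothetical, heavily shortened sample in the style of Index.xml.
SAMPLE = r"""<charsets>
  <charset name="utf8">
    <collation name="utf8_general_ci" id="33">
      <flag>primary</flag>
    </collation>
    <collation name="utf8_vietnamese1_ci" id="211">
      <rules><reset>A</reset><s>\u00E0</s></rules>
    </collation>
    <collation name="utf8_vietnamese2_ci" id="212">
      <!--Vietnamese experimental collation-->
      <rulep><reset>A</reset><p>\u00E0</p></rulep>
    </collation>
  </charset>
</charsets>"""

print(suspicious_collation_tags(SAMPLE))
```

On the sample this reports only the `rulep` element of utf8_vietnamese2_ci. Elements such as `<flag>` are left alone because they are not similar enough to `rules`.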
[21 Nov 2007 1:24] Quan Nguyen
It works! Thank you very much. Hope the new collations will make it in the next releases of MySQL.
[26 Nov 2007 3:10] nguyen phuong
Please, MySQL, add the Vietnamese language!
[12 Dec 2007 0:53] Lương Vân
I need known mysql
[12 Jan 2008 3:25] Nguyễn Ngọc Tuân
I want to say make love . Would you like OK?
[24 Jan 2008 13:19] minh pham
Error:
Vietnamese Collation Chart
a 0061	A 0041	à 00E0	À 00C0	ả 1EA3	Ả 1EA2	ã 00E3	Ã 00C3	á 00E1	Á 00C1	ạ 1EA1	Ạ 1EA0
ă 0103	Ă 0102	ằ 1EB1	Ằ 1EB0	ẳ 1EB3	Ẳ 1EB2	ẵ 1EB5	Ẵ 1EB4	ắ 1EAF	Ắ 1EAE	ặ 1EB7	Ặ 1EB6
â 00E2	Â 00C2	ầ 1EA7	Ầ 1EA6	ẩ 1EA9	Ẩ 1EA8	ẫ 1EAB	Ẫ 1EAA	ấ 1EA5	Ấ 1EA4	ậ 1EAD	Ậ 1EAC
b 0062	B 0042
c 0063	C 0043
d 0064	D 0044
đ 0111	Đ 0110
e 0065	E 0045	è 00E8	È 00C8	ẻ 1EBB	Ẻ 1EBA	ẽ 1EBD	Ẽ 1EBC	é 00E9	É 00C9	ẹ 1EB9	Ẹ 1EB8
ê 00EA	Ê 00CA	ề 1EC1	Ề 1EC0	ể 1EC3	Ể 1EC2	ễ 1EC5	Ễ 1EC4	ế 1EBF	Ế 1EBE	ệ 1EC7	Ệ 1EC6
g 0067	G 0047
h 0068	H 0048
i 0069	I 0049	ì 00EC	Ì 00CC	ỉ 1EC9	Ỉ 1EC8	ĩ 0129	Ĩ 0128	í 00ED	Í 00CD	ị 1ECB	Ị 1ECA
k 006B	K 004B
l 006C	L 004C
m 006D	M 004D
n 006E	N 004E
o 006F	O 004F	ò 00F2	Ò 00D2	ỏ 1ECF	Ỏ 1ECE	õ 00F5	Õ 00D5	ó 00F3	Ó 00D3	ọ 1ECD	Ọ 1ECC
ô 00F4	Ô 00D4	ồ 1ED3	Ồ 1ED2	ổ 1ED5	Ổ 1ED4	ỗ 1ED7	Ỗ 1ED6	ố 1ED1	Ố 1ED0	ộ 1ED9	Ộ 1ED8
ơ 01A1	Ơ 01A0	ờ 1EDD	Ờ 1EDC	ở 1EDF	Ở 1EDE	ỡ 1EE1	Ỡ 1EE0	ớ 1EDB	Ớ 1EDA	ợ 1EE3	Ợ 1EE2
p 0070	P 0050
q 0071	Q 0051
r 0072	R 0052
s 0073	S 0053
t 0074	T 0054
u 0075	U 0055	ù 00F9	Ù 00D9	ủ 1EE7	Ủ 1EE6	ũ 0169	Ũ 0168	ú 00FA	Ú 00DA	ụ 1EE5	Ụ 1EE4
ư 01B0	Ư 01AF	ừ 1EEB	Ừ 1EEA	ử 1EED	Ử 1EEC	ữ 1EEF	Ữ 1EEE	ứ 1EE9	Ứ 1EE8	ự 1EF1	Ự 1EF0
v 0076	V 0056
x 0078	X 0058
y 0079	Y 0059	ỳ 1EF3	Ỳ 1EF2	ỷ 1EF7	Ỷ 1EF6	ỹ 1EF9	Ỹ 1EF8	ý 00FD	Ý 00DD	ỵ 1EF5	Ỵ
[24 Jan 2008 15:27] Vu Nhat Chi Si
I want to become a member of diendantinhoc.net
[1 Feb 2008 4:29] Quan Nguyen
Just downloaded, installed the latest release 5.0.51a, and determined that it still does not have the Vietnamese collation incorporated.
[18 Feb 2008 5:33] minh le
Please support the collation in Vietnamese unicode. We really need it.

Thank you!
[4 Mar 2008 4:28] thien ngo
Vietnamese collation in MySQL is very important and urgent. It affects many Vietnamese developers and yourselves. If you don't fix this problem, you will lose your share of Vietnam's market. Please fix it as soon as possible. I think it would be very easy for you to fix in a short time!!!
[5 Mar 2008 17:19] Tran Vinh
Vietnamese language collation for the ucs2 and utf8 Unicode character sets is really necessary for building a website without some foolish errors.
Please fix it. Thank you.
[8 Mar 2008 11:32] Tuan Mac Duy
Please support the collation in Vietnamese unicode. We really need it.

Thank you!
[11 Mar 2008 3:23] Phạm Hưng
Dear MySQL Team!
I am now working with PHP and MySQL, and I ran into this bug, so I hope the MySQL developers will fix it in the next version.
Thanks a lot!
[19 Mar 2008 13:01] Susanne Ebrecht
Dear Vietnamese,

Unfortunately, there are more than 1000 spoken languages on Earth. Vietnamese is only one of them. It's really hard to implement just the 400 most common languages.

For example, we also have lots of open issues with collations for Western European languages.

Peter and Alexander have already received information about Vietnamese collation here. We will implement this as soon as possible, but we can't promise to do it in one of the next versions.

This doesn't mean that Vietnamese is less important than Western European languages; it just means we have a lot of work to do on this topic.

Until we fix this issue, you can help yourselves by adding your own collation.

Look here for more information:
http://forge.mysql.com/wiki/How_to_Add_a_Collation
http://forge.mysql.com/w/images/b/b7/HowToAddACollation.pdf
[22 Mar 2008 13:17] DINH KHUYEN DO
yes,thank you
[29 Mar 2008 12:42] Kent Willan
I have a few projects with MySQL, so I want MySQL to support Vietnamese collation.
Please help me (and other Vietnamese).
Thanks.
[6 Apr 2008 0:49] Thanh Sang
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
[15 Apr 2008 15:07] ho dac quyen
I think MySQL's Developers Team should support Vietnamese Unicode (UTF-8) characters soon.
That will be wonderful.

Thank you very much.

Best regard,
Ho Dac Quyen
[11 Jun 2008 18:04] Trương Minh Tuyền
please, fix it!
[17 Jun 2008 8:28] Duy Hai Hoang
I'm a Vietnamese website developer and administrator using MySQL. When i use Vietnamese in my website and I want to search "đỉnh" (not Đỉnh) or arrange in alphabetical order it's not correct! The result are "đỉnh" and "Đỉnh". Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
[19 Jun 2008 15:30] le bich vien
hello ! my name is lebichvien
[26 Jun 2008 5:34] van thanh tran
please help us !
i love mysql
[10 Jul 2008 22:51] Peter Gulutzan
To all the people who have written comments recently:
We've shown how you can add a Vietnamese collation yourself.

Please read the comment by the bug reporter 8 months ago:
"
[21 Nov 2007 2:24] Quan Nguyen

It works! Thank you very much. Hope the new collations will make
it in the next releases of MySQL.
"
[15 Aug 2008 6:05] Tran Hai Linh
I'm Linh
I hope MySQL fixes this error in the near future.
[2 Sep 2008 8:11] Jean Christophe André
So, 4 years later, trying to summarize things...

We have had a working solution since November 2007, thanks to Quan with Peter's and Alexander's help, which is non-intrusive since it only requires an addition to the /usr/share/mysql/charsets/Index.xml file.

I've just downloaded MySQL versions 5.0.51a and 5.1.26rc1, and neither of them includes this addition.

Is there still something else to do to help this enter the MySQL mainstream?
[2 Sep 2008 8:28] Jean Christophe André
I would like to give a special answer to Susanne's last comment which I really find unfair to Vietnamese people...

First, I can hardly understand how you can talk about the hard work of implementing the 400 most common languages when I see that this master file (Index.xml) has not been modified for 5 years (copyright still at 2003)... :-/

Don't get me wrong, I'm pretty sure you are working hard at it! But maybe you are just concentrating too much on a general universal (magical?) solution for every language in the world, forgetting that meanwhile your customers really need it to work *right now*, straight after installation...

Secondly, according to some website statistics, Vietnamese is the 16th most spoken language in the world, just two positions after French. See: http://www.photius.com/rankings/languages2.html

This is not surprising given its population worldwide. And if you take a close look at the FLOSS world, you'll find Vietnamese contributions in such major projects as the Linux kernel.

So, please, consider this request for Vietnamese support as a valuable one. Vietnamese people are really moving to FLOSS usage now and it really is the appropriate timing to show *your* support for their language and encourage them to use more of MySQL products and services!

Regards, J.C. (French FLOSS user since 1994, living in Vietnam since 1999).
[11 Sep 2008 9:47] Alexander Barkov
Jean, thank you very much for offering help!
We're very grateful to you and other people who
tested the Vietnamese collation. We have collected
feedback, and it seems everybody is happy with it.
We'll discuss shortly which version it can be added to.
Thanks!
[8 Oct 2008 0:29] Tuy Vuong
Hello Quan,
I tried this file http://vietunicode.sourceforge.net/howto/Index.xml and I got this error: #1273 - Unknown collation: 'utf8_vietnamese1_ci'
Can you help? Thanks
[22 Oct 2008 16:17] vu si hoi
Please help me!!!!
[19 Dec 2008 6:45] Hoang Tuan
My name is Tuan, a PHP&MySQL developer from Vietnam.
I have been waiting for this for 2 years. Please support Vietnam's developer community.

Many thanks!
[22 Dec 2008 2:41] [ name withheld ]
Hi everybody !

I tried this file http://vietunicode.sourceforge.net/howto/Index.xml and I got this error:

#1273 - Unknown collation: 'utf8_vietnamese1_ci'

This bug was reported 5 years ago, so why doesn't MySQL support this yet?
[9 Jan 2009 9:52] Tai Nguyen
Please add Vietnamese collation in your next version. Thanks!
[14 Jan 2009 5:46] duy nguyen
Vietnamese will be popular in the future, so pls add Vietnamese collation for the UCS2 and UTF8 Unicode character sets.
[24 Feb 2009 7:14] Dang Viet Chau
Hi Alexander Barkov & all,

We are a group building a version of MySQL that fully supports Vietnamese collation for the ucs2 and utf8 character sets.

We have read your instructions [HowtoAddACollation.pdf]. Currently, we really need instructions for adding a Vietnamese built-in collation (building from source code).

Thank you for your help,
[24 Feb 2009 14:18] Francesco Battaglia
I tried the workaround in a MySQL Community Server with InnoDB and MyISAM, and in a MySQL Cluster with NDBCLUSTER.

I tried to create this table:

CREATE TABLE prova_vietnam (`ID` int(11) NOT NULL auto_increment,
vietnamita1 CHAR(20) CHARACTER SET utf8 COLLATE utf8_vietnamese1_ci,
vietnamita2 char(20) CHARACTER SET utf8 COLLATE utf8_vietnamese2_ci,
  PRIMARY KEY  (`ID`))ENGINE=NDBCLUSTER;

With InnoDB and MyISAM it works; with NDBCLUSTER it does not:

Error -> Can't create table 'prova.prova_vietnam' (errno: 140)

Could you explain the reason?
[24 Feb 2009 14:23] Jean Christophe André
@Francesco
Are you sure your problem is related to the Vietnamese collation?
Did you get any success with the same command but without Vietnamese collation?
I can hardly see how it is related…
[24 Feb 2009 14:26] Jean Christophe André
@Alexander
Any news since last September's discussion?
[24 Feb 2009 14:33] Francesco Battaglia
CREATE TABLE prova_china (`ID` int(11) NOT NULL auto_increment,
cinese CHAR(20) CHARACTER SET big5 COLLATE big5_chinese_ci,
  PRIMARY KEY  (`ID`))ENGINE=NDBCLUSTER;

I tried this and it works.
[24 Feb 2009 14:39] Francesco Battaglia
CREATE TABLE prova_vietnam (`ID` int(11) NOT NULL auto_increment, vietnamita1 CHAR(20), vietnamita2 char(20),
  PRIMARY KEY  (`ID`))ENGINE=NDBCLUSTER;

this is ok too!
[24 Feb 2009 14:45] Jean Christophe André
@Francesco
And does it work "with the same command but without Vietnamese collation" (like I've asked precisely)?

CREATE TABLE prova_vietnam (`ID` int(11) NOT NULL auto_increment,
vietnamita1 CHAR(20) CHARACTER SET utf8,
vietnamita2 char(20) CHARACTER SET utf8,
  PRIMARY KEY  (`ID`))ENGINE=NDBCLUSTER;
[24 Feb 2009 14:48] Francesco Battaglia
Yes it works!
[25 Feb 2009 10:34] Francesco Battaglia
So... what is the response?

Thanks
[25 Feb 2009 11:42] Jean Christophe André
@Francesco

Sorry, but on my side I can't give you a better answer… I'm not a MySQL developer, only a FLOSS guy living in Vietnam and trying to help make Vietnamese support officially available in MySQL.

About your specific problem, I'm wondering whether the collation code is the same in the case of the NDB backend… You already put the Vietnamese collation definition in your Index.xml, didn't you? I think so, since you wrote that it works with InnoDB and MyISAM… So I can't see any explanation except that collations are managed differently for NDB databases…
[25 Feb 2009 11:54] Francesco Battaglia
Ok! Jean, thanks a lot for your help, 
I had tried to add the collation to my Index.xml but at the moment it doesn't work with ndb engine as you supposed!

Thanks in advance to all who will help me!
[27 Feb 2009 14:56] Peter Gulutzan
Responding to an earlier comment.

"[24 Feb 8:14] Dang Viet Chau

Hi Alexander Barkov & all,

We are a group building a version of MySQL that fully supports Vietnamese
collation for the ucs2 and utf8 character sets.

We have read your instructions [HowtoAddACollation.pdf]. Currently, we really
need instructions for adding a Vietnamese built-in collation (building from
source code).

Thank you for your help,"

The latest instructions are here:
http://blogs.mysql.com/peterg/2008/05/19/instructions-for-adding-a-new-unicode-collation/
in section "ADDING A NEW COLLATION BY CHANGING THE SOURCE CODE".
We believe the instructions are still correct.
[24 Mar 2009 8:33] Francesco Battaglia
I tried the workaround to add the Vietnamese collation in MySQL Community Server 5.1.31.

I created a db with this command:
CREATE DATABASE prova CHARACTER SET utf8 COLLATE utf8_vietnamese1_ci;

then I created a table:
create table prova (nome varchar(10))Engine=InnoDB;

When I try to insert a value longer than the column size, for example:

insert into prova values ('01234567891');

I get this message:
Lost connection to MySQL server during query

The MySQL server restarts and closes all connections, writing this log:

/u01/app/mysql/bin/mysqld(my_print_stacktrace+0x21)[0x84d4341]
/u01/app/mysql/bin/mysqld(handle_segfault+0x381)[0x81fefa1]
/lib/tls/libpthread.so.0[0xd34a98]
/u01/app/mysql/bin/mysqld(_ZN15Field_varstring5storeEPKcjP15charset_info_st+0x1e0)[0x81d9c30]
/u01/app/mysql/bin/mysqld(_ZN11Item_string13save_in_fieldEP5Fieldb+0x50)[0x8152430]
/u01/app/mysql/bin/mysqld(_Z36fill_record_n_invoke_before_triggersP3THDPP5FieldR4ListI4ItemEbP19Table_triggers_list14trg_event_type+0x4d)[0x824712d]
/u01/app/mysql/bin/mysqld(_Z12mysql_insertP3THDP10TABLE_LISTR4ListI4ItemERS3_IS5_ES6_S6_15enum_duplicatesb+0xa50)[0x8282950]
/u01/app/mysql/bin/mysqld(_Z21mysql_execute_commandP3THD+0x1b19)[0x8210429]
/u01/app/mysql/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x340)[0x8215fa0]
/u01/app/mysql/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x11e0)[0x8217190]
/u01/app/mysql/bin/mysqld(_Z10do_commandP3THD+0xe0)[0x8217a10]
/u01/app/mysql/bin/mysqld(handle_one_connection+0x233)[0x8208663]
/lib/tls/libpthread.so.0[0xd2e3cc]
/lib/tls/libc.so.6(__clone+0x5e)[0xba81ae]
090324  9:42:36 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8384512
read_buffer_size=131072
max_used_connections=6
max_threads=151
threads_connected=6
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 337721 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x9382228
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xb16973a8 thread_stack 0x30000
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x938bbc8 = insert into prova values ('01234567891')
thd->thread_id=8
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
090324 09:42:36 mysqld_safe Number of processes running now: 0
090324 09:42:36 mysqld_safe mysqld restarted
InnoDB: Log scan progressed past the checkpoint lsn 0 42918596
090324  9:42:37  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 0 42918606
090324  9:42:37  InnoDB: Started; log sequence number 0 42918606
090324  9:42:37 [Note] Event Scheduler: Loaded 0 events
090324  9:42:37 [Note] /u01/app/mysql/bin/mysqld: ready for connections.
Version: '5.1.31'  socket: '/tmp/mysql.sock'  port: 3306  MySQL Community Server (GPL)

Any Idea???
[1 May 2009 13:09] Harry Dang
I'm a Vietnamese website developer using MySQL, and I have problems with sorting and searching Vietnamese text in MySQL. I hope you can support a collation for Vietnamese.

Lots of thanks to the MySQL developer team.
[4 May 2009 2:38] Quoc Duong
Please fix this bug to support Vietnamese.
I really need it soon.

Thank you!
[7 May 2009 9:12] Cloud Strife
The bug is not really fixed. These collations work well with a, e, i, o, d but not with u. And what about fulltext search? Does it work?
[10 Jun 2009 19:06] Hien Van
Although the Vietnamese number only about 90 million people, their language ranks among the top 15 ("G15") languages in the world and is highly uniform. English ranks first but has low uniformity: many countries use English as the language of state administration, while their people use their own languages and scripts.
	It is true that MySQL has not managed to do what we (and MySQL itself) would like. In this matter we, the Vietnamese, lose out the most.
	To digress a little: if Borland had supported Unicode, I and many others would have given preference to the products of Enterprise (Borland). Regrettably, Borland did not do what Microsoft did earlier; I hope MySQL will act differently from Borland. Sitting at the bottom of the well as I do, I do not know who could compete with Microsoft, MySQL, or Oracle for a market of 90 million people: 90 million people together with their connections around the world.

	And what about our side? In fact, Vietnamese does not have many characters of its own: aáàảãạ / ăắằẳẵặ / âấầẩẫậ / eéèẻẽẹ / êếềểễệ / iíìỉĩị / oóòỏõọ / ơớờởỡợ / ôốồổỗộ / uúùủũụ / ưứừửữự / yýỳỷỹỵ and dđ (12 * 5 + 1 = 61 characters, including diacritics / tone marks; if letters and tone marks are counted separately, there are even fewer). In VBA / Excel / Access (which do not support Unicode) someone has already managed to write a utility suite for Excel ("BỘ TIỆN ÍCH TRONG EXCEL"), so are we really helpless in MySQL? Back in the DOS days, VietRes already handled input and output of the Quốc ngữ script! What we lack here is consensus and determination.
	To this day, in the eyes of MySQL / Microsoft / Oracle, Vietnam is a small market compared with many other countries (even though those countries have smaller populations and more difficult scripts than Vietnam). They have many other markets; so why don't we join forces (or work individually) to do this, instead of sitting and waiting indefinitely, hugging the tree and waiting for the rabbit (waiting until we are rich before acting)? And why is the promotional campaign for Hạ Long (tourism) more important than this?
	I hope someone can help me translate this (I cannot write it myself). Thanks in advance.
[26 Jun 2009 9:35] Francesco Battaglia
I had another problem with the Vietnamese collation defined in http://vietunicode.sourceforge.net/howto/Index.xml

see bug http://bugs.mysql.com/bug.php?id=45645&thanks=3&notify=199

At the moment MySQL says that this is not a bug... mah...

But if you apply the lower() function to a column created with the Vietnamese collation, the MySQL server closes all connections for all users and restarts...

Adding a user-defined collation seems to be very dangerous and seems not to be supported by MySQL!

see comment by  Omer BarNir:

"Issue cannot be repeated with a MySQL supplied character set"

To reproduce, you can run:

CREATE DATABASE prova CHARACTER SET utf8 COLLATE utf8_vietnamese1_ci;

create table prova (nome varchar(10))Engine=InnoDB;

insert into prova (nome) values ('hello!');

select * from prova where lower(nome)=lower('N');
[26 Jun 2009 10:04] Francesco Battaglia
"Mah" is an Italian word that stands for "I have no words", sorry... :)
[6 Jul 2009 8:44] Tom Doan
Hello,

I'm also a Vietnamese Web developer who has worked with MySQL in the past. I noticed that many people are having trouble with proper collation in regards to searching Vietnamese text. If you cannot wait for this bug to be fully fixed, then there is a solution to this! It's not a perfect solution, but it works right now. What you need to do to search Vietnamese text properly is (1) create a secondary table that contains the exact same Vietnamese text, BUT with all accent marks stripped off, and (2) write a search preprocessor that does translation for you before doing string comparison operations.

For example, let's say you have a Drupal or Joomla node that contains the text "người đàn ông trẻ tuổi"; this string would be saved to the main table unaltered (with all accent marks intact), but it would also be saved to a secondary table as "nguoi dan ong tre tuoi". When you output HTML, you would render from the main table so that accent marks are displayed; however, when you perform search operations, it would be off the secondary table. You have to assume that when a visitor searches your site they will enter "nguoi" or "người". If they enter "nguoi" then the search should be okay; if they search for "người" then the preprocessor intercepts the query, then translates "người" to "nguoi" before performing the search operation on the secondary table. When search result set is generated, again it would access the main table so that accent marks are shown.

This same solution can also be applied to sorting. As the saying goes, most problems in life can be solved with engineering :-). I hope everything made sense and was relevant to your problem(s).

Tom
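
The accent-stripping step of the workaround above could be sketched roughly as follows. This is only an illustration in Python (the comment names no language): the function names `strip_vietnamese_marks`, `search_key` and `search`, and the two dictionaries standing in for the main and secondary tables, are all hypothetical. It relies on Unicode NFD decomposition to separate the marks from the base letters, and maps đ/Đ by hand because those letters have no decomposition.

```python
import unicodedata

# đ/Đ (d with stroke) do not decompose under NFD, so map them explicitly.
_D_STROKE = str.maketrans({"đ": "d", "Đ": "D"})


def strip_vietnamese_marks(text):
    """Return text with tone marks and vowel marks (circumflex, breve, horn) removed."""
    decomposed = unicodedata.normalize("NFD", text)
    # After NFD, every mark is a separate combining character (category Mn).
    base = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return base.translate(_D_STROKE)


def search_key(text):
    """Hypothetical key: what gets stored in the secondary table and applied to queries."""
    return strip_vietnamese_marks(text).lower()


# Hypothetical in-memory stand-ins for the main and secondary tables.
main_table = {1: "người đàn ông trẻ tuổi", 2: "Đỉnh núi"}
secondary_table = {row_id: search_key(text) for row_id, text in main_table.items()}


def search(query):
    """Match against the stripped keys, but return the original accented text."""
    needle = search_key(query)
    return [main_table[row_id] for row_id, key in secondary_table.items() if needle in key]


print(search("nguoi"))
print(search("người"))
```

Both queries go through the same key function, so "nguoi" and "người" find the same row. The trade-off is that words differing only in their marks (for example "ma", "má" and "mà") become indistinguishable, so this is a search aid rather than a replacement for a real Vietnamese collation.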
[28 Jul 2009 4:07] Hoang Xuan Tam
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
Pertinent information can be found at:

http://vietunicode.sourceforge.net/charset/vietalphabet.html
http://oss.software.ibm.com/cgi-bin/icu/lx/?d_=en_US&_=vi

How to repeat:
The Vietnamese collation is currently not supported.

Thanks a lot
[10 Aug 2009 7:16] Mai Thanh Duy
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
Thanks!
[4 Mar 2010 18:47] tdchien tdchien
I hope this problem will be solved in the near future.
[24 Apr 2010 1:25] Mr Hung
Please support the Vietnamese language in MySQL. A million web developers in Viet Nam need it.
[1 May 2010 10:28] Hung Nguyen
Hello,
Is there any update since the last comment?
[29 Jul 2010 13:39] Alexander Barkov
Vietnamese collation has been added into mysql-5.6.
[5 Sep 2010 9:40] Quyet Nguyen The
I hope you will fix it soon.
Thanks a lot.
[5 Oct 2010 7:42] Chuc Nguyen Van
Please add Vietnamese language collation for the ucs2 and utf8 Unicode character sets.
Thanks!
[17 Apr 2011 9:53] Nhu Quynh
Please Add Vietnamese collation for the ucs2 and utf8 Unicode character sets
Because it is necessary for us
Thanks a lot
[17 Apr 2011 13:26] Peter Gulutzan
A Vietnamese collation has been available for years,
for people willing to take the trouble to modify the
distribution. The collation is now "built in" with
MySQL version 5.6, as we announced in our manual
http://dev.mysql.com/doc/refman/5.6/en/news-5-6-0.html
[23 May 2011 14:05] Valerii Kravchuk
Bug #61258 was marked as a duplicate of this one.