Bug #34156 Thai language displaying as ????? from MySQL
Submitted: 30 Jan 2008 0:57
Modified: 17 Mar 2008 9:59
Reporter: Simon C
Status: Not a Bug
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version: MySQL client version: 5.0.41
OS: MacOS (Leopard)
Assigned to:
CPU Architecture: Any

[30 Jan 2008 0:57] Simon C
Description:
Using phpMyAdmin 2.10.2 and MySQL client version 5.0.41, I cannot get Thai text to display from MySQL. I have set everything to Thai encoding and collation, and have also tried UTF-8, but all I get in my web page is ????????????? where Thai characters should display. I'm sending the header

<?php header ("Content-type: text/html; charset=UTF-8");?> and have the meta http-equiv as well.

How to repeat:
Create a database and a table with the collation set to UTF-8, then repeat with TIS-620.
[30 Jan 2008 0:59] Simon C
index_th.php

Attachment: index_th.php (text/php), 1.84 KiB.

[30 Jan 2008 5:26] Valeriy Kravchuk
Thank you for the problem report. Please try to repeat with a newer version, 5.0.51a, and report the results. Can you see Thai characters properly from the mysql command line client?
[1 Mar 2008 0:01] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[17 Mar 2008 9:59] Susanne Ebrecht
I'll set this to "Not a Bug". Please feel free to reopen it if you still have problems after following the advice below.

To handle character sets correctly, make sure of the following:

1) Your column should use the right character set, for example tis620.
You can check it with:
mysql> show create table YOUR_TABLE_NAME;

2) For inserting data, your character_set_client variable and your input environment must use the same encoding.

For example:
If character_set_client=tis620, make sure that your terminal is set to TIS-620.
Then you can insert the data.

Another example: your terminal uses utf8 and your column uses tis620.
Then first run:
mysql> set names utf8;
This sets character_set_client, character_set_connection and character_set_results to utf8.
Now you can insert the data as utf8 and the server will convert it from utf8 to tis620, which means it is stored as tis620 in the table.
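
The conversion described above can be imitated outside MySQL with Python's standard codecs. This is only a sketch of the idea, not the server's actual code: the helper name convert_for_storage is hypothetical, and "tis_620" is Python's codec name for the charset MySQL calls tis620.

```python
# Sketch only: mimics what the server does after "set names utf8"
# when the target column is tis620. No MySQL connection is involved.

def convert_for_storage(data: bytes, client_charset: str, column_charset: str) -> bytes:
    """Interpret the incoming bytes as client_charset, re-encode as column_charset.

    Hypothetical helper for illustration; not a MySQL API.
    """
    return data.decode(client_charset).encode(column_charset)

# The Thai word "ไทย" (3 characters), as a UTF-8 terminal would send it.
client_bytes = "\u0e44\u0e17\u0e22".encode("utf-8")

stored = convert_for_storage(client_bytes, "utf-8", "tis_620")

print(len(client_bytes))  # bytes sent by the utf8 client
print(len(stored))        # bytes kept in the tis620 column
```

If character_set_client did not match what the terminal really sends, the decode step would misread the bytes, which is how wrong data ends up in the table.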

You can check with length() whether your data are stored correctly in the database.
tis620 uses one byte for each Thai character and utf8 uses three bytes for each Thai character.
Insert a single Thai character into your text field and then use:
select length(COL) from TAB where PK=X;
Any other where clause that restricts the result to that one row works too. If the table has only one row, you don't need a where clause.

If your column uses tis620 and length() is not 1, the data are stored wrongly in your database. If your column uses utf8 and length() is not 3 (for example 1 or 6), the data are stored wrongly too.
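
The byte counts that length() should report can be reproduced with Python's codecs. The sketch below assumes a single Thai character, ก (U+0E01). The double-encoded case is one illustrative way data can go wrong, and Python's latin-1 codec only approximates MySQL's latin1.

```python
# Sketch: expected byte lengths for one Thai character, computed without MySQL.

def byte_length(text: str, charset: str) -> int:
    """Number of bytes text occupies when encoded in charset."""
    return len(text.encode(charset))

thai_char = "\u0e01"  # the Thai letter "ก"

tis620_len = byte_length(thai_char, "tis_620")  # what a tis620 column should hold
utf8_len = byte_length(thai_char, "utf-8")      # what a utf8 column should hold

# One wrong case (assumption for illustration): UTF-8 bytes were treated as
# latin1 on the way in and then converted to UTF-8 a second time.
double_encoded = thai_char.encode("utf-8").decode("latin-1").encode("utf-8")

print(tis620_len, utf8_len, len(double_encoded))
```

A length of 1 in a utf8 column usually means a single "?" was stored instead of the Thai character.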

3) For selecting data, the same applies as for inserting data.
Check which encoding your output environment needs. Set the environment to TIS-620 or use:
set names CHARACTER_SET_OF_YOUR_OUTPUT_ENVIRONMENT

The difference between input and output is that different variables are involved.
For example, character_set_results is needed for output but not for input.
"set names" sets all the necessary variables to the right values.

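One plausible source of the ????? in the original report, offered as an assumption since the thread never confirms it: when character_set_results names a charset that has no Thai characters (latin1, for instance), each character that cannot be converted comes back as "?". Python's codecs can imitate that substitution. Nothing below talks to MySQL.

```python
# Sketch: imitate a result charset that cannot represent Thai characters.

thai_text = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"  # the Thai greeting "สวัสดี", 6 code points

# Assumed mismatch: results converted to latin1, which has no Thai characters.
as_latin1 = thai_text.encode("latin-1", errors="replace")

# Matching charsets: results converted to utf8 for a UTF-8 web page.
as_utf8 = thai_text.encode("utf-8")

print(as_latin1)                              # one "?" per Thai code point
print(as_utf8.decode("utf-8") == thai_text)   # the utf8 path round-trips
```

The question marks are produced during conversion, so changing only the page's Content-type header cannot bring the original characters back.
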
By following these rules, you can be sure your data are stored correctly in the database and you won't run into problems.

If you still get weird output while following these rules, your data were stored wrongly. To repair this, you have to dump the database, correct the wrong data manually in the dump and import the dump again.

Summary:
Make sure that the environment encoding and character_set_client use the same encoding/character set.
For selecting data, also make sure that the environment encoding, character_set_client and character_set_results use the same encoding/character set.
Switch your MySQL connection encoding to your environment encoding by using:
"set names MY_ENVIRONMENT". Or switch your environment encoding to your system encoding.