Description:
My test environment is a Pentium 4 running Windows XP SP2, IIS 6.0, PHP 5.03, and PostNuke.
1) I was looking at upgrading MySQL from 4.0.20 to a newer version.
2) To test the upgrade, I backed up my PostNuke portal data using MySQL Administrator 1.0.19. The PostNuke site uses UTF-8 as its character set, and it had no problems receiving or displaying Korean characters in UTF-8 mode.
3) Then I installed 4.1.11 with UTF8 as the default character set. The Korean text did not restore properly from the backup .sql file and showed up as question marks in the browser. I have backed up and restored from .sql files in 4.0.20 with no problems, but had no luck restoring the Korean text into 4.1.11. Interestingly, the restored data displays correctly in MySQL Query Analyzer (the Korean text was: 아이구). However, when it is pulled from the database via PHP v5.03 and sent to the browser, it shows up as ???.
4) Besides the restore problem, I tried inserting new Korean text into the database through the PHP PostNuke interface. This time I had only partial success: some of the characters did not map correctly.
아이구. 잘됄건지모라갯다.
became
아쿴구. 잘뿄건지모뿼갯다.
It appears that the character set mapping is off for some characters: 이, 됄 and 라 in the sample above (됄 came back as 뿄).
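Looking at the sample at the byte level may help narrow this down. In UTF-8, 이 is EC 9D B4, 됄 is EB 90 84, and 라 is EB 9D BC, while the garbled 쿴, 뿄, and 뿼 differ from them only in the middle byte, which is BF. The bytes 0x90 and 0x9D have no character assigned in Windows-1252, which suggests (but does not prove) that the text passed through a single-byte conversion somewhere. The sketch below is only an illustration of that pattern: `simulate_garbling` is a hypothetical helper, and the substitution with 0xBF is inferred from the sample rather than taken from any MySQL or PHP documentation.

```php
<?php
// Hypothetical helper, not part of the original test script.
// Replaces every byte that has no character assigned in Windows-1252
// (0x81, 0x8D, 0x8F, 0x90, 0x9D) with 0xBF and leaves all other bytes alone.
function simulate_garbling($utf8)
{
    return strtr($utf8, "\x81\x8D\x8F\x90\x9D", "\xBF\xBF\xBF\xBF\xBF");
}

// Save this file as UTF-8 so the literals below are UTF-8 byte strings.
$original = '아이구. 잘됄건지모라갯다.';
$reported = '아쿴구. 잘뿄건지모뿼갯다.';

$simulated = simulate_garbling($original);

// Hex dumps show the single changed byte in each affected syllable.
echo bin2hex('이'), ' -> ', bin2hex(simulate_garbling('이')), "\n";
echo bin2hex('라'), ' -> ', bin2hex(simulate_garbling('라')), "\n";
echo ($simulated === $reported) ? "matches the report\n" : "does not match\n";
```

If the simulated string matches the reported one, the unaffected syllables (아, 구, 잘, and so on) are simply the ones whose UTF-8 bytes all have Windows-1252 assignments.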
How to repeat:
1) To test the restore, install MySQL 4.0, create a table, and place some Korean text into it.
2) Back it up with MySQL Administrator.
3) Delete the schema and restore it as a test.
4) Note that it works in 4.0.
5) Uninstall 4.0 and install 4.1, or move the backup file to a computer with 4.1 installed.
6) Restore the database.
7) View the Korean text in IE 6.0 through a web server and script.
8) Note that the Korean text appears as ????.
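Since the restored text looks correct in one client and wrong in another, it may be worth recording how the PHP connection itself is configured at steps 7 and 8. MySQL 4.1 added per-connection character set variables (`character_set_client`, `character_set_connection`, `character_set_results`). The following is a minimal diagnostic sketch, assuming the legacy `mysql_*` extension and the same placeholder host, user, and password used in the test script in this report.

```php
<?php
// Diagnostic sketch only; host, user, and password are placeholders.
$link = mysql_connect('localhost', 'root', 'password')
    or die('Could not connect: ' . mysql_error());

// List the character sets the server applies to this connection.
$result = mysql_query("SHOW VARIABLES LIKE 'character_set%'")
    or die('Query failed: ' . mysql_error());

// Each row is a (variable name, value) pair.
while ($row = mysql_fetch_row($result)) {
    echo $row[0], ' = ', $row[1], "<br>\n";
}

mysql_free_result($result);
mysql_close($link);
```

Comparing this output between the PHP connection and the client that displays the text correctly would show whether the two are using different connection character sets.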
TEST FOR KOREAN TEXT ENTERED INTO DB
To demonstrate that UTF-8 encoded Korean text does not map quite right in MySQL 4.1, I've included a PHP script.
//If you don't have a test database, run the following SQL statements:
//CREATE DATABASE test;
//USE test;
//CREATE TABLE tbl_test (strTEXT VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin);
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Test mysql's ability to handle Korean text using UTF 8</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<?php
// Connecting, selecting database
$link = mysql_connect('localhost', 'root', 'password')
    or die('Could not connect: ' . mysql_error());
echo 'Connected successfully';
mysql_select_db('test') or die('Could not select database');

// Inserting the test data (this file must be saved as UTF-8)
$query = "INSERT INTO tbl_test (strTEXT) VALUES ('This is a test of UTF 8. 아이구. 잘됄건지모라갯다.')";
$result = mysql_query($query) or die('Query failed: ' . mysql_error());

// Performing SQL query
$query = 'SELECT * FROM tbl_test';
$result = mysql_query($query) or die('Query failed: ' . mysql_error());

// Printing results in HTML
echo "<table>\n";
while ($line = mysql_fetch_array($result, MYSQL_ASSOC)) {
    echo "\t<tr>\n";
    foreach ($line as $col_value) {
        echo "\t\t<td>$col_value</td>\n";
    }
    echo "\t</tr>\n";
}
echo "</table>\n";

// Free resultset
mysql_free_result($result);

// Closing connection
mysql_close($link);
?>
</body>
</html>
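One variation that may help isolate the problem, and that is not part of the original test: MySQL 4.1 converts text between the connection character set and the column character set, so a connection that is not declared as UTF-8 could be converted on the way in or out. The sketch below assumes that is what happens here (this has not been checked on the setup above). It issues `SET NAMES utf8` right after connecting, then reads back both the text and `HEX(strTEXT)`. The aliases `txt` and `hex_bytes` are made up for this example; the credentials and table are the same placeholders as in the script above.

```php
<?php
// Variation on the test script; credentials are placeholders.
$link = mysql_connect('localhost', 'root', 'password')
    or die('Could not connect: ' . mysql_error());
mysql_select_db('test') or die('Could not select database');

// Declare that this client sends and expects UTF-8 (MySQL 4.1 and later).
mysql_query('SET NAMES utf8') or die('SET NAMES failed: ' . mysql_error());

// HEX() returns plain ASCII, so the stored bytes can be inspected
// independently of how the text column is converted for display.
$query = 'SELECT strTEXT AS txt, HEX(strTEXT) AS hex_bytes FROM tbl_test';
$result = mysql_query($query) or die('Query failed: ' . mysql_error());

while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
    echo $row['txt'], ' = ', $row['hex_bytes'], "<br>\n";
}

mysql_free_result($result);
mysql_close($link);
```

In the hex output, a correctly stored 이 would appear as EC9DB4. If the stored bytes are right but the displayed text is not, the problem is on the way out of the database; if the bytes are already wrong (for example ECBFB4), the damage happened on insert or restore.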