Bug #26557 non latin character support
Submitted: 22 Feb 2007 9:29 Modified: 22 Feb 2007 10:02
Reporter: Colonel Panik Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:4.1+ maybe all... OS:Any (all)
Assigned to: CPU Architecture:Any
Tags: asian, charset, japanese, mojibake, utf

[22 Feb 2007 9:29] Colonel Panik
Description:
This Bug entry relates (I hope) to many others posted on the MySQL Bugs list.
The solution I found may not apply to all cases but I am posting here to try and be helpful.
The problem and solution seems to relate to issues with both mySQL and phpMyAdmin, as such this report is being submitted to both companies.

Database tables containing non roman characters sets (particularly UTF-8) cannot be copied using outputted .sql files.
Often the SQL parser will reach characters it cannot understand and is either terminating execution or interpreting those characters as string termination markers.

How to repeat:
Steps to replicate (in my case)
Create a sizeable database with kanji data (ours had some 50,000 rows and 60 tables) e.g. 

CREATE TABLE `kanji_test` (
  `id` int(16) NOT NULL default '0',
  `english` varchar(100) default NULL,
  `japanese` varchar(100) default NULL,
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `kanji_test` VALUES (0, 'English text', '日本語の文字');

Using mySQL cmd line or phpMyAdmin "export" save the structre and data of the database to .sql file.
IMPORT/post file as SQL in phpMyAdmin or from mySQL command line.
There probably needs to be more than 1 record to replicate the bug.

The character set used is many cases is latin1_ci_swedish (the mySQL default)

Suggested fix:
Solution:
The .sql files need to be exported then edited in stages prior to import.

1) export only the structure of the database
2) edit resultant files and remove whatever you have where my statement reads "ENGINE=MyISAM DEFAULT CHARSET=latin1"
So that the CREATE TABLE() command alone is present.
3) import the structure
________________________________________________________________________________
4) export tables individually, or if they are small in groups of 3/4, 
--resulting SQL files slow significantly after 3mb/5000 rows and stall or produce errors at larger sizes.
5) import using 
%>mysql -u <user> -p test_db < test01.sql
%>mysql -u <user> -p test_db < test02.sql   e.t.c...
or phpMyAdmin through "IMPORT" tab if available or "SQL"(as file) tab if not.

N.B. in doing this to our main database I had to delete a few INSERT statements that were reported as duplicate rows on line 1, they possibly were although their being duplicates on line 1 is curious. 
However removal worked and those lines, for us, were test data so we got our database across in tact.

issue may turn out to be connected to: 
using files containing both structure and data
using large files
extra data parsed with the CREATE TABLE() command being malformatted(?)

This information has been supplied to both mySQL and phpMyAdmin bugs lists, good luck guys :o)
[22 Feb 2007 10:02] Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Please read about Character Set Support at http://dev.mysql.com/doc/refman/4.1/en/charset.html