Bug #69089 mysqldump can't dump views with non-ASCII characters correctly
Submitted: 28 Apr 2013 6:37 Modified: 29 Apr 2013 12:43
Reporter: vin chen Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: mysqldump Command-line Client Severity:S3 (Non-critical)
Version:5.5 OS:Any
Assigned to: CPU Architecture:Any
Tags: mysqldump views

[28 Apr 2013 6:37] vin chen
Description:
we want to upgrade mysql from 5.0 to 5.5
But there are views with non-ascii characters

Using mysqldump in 5.5 dump from mysql 5.0, get dump.sql.
But dump.sql can't dump to mysql 5.5 correctly.

How to repeat:
# connect to mysql 5.0.67
set names gbk;

mysql> show variables like '%character_set%'; 
+--------------------------+--------------------------------------------------------------------+
| Variable_name            | Value                                                              |
+--------------------------+--------------------------------------------------------------------+
| character_set_client     | gbk                                                                |
| character_set_connection | gbk                                                                |
| character_set_database   | gbk                                                                |
| character_set_filesystem | binary                                                             |
| character_set_results    | gbk                                                                |
| character_set_server     | gbk                                                                |
| character_set_system     | utf8                                                               |
| character_sets_dir       | /usr/local/mysql-5.0.67-linux-x86_64-glibc23/share/mysql/charsets/ |
+--------------------------+--------------------------------------------------------------------+

use test;
create table t1(c1 int, c2 varchar(20)) engine = innodb;
insert into t1 values(1, '我们');
create view v1 as select c1, '中文', c2 from t1;

select * from v1;
+------+------+------+
| c1   | 中文 | c2   |
+------+------+------+
|    1 | 中文 | 我们 |
+------+------+------+
1 row in set (0.00 sec)

# using mysqldump in mysql 5.5.24
/usr/local/mysql55/bin/mysqldump -uroot -ppwd --skip-opt --default-character-set=gbk --set-charset --add-drop-table --no-create-db test t1 v1 > view.sql

#dump to mysql 5.5.24
mysql -uroot -ppwd --default-character-set=gbk test < view.sql

# connect to mysql 5.5.24
set names gbk;

mysql> show variables like '%character_set%';
+--------------------------+------------------------------------------------------+
| Variable_name            | Value                                                |
+--------------------------+------------------------------------------------------+
| character_set_client     | gbk                                                  |
| character_set_connection | gbk                                                  |
| character_set_database   | gbk                                                  |
| character_set_filesystem | binary                                               |
| character_set_results    | gbk                                                  |
| character_set_server     | gbk                                                  |
| character_set_system     | utf8                                                 |
| character_sets_dir       | /usr/local/mysql-5.5.24-linux-x86_64/share/charsets/ |
+--------------------------+------------------------------------------------------+
8 rows in set (0.00 sec)

mysql> select * from v1;     
+------+-------+------+
| c1   | 涓?枃 | c2   |
+------+-------+------+
|    1 | 中文  | 我们 |
+------+-------+------+
1 row in set (0.00 sec)

non-ASCII column names of views become messy code

Suggested fix:
definition of view v1 in dump file view.sql is

/*!50001 DROP TABLE IF EXISTS `v1`*/;
/*!50001 DROP VIEW IF EXISTS `v1`*/;
/*!50001 CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select `t1`.`c1` AS `c1`,_gbk'中文' AS `涓枃`,`t1`.`c2` AS `c2` from `t1` */;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

should become

/*!50001 DROP TABLE IF EXISTS `v1`*/;
/*!50001 DROP VIEW IF EXISTS `v1`*/;
/*!40101 SET @saved_cs_client     = @@character_set_client */;    -- add
/*!40101 SET character_set_client = utf8 */;   -- add
/*!50001 CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select `t1`.`c1` AS `c1`,_gbk'中文' AS `涓枃`,`t1`.`c2` AS `c2` from `t1` */;
/*!40101 SET character_set_client = @saved_cs_client */; -- add
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

Because column names of view always use utf8 (character_set_system), dump file should add "SET character_set_client = utf8" before create view statement, like create tempory table for views:

--
-- Temporary table structure for view `v1`
--

DROP TABLE IF EXISTS `v1`;
/*!50001 DROP VIEW IF EXISTS `v1`*/;
SET @saved_cs_client     = @@character_set_client;
SET character_set_client = utf8;
/*!50001 CREATE TABLE `v1` (
  `c1` int(11),
  `涓枃` varchar(2),
  `c2` varchar(20)
) ENGINE=MyISAM */;
SET character_set_client = @saved_cs_client;
[28 Apr 2013 6:42] vin chen
dump file from mysql 5.0

Attachment: view.sql (application/octet-stream, text), 1.91 KiB.

[29 Apr 2013 12:43] MySQL Verification Team
Thank you for the bug report. Verified as described.