Bug #116826 default-character-set fails to set the character set in mysqldump correctly
Submitted: 29 Nov 8:02 Modified: 29 Nov 13:40
Reporter: zengyu ma Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: mysqldump Command-line Client Severity:S3 (Non-critical)
Version:mysqldump Ver 8.0.40 for Linux on x86_6 OS:CentOS (CentOS Linux release 7.9.2009 (Core))
Assigned to: CPU Architecture:x86

[29 Nov 8:02] zengyu ma
Description:
I created a table under the latin1 character set and set Chinese in the remarks of the table structure.In mysqldump, default-character-set=latin1 is set, but garbled characters are found in the exported sql file.

How to repeat:
一、Create a table in the test database
root@localhost [test]> show create database test;
+----------+--------------------------------------------------------------------------------------------------------------------------------+
| Database | Create Database                                                                                                                |
+----------+--------------------------------------------------------------------------------------------------------------------------------+
| test     | CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ /*!80016 DEFAULT ENCRYPTION='N' */ |
+----------+--------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
root@localhost [test]> set names latin1;
Query OK, 0 rows affected (0.00 sec)
root@localhost [test]> create table la(id int auto_increment primary key comment '主键',name varchar(5)) DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.03 sec)

root@localhost [test]> show create table la;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                           |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| la    | CREATE TABLE `la` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT '主键',
  `name` varchar(5) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1   |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
二、Set the session character set to UTF8MB4 to view the table structure
root@localhost [test]> set names utf8mb4;
Query OK, 0 rows affected (0.00 sec)

root@localhost [test]> show create table la;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                  |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| la    | CREATE TABLE `la` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT '主键',
  `name` varchar(5) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1        |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
root@localhost [test]> set names latin1;
Query OK, 0 rows affected (0.00 sec)
三、Use mysqldump to export the table schema
[mysql@localhost ~]$ mysqldump --login-path=root --source-data=2 --single-transaction --set-gtid-purged=OFF --default-character-set=latin1 -d -B test --tables la --result-file create.sql
[mysql@localhost ~]$ cat create.sql
-- MySQL dump 10.13  Distrib 8.0.40, for Linux (x86_64)
--
-- Host: localhost    Database: test
-- ------------------------------------------------------
-- Server version       8.0.40

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!50503 SET NAMES latin1 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Position to start replication or point-in-time recovery from
--

-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000007', MASTER_LOG_POS=947;

--
-- Table structure for table `la`
--

DROP TABLE IF EXISTS `la`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `la` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT '主键',
  `name` varchar(5) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2024-11-29 23:58:24

It seems that the --default-character-set=latin1 set does not take effect
[29 Nov 10:45] MySQL Verification Team
Hi Mr. Zengyu,

Thank you for your bug report.

However, this is not a bug.

Character set Latin1 is a single-byte charset and it can not accommodate Chinese ideograms. Chinese ideograms require character sets that can hold up to 3 (three) bytes. That is a part of the Unicode Standard and MySQL follows that standard strictly.

Not a bug.
[29 Nov 13:40] zengyu ma
Thank you for reminding me, but I am confused that the comments in the table structure are not displayed properly, but the data in the table is displayed correctly

[mysql@localhost ~]$ mysqldump --login-path=root --quick --extended-insert --default-character-set=latin1 --hex-blob --master-data=2 --single-transaction --set-gtid-purged=OFF -B test --tables la>data.sql
WARNING: --master-data is deprecated and will be removed in a future version. Use --source-data instead.
[mysql@localhost ~]$ cat data.sql
-- MySQL dump 10.13  Distrib 8.0.40, for Linux (x86_64)
--
-- Host: localhost    Database: test
-- ------------------------------------------------------
-- Server version       8.0.40

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!50503 SET NAMES latin1 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Position to start replication or point-in-time recovery from
--

-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=1341;

--
-- Table structure for table `la`
--

DROP TABLE IF EXISTS `la`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `la` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT '主键',
  `name` varchar(5) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `la`
--

LOCK TABLES `la` WRITE;
/*!40000 ALTER TABLE `la` DISABLE KEYS */;
INSERT INTO `la` VALUES (1,'a'),(2,'中');
/*!40000 ALTER TABLE `la` ENABLE KEYS */;
UNLOCK TABLES;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2024-11-29 21:22:27
[29 Nov 13:47] MySQL Verification Team
Hi Mr. Zengyu,

Thank you for your bug report.

However, this is not a bug.

Character set Latin1 is a single-byte charset and it can not accommodate Chinese ideograms. Chinese ideograms require character sets that can hold up to 3 (three) bytes. That is a part of the Unicode Standard and MySQL follows that standard strictly.

Not a bug.