Description:
When creating a table with a DEFAULT for a utf8mb4 column which uses a 4 byte encoded unicode character, the output is corrupted and the character replaced with a ? in SHOW CREATE TABLE.
How to repeat:
CREATE TABLE `encodings` (
`id` bigint NOT NULL AUTO_INCREMENT ,
`a` varchar(100) CHARACTER SET ascii DEFAULT 'foo',
`l` varchar(100) CHARACTER SET latin1 DEFAULT 'føö',
`u` varchar(100) CHARACTER SET utf8mb4 DEFAULT 'føö√🙈',
PRIMARY KEY (`id`)
);
SHOW CREATE TABLE encodings\G
*************************** 1. row ***************************
Table: encodings
Create Table: CREATE TABLE `encodings` (
`id` bigint NOT NULL AUTO_INCREMENT,
`a` varchar(100) CHARACTER SET ascii COLLATE ascii_general_ci DEFAULT 'foo',
`l` varchar(100) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'føö',
`u` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT 'føö√?',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
Suggested fix:
It looks like MySQL internally still uses the utf8mb3 encoding to print the DEFAULT value. Unicode characters which encode to 3 bytes work correctly but 4 byte characters like emojis break.