Bug #91621 Unit separator character (#1f) is not escaped in JSON returned by MySQL server
Submitted: 12 Jul 2018 19:33 Modified: 15 Jul 2018 6:17
Reporter: Kristopher Windsor Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: JSON Severity:S3 (Non-critical)
Version:5.7.18-15 OS:Any
Assigned to: CPU Architecture:Any
Tags: encoding, json

[12 Jul 2018 19:33] Kristopher Windsor
Description:
When a SELECT query is done for a JSON column, a JSON string is returned.
Control characters should be escaped (\uxxxx) in this JSON result instead of coming back as literal characters. Refer to the homepage of json.org and see that control characters should be escaped.

MySQL appears to escape all control characters except for the unit separator character (#1f in hex, ASCII character 31 in decimal). Refer to the below and see that this character has different treatment compared to the other control characters.

How to repeat:
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.18-15 |
+-----------+

mysql> CREATE TABLE `tablename` (
    ->   `id` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL,
    ->   `payload` json NOT NULL
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED;
Query OK, 0 rows affected (0.03 sec)

mysql> INSERT INTO tablename VALUES('26 substitute char', '{"key":"h\\u001ai"}');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO tablename VALUES('27 escape char', '{"key":"h\\u001bi"}');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO tablename VALUES('28 file separator char', '{"key":"h\\u001ci"}');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO tablename VALUES('29 group separator char', '{"key":"h\\u001di"}');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO tablename VALUES('30 record separator char', '{"key":"h\\u001ei"}');
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO tablename VALUES('31 unit separator char', '{"key":"h\\u001fi"}');
Query OK, 1 row affected (0.01 sec)

mysql> select * from tablename;
+--------------------------+---------------------+
| id                       | payload             |
+--------------------------+---------------------+
| 26 substitute char       | {"key": "h\u001ai"} |
| 27 escape char           | {"key": "h\u001bi"} |
| 28 file separator char   | {"key": "h\u001ci"} |
| 29 group separator char  | {"key": "h\u001di"} |
| 30 record separator char | {"key": "h\u001ei"} |
| 31 unit separator char   | {"key": "hi"}      |
+--------------------------+---------------------+
6 rows in set (0.00 sec)

Note: the unit separator character is non-printable, but we have verified (via other means) that there is actually a literal unit separator character in that last row above.

Suggested fix:
The unit separator character should be escaped in results just like the other control characters.
[13 Jul 2018 5:15] MySQL Verification Team
Hello Kristopher,

Thank you for the report.
With provided test case on 5.7.22 build I'm not seeing any issues. Also, are you using official MySQL build? Thank you!

-- 5.7.22
bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.22 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database test;
Query OK, 1 row affected (0.00 sec)

mysql> use test
Database changed
mysql>  CREATE TABLE `tablename` (
    ->    `id` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL,
    ->    `payload` json NOT NULL
    ->  ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED;
INSERT INTO tablename VALUES('31 unit separator char', '{"key":"h\\u001fi"}');
Query OK, 0 rows affected (0.12 sec)

mysql>
mysql>
mysql> INSERT INTO tablename VALUES('26 substitute char', '{"key":"h\\u001ai"}');
Query OK, 1 row affected (0.06 sec)

mysql> INSERT INTO tablename VALUES('27 escape char', '{"key":"h\\u001bi"}');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO tablename VALUES('28 file separator char', '{"key":"h\\u001ci"}');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO tablename VALUES('29 group separator char', '{"key":"h\\u001di"}');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO tablename VALUES('30 record separator char', '{"key":"h\\u001ei"}');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO tablename VALUES('31 unit separator char', '{"key":"h\\u001fi"}');
Query OK, 1 row affected (0.00 sec)

mysql> select * from tablename;
+--------------------------+---------------------+
| id                       | payload             |
+--------------------------+---------------------+
| 26 substitute char       | {"key": "h\u001ai"} |
| 27 escape char           | {"key": "h\u001bi"} |
| 28 file separator char   | {"key": "h\u001ci"} |
| 29 group separator char  | {"key": "h\u001di"} |
| 30 record separator char | {"key": "h\u001ei"} |
| 31 unit separator char   | {"key": "h\u001fi"} |
+--------------------------+---------------------+
6 rows in set (0.00 sec)

Thanks,
Umesh
[14 Jul 2018 0:53] Kristopher Windsor
Thanks, I found after some investigating that this is a duplicate of bug #87722 and was fixed in MySQL 5.7.21.
[15 Jul 2018 6:17] MySQL Verification Team
Thank you for confirming at your end!
Marking this as duplicate of Bug #87722.

Thanks,
Umesh