Bug #79235 JSON_UNQUOTE() does not recognize unicode codes
Submitted: 11 Nov 2015 15:09 Modified: 17 Nov 2015 17:49
Reporter: Georgi Kodinov
Status: Closed
Category: MySQL Server: Documentation Severity: S2 (Serious)
Version: 5.7+ OS: Any
Assigned to: Jon Stephens CPU Architecture: Any

[11 Nov 2015 15:09] Georgi Kodinov
Description:
http://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-unqu... says that "\uXXXX" should be recognized as "UTF-8 bytes for Unicode value XXXX".

In practice this does not happen. The rest of the escapes work, e.g. \t.

How to repeat:
mysql> select JSON_UNQUOTE('[ "\t\u0031" ]');
+--------------------------------+
| JSON_UNQUOTE('[ "\t\u0031" ]') |
+--------------------------------+
| [ "   u0031" ]                   |
+--------------------------------+
1 row in set (0.00 sec)

Suggested fix:
Make it work?
[11 Nov 2015 15:37] MySQL Verification Team
Thank you for the bug report.
[12 Nov 2015 11:50] Knut Anders Hatlen
I think this works as intended. The specification of JSON_UNQUOTE in WL#7909 says: "Returns textValue untouched if textValue does not start and end with double-quotes." Since the input string in the bug description does not start and end with double-quotes, it is correct to return it as it is. The documentation could be improved to state this explicitly.

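Note that the expansion of \t in the example above is not performed by JSON_UNQUOTE, but rather by the SQL parser, which processes backslash escapes in the string literal before JSON_UNQUOTE ever sees the value. Unless the NO_BACKSLASH_ESCAPES SQL mode is enabled, one needs to write double backslashes in the (SQL) string literal so that a single backslash reaches JSON_UNQUOTE. The following works: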

mysql> select JSON_UNQUOTE('"\\t\\u0031"');
+------------------------------+
| JSON_UNQUOTE('"\\t\\u0031"') |
+------------------------------+
| 	1                           |
+------------------------------+
1 row in set (0,00 sec)

Alternatively:

mysql> set sql_mode=NO_BACKSLASH_ESCAPES;
Query OK, 0 rows affected, 1 warning (0,00 sec)

mysql> select JSON_UNQUOTE('"\t\u0031"');
+----------------------------+
| JSON_UNQUOTE('"\t\u0031"') |
+----------------------------+
| 	1                         |
+----------------------------+
1 row in set (0,00 sec)
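
The two layers described above (the SQL parser first, then JSON_UNQUOTE) can be sketched outside the server. The following Python is only a simplified model, not MySQL's implementation: sql_literal_value and json_unquote_model are hypothetical names, the escape table covers only the common string-literal escapes, and JSON decoding is delegated to Python's json module. It assumes the rule quoted from WL#7909 (return the text untouched unless it starts and ends with a double quote).

```python
import json

# Simplified model of backslash handling in a MySQL string literal when
# NO_BACKSLASH_ESCAPES is off. For a character not in this table the
# backslash is dropped and the character is kept (so \u becomes u).
# Assumption: \% and \_ (which keep their backslash) are left out.
SQL_ESCAPES = {"0": "\0", "'": "'", '"': '"', "b": "\b", "n": "\n",
               "r": "\r", "t": "\t", "Z": "\x1a", "\\": "\\"}


def sql_literal_value(body, no_backslash_escapes=False):
    """Value handed on by the parser for the text between the single quotes."""
    if no_backslash_escapes:
        return body  # backslash is an ordinary character in this mode
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            out.append(SQL_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def json_unquote_model(text):
    """Hypothetical stand-in for JSON_UNQUOTE, following the WL#7909 rule."""
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return json.loads(text)  # decodes JSON escapes such as \t and \uXXXX
    return text  # not wrapped in double quotes: returned untouched


# Original report: the parser has already turned \t into a TAB and \u into u,
# and the value starts with '[', so it comes back unchanged.
print(repr(json_unquote_model(sql_literal_value(r'[ "\t\u0031" ]'))))

# Doubled backslashes: JSON_UNQUOTE receives "\t\u0031" and decodes it
# to a TAB followed by the character 1.
print(repr(json_unquote_model(sql_literal_value(r'"\\t\\u0031"'))))

# Same result with single backslashes under NO_BACKSLASH_ESCAPES.
print(repr(json_unquote_model(
    sql_literal_value(r'"\t\u0031"', no_backslash_escapes=True))))
```

The model makes the point of the comment above concrete: whether \u0031 is expanded depends on what survives the parser and on whether the value is a double-quoted JSON string, not on JSON_UNQUOTE failing to recognize the escape.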
[12 Nov 2015 13:07] Dag Wanvik
Posted by developer:
 
Changing to documentation bug.
[17 Nov 2015 17:49] Jon Stephens
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly.