Bug #79306 Clarify how string literals are turned into JSON text or JSON path expressions
Submitted: 17 Nov 2015 9:30
Reporter: Knut Hatlen Email Updates:
Status: Analyzing Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.7.10 OS:Any
Assigned to: CPU Architecture:Any
Tags: json

[17 Nov 2015 9:30] Knut Hatlen
Description:
There has been some confusion around how escape sequences are handled in JSON text and JSON path expressions. See for example bug#79050 and bug#79235.

If the JSON document or the JSON path is specified as a string literal, the literal goes through the SQL parser before it gets to the JSON parser. Both the SQL parser and the JSON parser expand escape sequences, so even if the string literal looks like a valid JSON document or JSON path, the SQL parser may change it to something that isn't valid JSON before it gets to the JSON parser.

Take for example the JSON document {"a": "\u0031"}. If one calls JSON_EXTRACT('{"a": "\u0031"}', '$.a'), the SQL parser will expand '\u' to 'u', so the JSON function sees the document {"a": "u0031"}. The JSON parser will therefore not see an escape sequence, and will leave it unexpanded. This may come as a surprise.

If the SQL mode NO_BACKSLASH_ESCAPES is enabled, however, the backslash would go untouched through the SQL parser, and the JSON parser would see the full escape sequence \u0031, which would get expanded so that the JSON function sees the document {"a": "1"}.

Otherwise, backslashes in string literals need to be escaped to make it through to the JSON parser. In the example above, one would have to write JSON_EXTRACT('{"a": "\\u0031"}', '$.a').

How to repeat:
N/A

Suggested fix:
It might be good if the manual mentioned that the SQL parser could expand certain escape sequences before it gets to the JSON parser. There is a brief mentioning of this in the section that describes the JSON_UNQUOTE function. This could perhaps be moved to a more prominent place that could be referred to by the sections on JSON_UNQUOTE and the JSON Path Syntax.