Bug #79051 Syntax of JSON path expressions should be documented in detail
Submitted: 1 Nov 2015 1:21 Modified: 4 Nov 2015 20:55
Reporter: Roland Bouman Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S2 (Serious)
Version: OS:Any
Assigned to: Jon Stephens CPU Architecture:Any

[1 Nov 2015 1:21] Roland Bouman
Description:
The path argument is a recurring theme in many of the JSON functions. Yet there seems to be no single piece of documentation that clearly describes the syntax of JSON paths in MySQL.

For example, https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html mentions "the * and ** wildcard" several times, but I cannot locate any documentation that explains what they do and where they may appear. 

The page https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-extract contains an example of the * wildcard (but not the ** wildcard) but does not explain only show.

Things I observed that are - AFAICS - document nowhere:

1) The * wildcard works for properties too, not just array access:

mysql> SELECT JSON_EXTRACT('{"name1": "boe", "name2": [1,2]}', '$.*');
+---------------------------------------------------------+
| JSON_EXTRACT('{"name1": "boe", "name2": [1,2]}', '$.*') |
+---------------------------------------------------------+
| ["boe", [1, 2]]                                         |
+---------------------------------------------------------+
1 row in set (0.00 sec)

2) Although mentioned, the ** wildcard simply doesn't seem to exist:

mysql> SELECT JSON_EXTRACT('{"name1": "boe", "name2": [1,2]}', '$.**');
ERROR 3143 (42000): Invalid JSON path expression. The error is around character position 4 in '$.**'.

3) Propery names in path must be double quoted if the property identifier contains interpunction (spaces, special characters, meta characters), or if the name starts with a digit:

mysql> SELECT JSON_EXTRACT('{"name 1": "boe", "name 2": [1,2]}', '$."name 1"');
+------------------------------------------------------------------+
| JSON_EXTRACT('{"name 1": "boe", "name 2": [1,2]}', '$."name 1"') |
+------------------------------------------------------------------+
| "boe"                                                            |
+------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT JSON_EXTRACT('{"name 1": "boe", "name 2": [1,2]}', '$.name 1');
ERROR 3143 (42000): Invalid JSON path expression. The error is around character position 7 in '$.name 1'.

How to repeat:
N/A

Suggested fix:
Please provide complete documentation for the path argument.

I arrived at the following EBNF:

mysql-json-path         ::= Document-placeholder path-expression?
  Document-placeholder    ::= '$'
  path-expression         ::= path-component path-expression*
  path-component          ::= property-accessor | array-accessor
  property-accessor       ::= '.' property-identifier
  property-identifier     ::= Simple-property-name | quoted-property-name | wildcard-identifier
  Simple-property-name    ::= <Please refer to JavaScript, The Definitive Guide, 2.7. Identifiers>
  quoted-property-name    ::= '"' string-content* '"'
  string-content          ::= Non-quote-character | Escaped-quote-character
  Non-quote-character     ::= <Any character but the double quote character (")>
  Escaped-quote-character ::= '\"'
  wildcard-identifier     ::= '*'
  array-accessor          ::= '[' element-identifier ']'
  element-identifier      ::= [0-9]+ | wildcard-identifier
[1 Nov 2015 10:41] MySQL Verification Team
Hello Roland,

Thank you for the report.

Thanks,
Umesh
[4 Nov 2015 20:55] Jon Stephens
Fixed by adding a new section with BNF and examples to the documentation. The new section should appear soon at https://dev.mysql.com/doc/refman/5.7/en/json-path-syntax.html and elsewhere in the 5.7 Manual.

Thanks again for all your help with this. I only started handling docs for the JSON stuff fairly recently, so it's been quite educational. :)