Bug #85755 JSON containing null value is extracted as a string "null"
Submitted: 1 Apr 2017 0:53 Modified: 22 Jan 22:26
Reporter: Dave Pullin (Basic Quality Contributor) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: JSON Severity:S2 (Serious)
Version:5.7.17 OS:CentOS (centos 7)
Assigned to: CPU Architecture:Any

[1 Apr 2017 0:53] Dave Pullin
Description:
Wrapping a null value in a JSON, and then extracting it yields the 4 charcaters string "null", not a null value.

select json_extract(json_object( "a",null),'$.a');
+--------------------------------------------+
| json_extract(json_object( "a",null),'$.a') |
+--------------------------------------------+
| null                                       |
+--------------------------------------------+
1 row in set (0.00 sec)

*Looks* right, but it is not the null value. 

select length(json_extract(json_object( "a",NULL),'$.a'));
+----------------------------------------------------+
| length(json_extract(json_object( "a",NULL),'$.a')) |
+----------------------------------------------------+
|                                                  4 |
+----------------------------------------------------+
1 row in set (0.00 sec)

select ifnull(json_extract(json_object( "a",NULL),'$.a'),'it is null');
+-----------------------------------------------------------------+
| ifnull(json_extract(json_object( "a",NULL),'$.a'),'it is null') |
+-----------------------------------------------------------------+
| null                                                            |
+-----------------------------------------------------------------+
1 row in set (0.00 sec)

How to repeat:
select json_extract(json_object( "a",null),'$.a');

Suggested fix:

preserve the null value.
[3 Apr 2017 12:50] Umesh Shastry
Hello Dave Pullin,

Thank you for the report.

Thanks,
Umesh
[10 Apr 2017 7:09] Knut Anders Hatlen
Posted by developer:
 
Hi Dave,

I think what you are seeing is the intended behaviour.

JSON_OBJECT('a', NULL) evaluates to the JSON document { "a" : null }. Here, JSON_OBJECT has converted the SQL NULL value to the JSON null literal, which are different in SQL. When you call JSON_EXTRACT with path $.a on { "a" : null }, it extracts the JSON null literal from that document. If JSON_EXTRACT had not found anything on the specified path, it would have returned SQL NULL instead. So the difference between SQL NULL and the JSON null literal allows you do distinguish between the case where JSON_EXTRACT found a (JSON) null value and the case where it didn't find anything.

I didn't find any mentioning of how JSON_OBJECT handles NULL values in the reference manual, only in the design documents for WL#7909 which added the JSON_OBJECT function. Unless I have overlooked something in the manual, maybe what we should do is clarify how JSON_OBJECT handles NULL arguments in the manual.
[10 Apr 2017 13:27] Dave Pullin
I fear you may be correct - it is "working as designed".

I submit that this is a design error. Failure to round-trip values is encouraging obscure and difficult to handle errors.

JSON_CONTAINS should be used to determine if the object contains the path.
(while there appears to be no good way to determine if JSON_extract returns json null.)
[9 May 2017 10:26] Knut Anders Hatlen
Hi Dave,

Just to clarify the current behaviour of JSON_EXTRACT and null values:

1. When the input value is SQL NULL, the return value is SQL NULL:

SELECT JSON_EXTRACT(NULL, '$'), JSON_EXTRACT(NULL, '$') IS NULL
-> (NULL, 1)

2. When the path does not exist in the input value, the return value is SQL NULL:

SELECT JSON_EXTRACT('{"a":1}', '$.b'), JSON_EXTRACT('{"a":1}', '$.b') IS NULL
-> (NULL, 1)

3. When the path exists, and it contains the JSON null literal, the return value is the JSON null literal, which is not SQL NULL:

SELECT JSON_EXTRACT('null', '$'), JSON_EXTRACT('null', '$') IS NULL
-> (null, 0)

SELECT JSON_EXTRACT('{"a":null}', '$.a'), JSON_EXTRACT('{"a":null}', '$.a') IS NULL
-> (null, 0)

Which of these do you want to change, and what do you want them to return?
[9 May 2017 21:41] Dave Pullin
I understand your dilemma. Perhaps json_extract is not the right place to fix the problem. The core problem is that there ought to way that any value can be put into a JSON and retrieved as the original value: ie. some function Y such that 

Y(json_extract(json_object("a",x),'$.a') <=> x

is true for all x.

I figured that json_unquote() was the function Y(), or more specifically, that
 json_column->>'$.a' would do the job.

Unfortunately I 'simplified out' the JSON_UNQUOTE in my bug report.

My expectation was that
  
 insert (json_column) values(x) 
then
 json_column->>'$.a' <=> x
would be true for all x. But it is not true when x is null.

In which case the fix may be to make json_unquote(json_null) return SQL null

Similarly json_unquote(json_literal_boolean) should return SQL boolean

The underlying problem may be that MySQL isn't treating the JSON literals as entities distinct from their written forms, but as character strings equal to the written forms.

note that
select length(false),length(true),length(null)

,length(json_extract(json_object('a',false),'$.a'))
,length(json_extract(json_object('a',true),'$.a'))
,length(json_extract(json_object('a',null),'$.a'))

,length(json_unquote(json_extract(json_object('a',false),'$.a')))
,length(json_unquote(json_extract(json_object('a',true),'$.a')))
,length(json_unquote(json_extract(json_object('a',null),'$.a')))
;

-->  ( 1,1,null, 5,4,4, 5,4,4)
[10 Oct 2017 10:45] Sergio Nalin
Hi,

while it would at this point introduce a backward incompatibility, I second Dave Pullin's proposal that json_unquote should return SQL values whenever possible.

It's quite unexpected, baffling and error-prone for:

select if(json_unquote(json_extract(cast('{"id": null}' as JSON), '$.id')) is null, "is null", "is not null")

to produce "is not null" (same goes for boolean values), while equivalent statements for strings and numbers behave as expected:

select if(json_unquote(json_extract(cast('{"id": "blabla"}' as JSON), '$.id')) = "blabla", "is blabla", "is not blabla")
select if(json_unquote(json_extract(cast('{"id": 10}' as JSON), '$.id')) = 10, "is 10", "is not 10")

The ability to discriminate between "value is null" and "value is undefined", while still possible using just json_extract, is indeed better served by json_contains (or, possibly, by introducing an UNDEFINED mysql value, although that might have far reaching implications).
[21 Nov 2018 13:20] Umesh Shastry
Bug #93167 marked as duplicate of this one
[22 Jan 16:35] Jordi Boggiano
To add some examples here. I have had this use case (filter things or retrieve things based on a json property that can sometimes be null and sometimes be unset) and these are the best ways I found to work around it:

To filter:

    SELECT * FROM xx WHERE JSON_TYPE(JSON_EXTRACT(attributes, "$.property")) != "NULL"

To query:

    SELECT IF(JSON_TYPE(JSON_EXTRACT(attributes, "$.property"))="NULL", NULL, attributes->>"$.property") AS property FROM xx

As you can see, this is rather long for what I initially expected I could write as (combining both filter and reading out the value here in one query, which is almost shorter than either of the above):

    SELECT attributes->>"$.property" FROM xx WHERE attributes->>"$.property" IS NOT NULL

I understand that changing the semantics of ->> at this point might be harmful. Adding a new way to express this shorter though would be very welcome. The current workaround is verbose, hard to read and error-prone.

Here is an example, where JSON_DECODE would work as an alternative to JSON_EXTRACT, but which would return a mysql NULL if it reads a json null.

    SELECT JSON_DECODE(attributes, "$.property") FROM xx WHERE JSON_DECODE(attributes, "$.property") IS NOT NULL

Thanks for considering this as a possible improvement.
[22 Jan 22:26] Dave Pullin
Json_deCode() seems like a good idea.

Perhaps paired with json_code() such that

Json_decode(json_code(x))<=>x

For any x database column value or SQL expression.