Bug #99237 Filtering using JSON field, IS TRUE does not work
Submitted: 13 Apr 2020 10:03 Modified: 28 May 2020 23:10
Reporter: Marcel Gerber Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: JSON Severity:S2 (Serious)
Version:8.0.19, 8.0.18, 8.0.17 OS:Windows
Assigned to: CPU Architecture:Any
Tags: regression

[13 Apr 2020 10:03] Marcel Gerber
Description:
When I try to filter by a Boolean JSON field in the WHERE clause and using "IS TRUE", I get back rows that are both TRUE and FALSE. The repro steps should make that clearer.

This works as expected for MySQL Community Server 5.7.29, and is broken on 8.0.19.

How to repeat:
Create a table as such:

CREATE TABLE test (id INT, config JSON);
INSERT INTO test VALUES (1, '{"hasMapData": true}');
INSERT INTO test VALUES (2, '{"hasMapData": false}');

This yields the following table test:
+------+-----------------------+
| id   | config                |
+------+-----------------------+
|    1 | {"hasMapData": true}  |
|    2 | {"hasMapData": false} |
+------+-----------------------+
2 rows in set (0.00 sec)

Then select from it like this:

SELECT id, config -> "$.hasMapData"
FROM test
WHERE config -> "$.hasMapData" IS TRUE;

Expected output:
+------+--------------------------+
| id   | config -> "$.hasMapData" |
+------+--------------------------+
|    1 | true                     |
+------+--------------------------+
1 row in set (0.00 sec)

Actual output:
+------+--------------------------+
| id   | config -> "$.hasMapData" |
+------+--------------------------+
|    1 | true                     |
|    2 | false                    |
+------+--------------------------+
2 rows in set (0.00 sec)
[13 Apr 2020 10:25] MySQL Verification Team
Hello Marcel Gerber,

Thank you for the report and test case.

regards,
Umesh
[23 Apr 2020 17:50] Guilhem Bichot
Hello Marcel.
Yes, it works for IS TRUE/FALSE in 5.7, but not quite.
if I add this row to the table and test against MySQL 5.7:
 INSERT INTO test VALUES (3, '{"hasMapData": null}');
SELECT id, (config -> "$.hasMapData") is false FROM test ;
+------+-------------------------------------+
| id   | (config -> "$.hasMapData") is false |
+------+-------------------------------------+
|    1 |                                   0 |
|    2 |                                   1 |
|    3 |                                   1 |
+------+-------------------------------------+
See, 'null' is said to be false, while it's not.
And with NOT:
SELECT id, not(config -> "$.hasMapData") FROM test ;
+------+-------------------------------+
| id   | not(config -> "$.hasMapData") |
+------+-------------------------------+
|    1 |                             0 |
|    2 |                             1 |
|    3 |                             1 |
+------+-------------------------------+
while one could have hoped that NOT(null) would be NULL, not 1.

Actually it's a bit complicated to know what the right behaviour should be with JSON literals true/false/null vs IS TRUE/FALSE/NULL.
To many users the right behaviour is "to automatically convert JSON literals to SQL TRUE/FALSE/NULL", but when one reads the SQL standard (which covers JSON, and I spent a few hours today going through it), that is not the case. So we need to be very careful.

Here's a work-around until we decide what's correct: convert your JSON data to a real string and compare with "true":
mysql> SELECT id, (config ->> "$.hasMapData")="true" FROM test ;
+------+------------------------------------+
| id   | (config ->> "$.hasMapData")="true" |
+------+------------------------------------+
|    1 |                                  1 |
|    2 |                                  0 |
|    3 |                                  0 |
+------+------------------------------------+

Note the "->>" instead of "->", so it does an implicit JSON_UNQUOTE, which makes the JSON data into a real string.
Can also compare with "false" or "null".
I hope it helps.
[23 Apr 2020 17:59] Guilhem Bichot
See also, for JSON "null" vs NULL: http://bugs.mysql.com/bug.php?id=85755
[23 Apr 2020 18:20] Marcel Gerber
Thank you very much Guilhem for looking into it and diving into the standard.

The issue came up because our existing code uses exactly this kind of 'IS TRUE' and worked just fine in 5.7, whereas it didn't with 8.0. The case of NULL fields never came up, but I see the problem there.

Thank you very much for providing a workaround solution, maybe we will incorporate it. The one issue I see there is that it's not easy to distinguish between {a: true} and {a: "true"} with that (which IS TRUE should take care of, I guess?), but that's not really a problem for us.

Thank you!
[23 Apr 2020 20:11] Guilhem Bichot
You're right,
true vs "true"
are impossible to distinguish if "->>" is used; for that, one could use json_type with "->": that will return BOOLEAN vs STRING. But that makes a lot of SQL text.
[28 May 2020 23:10] Jon Stephens
Documented fix as follows in the MySQL 8.0.21 changelog:

    In MySQL 5.7, and in MySQL 8.0 prior to 8.0.17, the server
    attempted to convert JSON boolean values to their SQL
    counterparts when testing them directly with IS TRUE, as shown
    here:

mysql> CREATE TABLE test (id INT, col JSON);

mysql> INSERT INTO test VALUES (1, '{"val":true}'), (2, '{"val":false}');

mysql> SELECT id, col, col->"$.val" FROM test WHERE col->"$.val" IS TRUE;
+------+---------------+--------------+
| id   | col           | col->"$.val" |
+------+---------------+--------------+
|    1 | {"val": true} | true         |
+------+---------------+--------------+

    As the result of work done in MySQL 8.0.17 to ensure that all
    predicates in SQL conditions are complete (that is, a condition
    of the form WHERE value is rewritten as WHERE value <> 0),
    and that a NOT IN or NOT EXISTS condition in a WHERE or ON
    clause is converted to an antijoin, evaluation of a JSON value
    in an SQL boolean context peforms an implicit comparison against
    JSON integer 0. This means that the query shown previously
    returns the following result in MySQL 8.0.17 and later:

mysql> SELECT id, col, col->"$.val" FROM test WHERE col->"$.val" IS TRUE;
+------+----------------+--------------+
| id   | col            | col->"$.val" |
+------+----------------+--------------+
|    1 | {"val": true}  | true         |
|    2 | {"val": false} | false        |
+------+----------------+--------------+

    In such cases, the server also now provides a warning:
    -Evaluating a JSON value in SQL boolean context does an implicit
    comparison against JSON integer 0; if this is not what you want,
    consider converting JSON to a SQL numeric type with JSON_VALUE
    RETURNING-. Thus, the query can be rewritten as shown here:

mysql> SELECT id, col, col->"$.val" FROM test 
    -> WHERE JSON_VALUE(col, "$.val" RETURNING UNSIGNED) IS TRUE;
+------+---------------+--------------+
| id   | col           | col->"$.val" |
+------+---------------+--------------+
|    1 | {"val": true} | true         |
+------+---------------+--------------+

Closed.