Bug #118202 Inconsistent Results: CASE WHEN vs Direct IS FALSE on JSON datatype
Submitted: 15 May 2:54 Modified: 21 May 3:14
Reporter: bo Lin Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: JSON Severity:S3 (Non-critical)
Version:8.0.41 OS:Any
Assigned to: MySQL Verification Team CPU Architecture:Any
Tags: casewhen, IS FALSE, json

[15 May 2:54] bo Lin
Description:
Hi, I noticed an inconsistency: when using a JSON type in a CASE WHEN clause followed by IS FALSE, the result depends on the specific values. The case when expression with JSON type ultimately returns a Longtext type, but directly doing IS FALSE to Longtext yields different results compared to first using CASE WHEN and then IS FALSE.

MySQL [test]> SELECT   data,        CASE          WHEN json_contains_path(data, 'one', '$.published')    THEN json_extract(data, '$.published')         ELSE true      END AS case_result,        CASE          WHEN json_contains_path(data, 'one', '$.published')    THEN json_extract(data, '$.published')  ELSE true      END Is false  AS is_matched  FROM course_attributes  WHERE id in( 1,2);
+----------------------+-------------+------------+
| data                 | case_result | is_matched |
+----------------------+-------------+------------+
| {"published": false} | false       |          1 |
| {"published": true}  | true        |          0 |
+----------------------+-------------+------------+
2 rows in set (0.04 sec)

"CASE WHEN json_contains_path(data, 'one', '$.published') THEN json_extract(data, '$.published') ELSE true END" will return longtext,However, the result of directly using longtext to do is false below is inconsistent with the above

MySQL [test]> select c1 is false, c2 is false from tt2;
+-------------+-------------+
| c1 is false | c2 is false |
+-------------+-------------+
|           1 |           0 |
|           1 |           0 |
+-------------+-------------+
2 rows in set, 1 warning (0.03 sec)

How to repeat:
Use the following example:

### Case 1: use case when
DROP TABLE IF EXISTS `course_attributes`;
CREATE TABLE `course_attributes`  (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
  `data` text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 2 CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = DYNAMIC;

INSERT INTO `course_attributes` VALUES (1, '2', '{\"published\": false}');
INSERT INTO `course_attributes` VALUES (2, '3', '{\"published\": true}');

SELECT   data,     
  CASE          WHEN json_contains_path(data, 'one', '$.published') 
  THEN json_extract(data, '$.published')         ELSE true      END AS case_result,     
  CASE          WHEN json_contains_path(data, 'one', '$.published') 
  THEN json_extract(data, '$.published')         ELSE true      END IS false AS is_matched 
FROM course_attributes  WHERE id in( 1,2);

### case 2: Direct is fasle
drop table if exists tt2;
create table tt2(c1 longtext, c2 json);
insert into tt2 values('false', 'false');
insert into tt2 values('true', 'true');

select c1 is false, c2 is false from tt2;

Suggested fix:
I can't understand why the results of the above two scenarios are inconsistent. I think the judgment logic of Isfalse should be consistent for the same type in any scenario. So is this a bug? Has Mysql made any special processing for Json internally?
[19 May 21:04] MySQL Verification Team
Hi,

This is not a bug. You should properly cast your return values.

In first query none of the columns are json, you are treating data column as json with https://dev.mysql.com/doc/refman/8.0/en/json-search-functions.html#function_json-contains-... that returns 0 or 1

mysql> select json_contains_path('{"published": false}', 'one', '$.published') f, json_contains_path('{"published": true}', 'one', '$.published') t;
+------+------+
| f    | t    |
+------+------+
|    1 |    1 |
+------+------+
1 row in set (0.00 sec)

mysql> 

or in your case
mysql> select json_contains_path(data, 'one', '$.published') from course_attributes;
+------------------------------------------------+
| json_contains_path(data, 'one', '$.published') |
+------------------------------------------------+
|                                              1 |
|                                              1 |
+------------------------------------------------+
2 rows in set (0.00 sec)

so both records return 1 as path exist

so your cases always return the value for $.published and your ELSE is never used

you can also do this to make it more clear and you see that ELSE is never used:

mysql> SELECT   data,     
    ->   CASE          WHEN json_contains_path(data, 'one', '$.published') 
    ->   THEN json_extract(data, '$.published')         ELSE 'irrelevant'      END AS case_result,     
    ->   CASE          WHEN json_contains_path(data, 'one', '$.published') 
    ->   THEN json_extract(data, '$.published')         ELSE 'irrelevant'      END IS false AS is_matched 
    -> FROM course_attributes  WHERE id in( 1,2);
+----------------------+-------------+------------+
| data                 | case_result | is_matched |
+----------------------+-------------+------------+
| {"published": false} | false       |          1 |
| {"published": true}  | true        |          0 |
+----------------------+-------------+------------+
2 rows in set (0.00 sec)

where you have this distinction:

mysql> select CASE  WHEN json_contains_path(data, 'one', '$.published')  THEN json_extract(data, '$.published') ELSE 'irrelevant'  END is false from course_attributes;
+-----------------------------------------------------------------------------------------------------------------------------------+
| CASE  WHEN json_contains_path(data, 'one', '$.published')  THEN json_extract(data, '$.published') ELSE 'irrelevant'  END is false |
+-----------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                                 1 |
|                                                                                                                                 0 |
+-----------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

vs

mysql> select json_extract(data, '$.published') from course_attributes;
+-----------------------------------+
| json_extract(data, '$.published') |
+-----------------------------------+
| false                             |
| true                              |
+-----------------------------------+
2 rows in set (0.00 sec)

mysql> select json_extract(data, '$.published') is false from course_attributes;
+--------------------------------------------+
| json_extract(data, '$.published') is false |
+--------------------------------------------+
|                                          0 |
|                                          0 |
+--------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

because json_extract() returned 'false' value from json and CASE converted that to int hens in one case it is json result and in the other is not and that is why "is false" does not work as you expect it to as 'false' is not false but 0 is.

If you start your client with column info you will see that

~ % mysql -uroot --column-type-info 
...
mysql> select CASE  WHEN json_contains_path(data, 'one', '$.published')  THEN json_extract(data, '$.published') ELSE 'irrelevant'  END is false from course_attributes;
Field   1:  `CASE  WHEN json_contains_path(data, 'one', '$.published')  THEN json_extract(data, '$.published') ELSE 'irrelevant'  END is false`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      NOT_NULL BINARY NUM 

+-----------------------------------------------------------------------------------------------------------------------------------+
| CASE  WHEN json_contains_path(data, 'one', '$.published')  THEN json_extract(data, '$.published') ELSE 'irrelevant'  END is false |
+-----------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                                 1 |
|                                                                                                                                 0 |
+-----------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> select json_extract(data, '$.published')  is false from course_attributes;
Field   1:  `json_extract(data, '$.published')  is false`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      NOT_NULL BINARY NUM 

+---------------------------------------------+
| json_extract(data, '$.published')  is false |
+---------------------------------------------+
|                                           0 |
|                                           0 |
+---------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

mysql> select json_extract(data, '$.published') from course_attributes;
Field   1:  `json_extract(data, '$.published')`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       JSON
Collation:  utf8mb4_0900_ai_ci (255)
Length:     4294967292
Max_length: 5
Decimals:   31
Flags:      BINARY 

+-----------------------------------+
| json_extract(data, '$.published') |
+-----------------------------------+
| false                             |
| true                              |
+-----------------------------------+
2 rows in set (0.00 sec)

mysql> 

mysql> select c1 is false, c2 is false from tt2;
Field   1:  `c1 is false`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 0
Decimals:   0
Flags:      NOT_NULL BINARY NUM 

Field   2:  `c2 is false`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 0
Decimals:   0
Flags:      NOT_NULL BINARY NUM 

0 rows in set, 1 warning (0.00 sec)

mysql>
[19 May 21:33] MySQL Verification Team
And please check warnings too

+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                                                                                   |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 3986 | Evaluating a JSON value in SQL boolean context does an implicit comparison against JSON integer 0; if this is not what you want, consider converting JSON to a SQL numeric type with JSON_VALUE RETURNING |
+---------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
[21 May 3:14] bo Lin
Hi, thanks for your replay. However, I still have some questions.

According to the official documentation at https://dev.mysql.com/doc/refman/9.2/en/flow-control-functions.html#operator_case, the return type of the CASE expression in the example should be VARCHAR. The documentation states: "For all other type combinations, the result is VARCHAR."

The following queries also confirm this:

MySQL [test]> select CASE          WHEN json_contains_path(data, 'one', '$.published') THEN json_extract(data, '$.published')         ELSE true      END AS case_result from course_attributes;
Field   1:  `case_result`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  latin1_swedish_ci (8)
Length:     1073741823
Max_length: 5
Decimals:   31
Flags:      BINARY 

+-------------+
| case_result |
+-------------+
| false       |
| true        |
+-------------+
2 rows in set (0.04 sec)

or 

MySQL [test]> create table tt1 as select CASE WHEN json_contains_path(data, 'one', '$.published') THEN json_extract(data, '$.published') ELSE true END AS case_result from course_attributes;
Query OK, 2 rows affected (0.05 sec)
Records: 2 Duplicates: 0 Warnings: 0

MySQL [test]> desc tt1;
+-------------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+-------+
| case_result | longtext | YES | | NULL | |
+-------------+----------+------+-----+---------+-------+
1 row in set (0.04 sec)

Given this, the CASE expression should convert the JSON 'false' returned by json_extract to a VARCHAR during output. However, your previous response mentioned that the CASE expression converts the JSON 'false' to an int (do you mean json '0').

Thus, my questions are:

1. Why and when does the CASE expression perform this conversion to int?
    a. Does this conversion happen when only using CASE expression? It appears not, as the standalone CASE query returns 'false'.
    b. Or does the conversion happen when doing IS FALSE after the CASE expression? The conversion maybe like: JSON 'false' → JSON '0' → VARCHAR '0', and then IS FALSE evaluates it as VARCHAR '0'?

2. If the CASE result is VARCHAR, why does IS FALSE behave inconsistently with direct VARCHAR comparisons? For example, in the column c1 of table tt2 example.