Bug #115085 INDEX using CAST (json_column) AS ARRAY drops rows containing empty array
Submitted: 22 May 2024 20:39 Modified: 23 May 2024 8:20
Reporter: Ronald Bradford Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:8.0.34, 8.4.0 OS:Any (Docker Image)
Assigned to: CPU Architecture:x86
Tags: create index, json

[22 May 2024 20:39] Ronald Bradford
Description:
When a table has a compound index defined with a column of 'CAST(column) AS CHAR(15) ARRAY', any rows with [] in JSON are dropped from the results.

Rows that contain NULL or multi-array values return results as expected.

How to repeat:
Create Table
Load Table
Query with NULL
Query with an array with values
Query with empty array [] *PROBLEM*

CREATE TABLE t1(
  id INT UNSIGNED NOT NULL,
  fk_id INT UNSIGNED NOT NULL,
  data JSON DEFAULT NULL,
  name VARCHAR(50) NOT NULL,
  PRIMARY KEY (id),
  KEY (fk_id),
  KEY (fk_id, (CAST(data AS CHAR(15) ARRAY)))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Load up the table with 10,000 rows. See https://gist.github.com/ronaldbradford/493b06fcbb97acd734cb28af769be160

Null value test.

mysql> EXPLAIN SELECT * FROM t1 USE INDEX (functional_index) WHERE fk_id = 23 ORDER BY id DESC LIMIT 10;
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys    | key              | key_len | ref   | rows | filtered | Extra          |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+----------+----------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | functional_index | functional_index | 4       | const |   99 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+------------------+------------------+---------+-------+------+----------+----------------+
1 row in set, 1 warning (0.02 sec)

mysql> SELECT * FROM t1 USE INDEX (functional_index) WHERE fk_id = 23 ORDER BY id DESC LIMIT 10;
+------+-------+------+----------------------+
| id   | fk_id | data | name                 |
+------+-------+------+----------------------+
| 9894 |    23 | NULL | G3zrgyOfdQunBFcPkhAr |
| 9864 |    23 | NULL | e6tbFNZtL4K5G4TlCs3f |
| 9731 |    23 | NULL | jnA6RLL6VKlGjGQNl1pe |
| 9728 |    23 | NULL | E3FdK88fwOToPrt5S3FK |
| 9458 |    23 | NULL | Tz67m890QQikRIsgvMwJ |
| 9435 |    23 | NULL | qNTP1B4Mtgdw33qpuaEp |
| 9144 |    23 | NULL | zZJ6IGmXoyTFI3rmeyKY |
| 9132 |    23 | NULL | mNIybmN5x9z4ySDQpiX6 |
| 9124 |    23 | NULL | 4d0NTiG95HLl6OQ0NjJu |
| 9033 |    23 | NULL | hlYrc4KH5mDlSZRLB1T3 |
+------+-------+------+----------------------+
10 rows in set (0.01 sec)

Row with array values.

mysql> INSERT INTO t1 VALUES (123456,23,'["x","y","z"]','row with array value');
Query OK, 1 row affected (0.02 sec)

mysql> SELECT * FROM t1 USE INDEX (functional_index) WHERE fk_id = 23 ORDER BY id DESC LIMIT 10;
+--------+-------+-----------------+----------------------+
| id     | fk_id | data            | name                 |
+--------+-------+-----------------+----------------------+
| 123456 |    23 | ["x", "y", "z"] | row with array value |
|   9894 |    23 | NULL            | G3zrgyOfdQunBFcPkhAr |
|   9864 |    23 | NULL            | e6tbFNZtL4K5G4TlCs3f |
|   9731 |    23 | NULL            | jnA6RLL6VKlGjGQNl1pe |
|   9728 |    23 | NULL            | E3FdK88fwOToPrt5S3FK |
|   9458 |    23 | NULL            | Tz67m890QQikRIsgvMwJ |
|   9435 |    23 | NULL            | qNTP1B4Mtgdw33qpuaEp |
|   9144 |    23 | NULL            | zZJ6IGmXoyTFI3rmeyKY |
|   9132 |    23 | NULL            | mNIybmN5x9z4ySDQpiX6 |
|   9124 |    23 | NULL            | 4d0NTiG95HLl6OQ0NjJu |
+--------+-------+-----------------+----------------------+
10 rows in set (0.01 sec)

Failing test with empty array value []

mysql> SELECT * FROM t1 USE INDEX (functional_index) WHERE fk_id = 23 ORDER BY id DESC LIMIT 10;
+--------+-------+-----------------+----------------------+
| id     | fk_id | data            | name                 |
+--------+-------+-----------------+----------------------+
| 123456 |    23 | ["x", "y", "z"] | row with array value |
|   9894 |    23 | NULL            | G3zrgyOfdQunBFcPkhAr |
|   9864 |    23 | NULL            | e6tbFNZtL4K5G4TlCs3f |
|   9731 |    23 | NULL            | jnA6RLL6VKlGjGQNl1pe |
|   9728 |    23 | NULL            | E3FdK88fwOToPrt5S3FK |
|   9458 |    23 | NULL            | Tz67m890QQikRIsgvMwJ |
|   9435 |    23 | NULL            | qNTP1B4Mtgdw33qpuaEp |
|   9144 |    23 | NULL            | zZJ6IGmXoyTFI3rmeyKY |
|   9132 |    23 | NULL            | mNIybmN5x9z4ySDQpiX6 |
|   9124 |    23 | NULL            | 4d0NTiG95HLl6OQ0NjJu |
+--------+-------+-----------------+----------------------+
10 rows in set (0.01 sec)

mysql> SELECT * FROM t1 USE INDEX (fk_id) WHERE fk_id = 23 ORDER BY id DESC LIMIT 10;
+--------+-------+-----------------+----------------------+
| id     | fk_id | data            | name                 |
+--------+-------+-----------------+----------------------+
| 123456 |    23 | ["x", "y", "z"] | row with array value |
|  99999 |    23 | []              | row with empty array |
|   9894 |    23 | NULL            | G3zrgyOfdQunBFcPkhAr |
|   9864 |    23 | NULL            | e6tbFNZtL4K5G4TlCs3f |
|   9731 |    23 | NULL            | jnA6RLL6VKlGjGQNl1pe |
|   9728 |    23 | NULL            | E3FdK88fwOToPrt5S3FK |
|   9458 |    23 | NULL            | Tz67m890QQikRIsgvMwJ |
|   9435 |    23 | NULL            | qNTP1B4Mtgdw33qpuaEp |
|   9144 |    23 | NULL            | zZJ6IGmXoyTFI3rmeyKY |
|   9132 |    23 | NULL            | mNIybmN5x9z4ySDQpiX6 |
+--------+-------+-----------------+----------------------+
10 rows in set (0.01 sec)

Suggested fix:
Index should not drop rows that match.
[22 May 2024 20:43] Ronald Bradford
Missed the magic row of data inserted.

INSERT INTO t1 VALUES (99999,23,'[]','row with empty array');
[23 May 2024 8:20] MySQL Verification Team
Hello Ronald Bradford,

Thank you for the report and test.
Verified as described.

regards,
Umesh
[27 Jun 2024 8:04] super tom
mysql> select * from t1 use index (functional_index) where fk_id=23;
+--------+-------+-----------------+----------------------+
| id     | fk_id | data            | name                 |
+--------+-------+-----------------+----------------------+
| 123456 |    23 | ["x", "y", "z"] | row with array value |
| 123456 |    23 | ["x", "y", "z"] | row with array value |
| 123456 |    23 | ["x", "y", "z"] | row with array value |
+--------+-------+-----------------+----------------------+
3 rows in set (0.00 sec)

mysql> select * from t1 where fk_id=23;
+--------+-------+-----------------+----------------------+
| id     | fk_id | data            | name                 |
+--------+-------+-----------------+----------------------+
| 123456 |    23 | ["x", "y", "z"] | row with array value |
+--------+-------+-----------------+----------------------+
[27 Jun 2024 8:11] super tom
When displaying a query using a json index, the query results are duplicated with an error result.mysql test version is 8.0.35