Bug #119742 Incorrect query results when using multi-valued index on GENERATED JSON column
Submitted: 22 Jan 7:47
Reporter: M Jay Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server: JSON Severity:S4 (Feature request)
Version:8.0.41 OS:Any
Assigned to: CPU Architecture:Any

[22 Jan 7:47] M Jay
Description:
When using a MEMBER OF (or JSON_CONTAINS etc.) with a multi-valued index on GENERATED JSON column, the query results are incorrect.

How to repeat:
The column and index are defined as follows:
`pool_info` json DEFAULT NULL COMMENT '资源池信息',
`pool_array` json GENERATED ALWAYS AS (`pool_info`) VIRTUAL,
...
KEY `idx_pool_array` ((cast(`pool_array` as unsigned array)))

The query statement and results are as follows:
mysql> SELECT COUNT(*) FROM job WHERE 63 MEMBER OF(pool_array);
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
mysql> explain SELECT COUNT(*) FROM job WHERE 63 MEMBER OF(pool_array);
+----+-------------+--------------+------------+------+----------------+----------------+---------+-------+------+----------+-------------+
| id | select_type | table        | partitions | type | possible_keys  | key            | key_len | ref   | rows | filtered | Extra       |
+----+-------------+--------------+------------+------+----------------+----------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | job | NULL       | ref  | idx_pool_array | idx_pool_array | 9       | const |    1 |   100.00 | Using where |
+----+-------------+--------------+------------+------+----------------+----------------+---------+-------+------+----------+-------------+

The query result above is incorrect. When forcing the query to not use the index, the correct results are as follows:
mysql> SELECT /*+ NO_INDEX(job idx_pool_array) */ COUNT(*) FROMjob  WHERE 63 MEMBER OF(pool_array);
+----------+
| COUNT(*) |
+----------+
|   422091 |
+----------+

more sqls:
mysql> SELECT COUNT(*) FROM job  WHERE 63 MEMBER OF(pool_array);
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+

mysql> SELECT COUNT(*) FROM job  WHERE 63 MEMBER OF(pool_array) and id=220331;
+----------+
| COUNT(*) |
+----------+
|        1 |
+----------+

This problem may be related to the data volume and data distribution. Here is the data from my JSON column:
mysql> select distinct(pool_info),count(*) as num from job group by pool_info order by num desc limit 100;
+--------------------+--------+
| resource_pool_info | num    |
+--------------------+--------+
| [63]               | 201272 |
| [20, 23]           | 144586 |
| [63, 68]           | 125281 |
| [68]               | 124019 |
| [63, 68, 79, 80]   | 106064 |
| [23]               |  54553 |
| [71, 74]           |  14489 |
| [20]               |  12543 |
| [58]               |   4360 |
| [63, 68, 79]       |    495 |
| [66]               |    103 |
| [80]               |     57 |
| [79, 80]           |     28 |
| [79]               |     13 |
| [63, 67]           |     12 |
| [71]               |     10 |
| [75]               |     10 |
| [74]               |      6 |
| [34]               |      4 |
| [20, 23, 34]       |      3 |
| [76]               |      1 |
| [56, 63]           |      1 |
| [52, 72]           |      1 |
| [81]               |      1 |
| [82]               |      1 |
+--------------------+--------+