Bug #106621 Multivalued index creation results in partial output.
Submitted: 2 Mar 2022 18:26 Modified: 2 Jul 2022 14:05
Reporter: Владислав Сокол Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.27 OS:Any
Assigned to: CPU Architecture:Any
Tags: json, multivalued_index, partial_output

[2 Mar 2022 18:26] Владислав Сокол
Description:
While using MEMBER OF() JSON_CONTAINS() and JSON_OVERLAPS() over JSON column which contains an array of integers a lot of rows returns.

After multi-valued index creation only rows from the beginning of the table are returned, the rows which posesses after some definite row are not returned.

Observed on 8.0.27 and 8.0.28. Not observed on 8.0.23.

How to repeat:
-- set needed setting
SET @@cte_max_recursion_depth := 10000;

DROP TABLE IF EXISTS test;

-- create a table
CREATE TABLE test (id INT, json_value JSON);

-- fill the table with 10000 rows
-- an array of 10 elements with pseudo-random values in each row
INSERT INTO test (id, json_value)
WITH RECURSIVE cte AS (
    SELECT 1 id
	UNION ALL
	SELECT 1 + id FROM cte WHERE id < @@cte_max_recursion_depth
)
SELECT id, 
       JSON_ARRAY(CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED),
                  CAST(@seed := (@seed * 73127 + 36671) % 99053 AS UNSIGNED))
FROM cte
WHERE (@seed := 0) = 0
ORDER BY id;

-- view first 20 rows
SELECT id, CAST(json_value AS CHAR) json_value FROM test ORDER BY id LIMIT 20;

-- set a variable to a value found in the row with id=10
SET @random_value := 37947;

-- select id of rows in which this value is present
-- 5 rows will be found, with id values of 10, 2486, 4962, 7438, 9915
SELECT id FROM test WHERE @random_value MEMBER OF (json_value->'$');

-- create multivalued index
ALTER TABLE test 
   ADD INDEX multivalued_index ((CAST(json_value->'$' AS UNSIGNED ARRAY)));
   
-- execute the same query - only one row (id=10) is returned
SELECT id FROM test WHERE @random_value MEMBER OF (json_value->'$');

-- look at the rows 1311 and 1312 (the values are found experimentally)
SELECT id, CAST(json_value AS CHAR) json_value FROM test WHERE id IN (1311, 1312);
-- search for the value from the row id=1311 - success
SELECT id FROM test WHERE 48176 MEMBER OF (json_value->'$');
-- search for the value from the row id=1312 - fail
SELECT id FROM test WHERE 74452 MEMBER OF (json_value->'$');

Suggested fix:
downgrade?
[3 Mar 2022 5:15] MySQL Verification Team
Hello Владислав Сокол!

Thank you for the report and feedback.

regards,
Umesh
[27 Jun 2022 19:02] Tech TwentyOne
This is a pretty significant bug. Queries that should return data are returning 0 rows without error or warning!
[2 Jul 2022 14:05] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 8.0.30 release:

A query returned only a partial result set after creating multi-valued
index on a JSON column.