Description:
It's a common pattern to use a column for ordering, and then create composite indexes with a filtering column first and the ordering column second. But this doesn't currently work when the filtering index is a JSON multi-valued index.
How to repeat:
CREATE TABLE testing (
id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
val1 INTEGER NOT NULL,
ord INTEGER NOT NULL,
data JSON,
INDEX ((CAST(data->'$.x' AS SIGNED ARRAY)), ord),
INDEX (val1, ord)
);
INSERT INTO testing (val1, ord, data)
WITH RECURSIVE num_sequence AS (
SELECT 1 AS num
UNION ALL
SELECT num + 1 FROM num_sequence WHERE num < 1000
)
SELECT
FLOOR(1 + (RAND() * 100)),
FLOOR(1 + (RAND() * 100)),
JSON_OBJECT('x', JSON_ARRAY(CAST(FLOOR(1 + (RAND() * 100)) AS SIGNED), CAST(FLOOR(1 + (RAND() * 100)) AS SIGNED)))
FROM num_sequence;
EXPLAIN SELECT id, val1 FROM testing WHERE val1 = 35 ORDER BY ord LIMIT 10;
EXPLAIN SELECT id, (data->'$.x') FROM testing WHERE 35 MEMBER OF (data->'$.x') ORDER BY ord LIMIT 10;
As expected, the first query on val1 shows "Extra: using index". But the second query using (data->'$.x') shows "Extra: using where; using filesort".
This means that, even though the multi-valued index rows for the queried value were already sorted by ord, the engine didn't see that and re-sorted them at the end, which is potentially expensive.
Suggested fix:
The index rows are already internally sorted, the DB engine should recognize that and avoid doing a filesort.