Bug #115359 Ordering not working with multi-valued composite index
Submitted: 17 Jun 2024 12:26 Modified: 18 Jun 2024 16:16
Reporter: Dishant Trivedi Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:8.0.26 OS:Any
Assigned to: CPU Architecture:Any
Tags: Multi-Valued Indexes, ordering

[17 Jun 2024 12:26] Dishant Trivedi
Description:
I am using a composite multi-valued index. I have a field `json_arr_with_ints` containing 

The index is defined in this format: (`id`,(cast(`json_arr_with_ints` as unsigned array)),`cursor` DESC). id is an integer, json_arr_with_ints is a JSON array that contains only integer elements, cursor is also of integer type!

Query: SELECT * FROM table WHERE id = :id AND :val MEMBER OF (json_arr_with_ints) AND cursor < :cursor ORDER BY cursor DESC LIMIT 10

Initial setup:
I have an initial setup with 150k rows. Most of the rows (145k) contains "5" as member of the `json_arr_with_ints` array!

Explain Analyze statement of this query: 
*************************** 1. row ***************************
EXPLAIN: -> Limit: 100 row(s)  (cost=33788.96 rows=100) (actual time=1279.222..1279.266 rows=100 loops=1)
    -> Sort: table.cursor DESC, limit input to 100 row(s) per chunk  (cost=33788.96 rows=75086) (actual time=1279.220..1279.257 rows=100 loops=1)
        -> Filter: ((table.id = 765502) and json'5' member of (cast(json_arr_with_ints as unsigned array)) and (table.cursor < 5550222403000))  (cost=33788.96 rows=75086) (actual time=0.096..1183.740 rows=145916 loops=1)
            -> Index range scan on table using id_json_cursor_key  (cost=33788.96 rows=75086) (actual time=0.093..1093.371 rows=157516 loops=1)

Problem:
As can be seen from the explain analyze output, the index range scan is not working on the cursor field! I want to know the reason why! Also, I am completely fine with it not working, but why does the index scan return such high number of rows? Also can see that sorting is happening after the index range scan & filter step. Is this expected? Due to this, query time is almost 1.2-1.3s! 

Why is the `cursor` part of the index not getting used in the query and in the index scan? Am I missing something? Is there a better way to serve such queries?

How to repeat:
Already explained in the Description section.
[17 Jun 2024 13:26] MySQL Verification Team
HI Mr Trivedi,

Thank you for your bug report.

However, this is not a bug. This is expected behaviour.

Composite index can be used for searching or ordering (sorting), only if the column is the first element of the composite key. That is how B+-trees work.

Not a bug.
[17 Jun 2024 13:44] Dishant Trivedi
Hi,
Thanks for the reply! 

I'm afraid that is not the case :)

I'm doing a equals (=) condition on the prefix of the key and then doing ordering on the suffix! This should use index range scan right? I can share examples of multiple tables where I use such index and queries. Happy to share if needed!
[17 Jun 2024 14:23] MySQL Verification Team
Hi Mr. Trivedi,

Sorry, but this does not change the facts.

Simply, these are two completely different operations and each one is done totally separately of each other.

That is the only known algorithm of resolving queries like yours ......
[17 Jun 2024 16:41] Dishant Trivedi
Can you please share a reference doc for this?
[17 Jun 2024 17:35] Dishant Trivedi
https://dev.mysql.com/doc/refman/8.4/en/order-by-optimization.html

This document highlights one example:

In this query, key_part1 is constant, so all rows accessed through the index are in key_part2 order, and an index on (key_part1, key_part2) avoids sorting if the WHERE clause is selective enough to make an index range scan cheaper than a table scan:

SELECT * FROM t1
  WHERE key_part1 = constant
  ORDER BY key_part2;
[18 Jun 2024 10:05] MySQL Verification Team
Hi,

Please read our Reference Manual , found in https://dev.mysql.com.
[18 Jun 2024 15:15] Dishant Trivedi
All I'm trying to understand is that here also we know that the rows the query will access will be in remaining part of index only because of = check on id

So why does the same optimisation not apply? 

Also the link you provided is a homepage link to the MySQL docs! Can you pls be more specific and point out the issue I am facing!
[18 Jun 2024 15:32] MySQL Verification Team
Hi Mr. Trivedi,

This is a forum for the bug reporting.

You are supposed to read yourself all the relevant parts of the Reference Manual.

You can not apply optimisation when the operations are totally distinct and are in totally different phases of the query optimisation and execution.

Not a bug.
[18 Jun 2024 16:16] Dishant Trivedi
Hello,

I am neither denying that this is a bug reporting forum nor am I trying to prove that this is a bug! I may very well be wrong and I accept that!!

But if you are making a claim saying this is the expected behavior, you should at the very least try to explain the reasoning for your claim OR (more importantly) share a reference doc backing your claim! Moreover, I have linked a reference to your own documentation that states otherwise - you can atleast take some effort and explain me where I am going wrong!

If this is not the right place to ask such question, redirect me to the place where I should be asking this! Just beating around the bush and repeatedly writing "NOT A BUG" won't solve the problem for me :)
[19 Jun 2024 9:55] MySQL Verification Team
Hi Mr. Trivedi,

We wish you to solve your problem.

Hence, we are hereby supplying you with necessary info.

For details on getting support for MySQL products see http://www.mysql.com/support/
You can also check our forums (free) at http://forums.mysql.com/

Thank you for your interest in MySQL.