Bug #108116 SQL queries with range of Item_row are not optimized
Submitted: 11 Aug 2022 8:21 Modified: 11 Aug 2022 9:21
Reporter: Brian Yue (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.29, 8.0.30, 5.7.39 OS:Any (rhel-7.4)
Assigned to: CPU Architecture:Any (x86-64)
Tags: Contribution

[11 Aug 2022 8:21] Brian Yue
Description:
Hello, 
  Recently I find a case that if where_clause of a query has a range on Item_row, it's not optimized, index access type is used and lead to bad performance. For this case, range access type should be a better choice.
  Please reference to `How to repeat` for details.

How to repeat:
(1) Firstly, let's prepare some data (the following SQL statements can be executed directly)

create database mytest;
use mytest
create table t1 (id int primary key, c1 int, c2 int, key k_c (c1,c2));
insert into t1 values (1,1,1), (2,2,2), (3,3,3), (4,4,4);
insert into t1 select id + 4, c1 +4, c2 + 4 from t1;
insert into t1 select id + 8, c1 + 8, c2 + 8 from t1;
insert into t1 select id + 16, c1 + 16, c2 + 16 from t1;
insert into t1 select id + 32, c1 + 32, c2 + 32 from t1;
insert into t1 select id + 64, c1 + 64, c2 + 64 from t1;
insert into t1 select id + 128, c1 + 128, c2 + 128 from t1;
insert into t1 select id + 256, c1 + 256, c2 + 256 from t1;
insert into t1 select id + 512, c1 + 512, c2 + 512 from t1;
insert into t1 select id + 1024, c1 + 1024, c2 + 1024 from t1;
insert into t1 select id + 2048, c1 + 2048, c2 + 2048 from t1;
insert into t1 select id + 4096, c1 + 4096, c2 + 4096 from t1;
insert into t1 select id + 8192, c1 + 8192, c2 + 8192 from t1;

(2) Let's do some test

As is said in refman ( https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_greater-than-or... ):
For row comparisons, (a, b) > (x, y) is equivalent to:
(a > x) OR ((a = x) AND (b > y))

so the two statements is equivalent:
(a) select * from t1 where (c1,c2) < (2,2);
(b) select * from t1 where c1 < 2 OR (c1 = 2 and c2 < 2);

But they show different execution plan and performance:

(3) test the statement (a)
Statement (a) uses range access type and 2 rows are estimated to be accessed, and it costs 0.00 sec :

mysql> explain select * from t1 where c1 < 2 OR (c1 = 2 and c2 < 2) \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: range
possible_keys: k_c
          key: k_c
      key_len: 10
          ref: NULL
         rows: 2
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

mysql> select * from t1 where c1 < 2 OR (c1 = 2 and c2 < 2) ;
+----+------+------+
| id | c1   | c2   |
+----+------+------+
|  1 |    1 |    1 |
+----+------+------+
1 row in set (0.00 sec)

(3) test the statement (b)
Statement (b) uses range index type and 16308 rows are estimated to be accessed, and it costs 0.13 sec :

mysql> explain select * from t1 where (c1,c2) < (2,2) \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index
possible_keys: NULL
          key: k_c
      key_len: 10
          ref: NULL
         rows: 16308
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.01 sec)

mysql> select * from t1 where (c1,c2) < (2,2);
+----+------+------+
| id | c1   | c2   |
+----+------+------+
|  1 |    1 |    1 |
+----+------+------+
1 row in set (0.13 sec)

So, there is a big gap betweent the performance of the 2 equivalent statements.
 

Suggested fix:
Optimize the query `select * from t1 where (c1,c2) < (2,2);`, range access type should be used.
[11 Aug 2022 9:21] MySQL Verification Team
Hello Brian Yue,

Thank you for the report and feedback.

regards,
Umesh
[26 Aug 2024 10:47] tianfeng li
Contribute by Tencent : support row value decomposition for comparison (based on 8.4.2)

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-support-row-value-decomposition-for-comparison.patch (application/octet-stream, text), 34.54 KiB.

[26 Aug 2024 10:56] tianfeng li
Dear experts,

I hope this message finds you well.

As the contribution patch above, I have developed this patch to addresses this issue and I believe it could be beneficial for the wider community. The patch ensures that row values can be decomposed efficiently and effectively, enhancing the overall optimization and execution.

I have taken care to ensure that the patch adheres to the project's coding standards and conventions, and I have thoroughly documented the problem and the solution in the patch notes.

I would highly appreciate it if you could consider my patch for inclusion in the official version. I am open to any feedback or suggestions you may have to improve it further. I am committed to contributing to this project and would be more than willing to make any necessary revisions based on your guidance.

Thank you for your time and consideration. I look forward to hearing from you soon.

Best regards,

Tianfeng Li, From Tencent Company.
[5 Sep 2024 3:40] tianfeng li
Fix a bug of the above patch: consider NULL in select_list

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-row-comparison-decomposition-for-SelectList-should-n.patch (application/octet-stream, text), 7.97 KiB.

[30 Oct 2024 6:13] tianfeng li
Hi, it happened again as BUG#111952