MySQL Bugs: #108116: SQL queries with range of Item

Bug #108116	SQL queries with range of Item_row are not optimized
Submitted:	11 Aug 2022 8:21	Modified:	11 Aug 2022 9:21
Reporter:	Brian Yue (OCA)	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	8.0.29, 8.0.30, 5.7.39	OS:	Any (rhel-7.4)
Assigned to:		CPU Architecture:	Any (x86-64)
Tags:	Contribution

Description:
Hello, 
  Recently I find a case that if where_clause of a query has a range on Item_row, it's not optimized, index access type is used and lead to bad performance. For this case, range access type should be a better choice.
  Please reference to `How to repeat` for details.

How to repeat:
(1) Firstly, let's prepare some data (the following SQL statements can be executed directly)

create database mytest;
use mytest
create table t1 (id int primary key, c1 int, c2 int, key k_c (c1,c2));
insert into t1 values (1,1,1), (2,2,2), (3,3,3), (4,4,4);
insert into t1 select id + 4, c1 +4, c2 + 4 from t1;
insert into t1 select id + 8, c1 + 8, c2 + 8 from t1;
insert into t1 select id + 16, c1 + 16, c2 + 16 from t1;
insert into t1 select id + 32, c1 + 32, c2 + 32 from t1;
insert into t1 select id + 64, c1 + 64, c2 + 64 from t1;
insert into t1 select id + 128, c1 + 128, c2 + 128 from t1;
insert into t1 select id + 256, c1 + 256, c2 + 256 from t1;
insert into t1 select id + 512, c1 + 512, c2 + 512 from t1;
insert into t1 select id + 1024, c1 + 1024, c2 + 1024 from t1;
insert into t1 select id + 2048, c1 + 2048, c2 + 2048 from t1;
insert into t1 select id + 4096, c1 + 4096, c2 + 4096 from t1;
insert into t1 select id + 8192, c1 + 8192, c2 + 8192 from t1;

(2) Let's do some test

As is said in refman ( https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_greater-than-or... ):
For row comparisons, (a, b) > (x, y) is equivalent to:
(a > x) OR ((a = x) AND (b > y))

so the two statements is equivalent:
(a) select * from t1 where (c1,c2) < (2,2);
(b) select * from t1 where c1 < 2 OR (c1 = 2 and c2 < 2);

But they show different execution plan and performance:

(3) test the statement (a)
Statement (a) uses range access type and 2 rows are estimated to be accessed, and it costs 0.00 sec :

mysql> explain select * from t1 where c1 < 2 OR (c1 = 2 and c2 < 2) \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: range
possible_keys: k_c
          key: k_c
      key_len: 10
          ref: NULL
         rows: 2
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

mysql> select * from t1 where c1 < 2 OR (c1 = 2 and c2 < 2) ;
+----+------+------+
| id | c1   | c2   |
+----+------+------+
|  1 |    1 |    1 |
+----+------+------+
1 row in set (0.00 sec)

(3) test the statement (b)
Statement (b) uses range index type and 16308 rows are estimated to be accessed, and it costs 0.13 sec :

mysql> explain select * from t1 where (c1,c2) < (2,2) \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: index
possible_keys: NULL
          key: k_c
      key_len: 10
          ref: NULL
         rows: 16308
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.01 sec)

mysql> select * from t1 where (c1,c2) < (2,2);
+----+------+------+
| id | c1   | c2   |
+----+------+------+
|  1 |    1 |    1 |
+----+------+------+
1 row in set (0.13 sec)

So, there is a big gap betweent the performance of the 2 equivalent statements.
 

Suggested fix:
Optimize the query `select * from t1 where (c1,c2) < (2,2);`, range access type should be used.

Hello Brian Yue,

Thank you for the report and feedback.

regards,
Umesh

Contribute by Tencent : support row value decomposition for comparison (based on 8.4.2) (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-support-row-value-decomposition-for-comparison.patch (application/octet-stream, text), 34.54 KiB.

Dear experts,

I hope this message finds you well.

As the contribution patch above, I have developed this patch to addresses this issue and I believe it could be beneficial for the wider community. The patch ensures that row values can be decomposed efficiently and effectively, enhancing the overall optimization and execution.

I have taken care to ensure that the patch adheres to the project's coding standards and conventions, and I have thoroughly documented the problem and the solution in the patch notes.

I would highly appreciate it if you could consider my patch for inclusion in the official version. I am open to any feedback or suggestions you may have to improve it further. I am committed to contributing to this project and would be more than willing to make any necessary revisions based on your guidance.

Thank you for your time and consideration. I look forward to hearing from you soon.

Best regards,

Tianfeng Li, From Tencent Company.

Fix a bug of the above patch: consider NULL in select_list (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-row-comparison-decomposition-for-SelectList-should-n.patch (application/octet-stream, text), 7.97 KiB.

Hi, it happened again as BUG#111952