Bug #105289 Optimizer choose index scan but not index range for query with range condition
Submitted: 21 Oct 2021 14:54 Modified: 22 Oct 2021 14:18
Reporter: Ze Yang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.7 OS:Any
Assigned to: CPU Architecture:Any

[21 Oct 2021 14:54] Ze Yang
Description:
For schema and query blow:
create table t(c1 int, c2 int, c3 int, index i_c2(c2));
select sum(c1) from t where c2 >=1 and c2 <= 3  group by c2 order by c2;

Optimizer may choose index scan on i_c2 but not index range even the query have range condition on index i_c2. This make the query performance bad. When the buffer pool can no load all the table, the performance is much worse than query with force index(i_c2) though they choose the same index.

```
EXPLAIN select sum(c1) from t where c2 >=1 and c2 <= 3  group by c2 order by c2;
id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	t	NULL	index	i_c2	i_c2	5	NULL	8291	37.05	Using where
```

The reason is:
Before test_if_skip_sort_order, optimizer choose table scan.
When test_if_skip_sort_order, test_if_cheaper_ordering choose index i_c2 as index i_c2 can skip order.

When test_quick_select as new index chosen, the force_quick_range is set. But force_quick_range in function test_quick_select just means consider the index range path even table scan cost is small. test_quick_select may not generate quick, final the optimizer generate a plan with index scan which performance is not good.

Another code error in 8.0.27 is that after set_quick, the type not changed. The access path may still use the old one.
```
2108 static bool test_if_skip_sort_order
 2445       /*
 2446         If tab->quick() pointed to another quick than save_quick, we would
 2447         lose access to it and leak memory.
 2448       */
 2449       assert(tab->quick() == save_quick || tab->quick() == nullptr);
 2450       tab->set_quick(qck);

```

How to repeat:
The query without force index choose index scan even query have range condition.
This query can more rows than query force the same index to generate range scan.

index_scan_not_range.test

```
create table t(c1 int, c2 int, c3 int, index i_c2(c2));
insert into t values(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4), (5, 5, 5), (6, 6, 6), (7, 7, 7), (8, 8,  8);

let $cnt=10;
while($cnt > 0) {
  insert into t select * from t;
  dec $cnt;
}

ANALYZE TABLE t;
EXPLAIN select sum(c1) from t where c2 >=1 and c2 <= 3  group by c2 order by c2;
EXPLAIN select sum(c1) from t force index(i_c2) where c2 >=1 and c2 <= 3  group by c2 order by c2;
DROP TABLE t;
```

index_scan_not_range.result

```
▽
create table t(c1 int, c2 int, c3 int, index i_c2(c2));
insert into t values(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4), (5, 5, 5), (6, 6, 6), (7, 7, 7), (8, 8,  8);
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
ANALYZE TABLE t;
Table Op  Msg_type  Msg_text
test.t  analyze status  OK
EXPLAIN select sum(c1) from t where c2 >=1 and c2 <= 3  group by c2 order by c2;
id  select_type table partitions  type  possible_keys key key_len ref rows  filtered  Extra
1 SIMPLE  t NULL  index i_c2  i_c2  5 NULL  8291  37.05 Using where
Warnings:
Note  1003  /* select#1 */ select sum(`test`.`t`.`c1`) AS `sum(c1)` from `test`.`t` where ((`test`.`t`.`c2` >= 1) and (`test`.`t`.`c2` <= 3)) group by `test`.`t`.`c2` order by `test`.`t`.`c2`
EXPLAIN select sum(c1) from t force index(i_c2) where c2 >=1 and c2 <= 3  group by c2 order by c2;
id  select_type table partitions  type  possible_keys key key_len ref rows  filtered  Extra
1 SIMPLE  t NULL  range i_c2  i_c2  5 NULL  3072  100.00  Using index condition
Warnings:
Note  1003  /* select#1 */ select sum(`test`.`t`.`c1`) AS `sum(c1)` from `test`.`t` FORCE INDEX (`i_c2`) where ((`test`.`t`.`c2` >= 1) and (`test`.`t`.`c2` <= 3)) group by `test`.`t`.`c2` order by `test`.`t`.`c2`DROP TABLE t;
```

Suggested fix:

The base code is 8.0.27 

diff --git a/sql/range_optimizer/range_optimizer.cc b/sql/range_optimizer/range_optimizer.cc
index 80d2f442c1d..9fe72117d10 100644
--- a/sql/range_optimizer/range_optimizer.cc
+++ b/sql/range_optimizer/range_optimizer.cc
@@ -402,8 +402,7 @@ static int fill_used_fields_bitmap(RANGE_OPT_PARAM *param,
                         for this join (so values can be assumed to be present).
                         Otherwise 0.
       limit             Query limit
-      force_quick_range Prefer to use range (instead of full table scan) even
-                        if it is more expensive.
+      force_quick_range Force consider range event the table scan cost little
       interesting_order The sort order the range access method must be able
                         to provide. Three-value logic: asc/desc/don't care
       table             The table to optimize over.
@@ -459,8 +458,8 @@ static int fill_used_fields_bitmap(RANGE_OPT_PARAM *param,
      code to pass it by reference if it doesn't.

    * In addition to force_quick_range other means can be (an usually are) used
-     to make this function prefer range over full table scan. Figure out if
-     force_quick_range is really needed.
+     to make this function consider range even table scan cost little. Figure
+     out if force_quick_range is really needed.

   RETURN
    -1 if impossible select (i.e. certainly no rows will be selected)
diff --git a/sql/sql_optimizer.cc b/sql/sql_optimizer.cc
index fd387aad2e8..951639104f9 100644
--- a/sql/sql_optimizer.cc
+++ b/sql/sql_optimizer.cc
@@ -2423,9 +2423,9 @@ static bool test_if_skip_sort_order(JOIN_TAB *tab, ORDER_with_src &order,
           0,  // empty table_map
           join->calc_found_rows ? HA_POS_ERROR
                                 : join->query_expression()->select_limit_cnt,
-          true,  // force quick range
+          false,
           order.order->direction, tab->table(), tab->skip_records_in_range(),
-          tab->condition(), &tab->needed_reg, &qck, tab->table()->force_index,
+          tab->condition(), &tab->needed_reg, &qck, true /* ignore_table_scan */,
           join->query_block);
       if (order_direction < 0 && tab->quick() != nullptr &&
           tab->quick() != save_quick) {
@@ -2448,6 +2448,14 @@ static bool test_if_skip_sort_order(JOIN_TAB *tab, ORDER_with_src &order,
       */
       assert(tab->quick() == save_quick || tab->quick() == nullptr);
       tab->set_quick(qck);
+      /*
+        Access method changed. This is after deciding join order
+        and access method for all other tables so the info
+        updated below will not have any effect on the execution
+        plan.
+      */
+      if (tab->quick())
+        tab->set_type(calc_join_type(tab->quick()->get_type()));
     }
     order_direction = best_key_direction;
     /*
[22 Oct 2021 13:55] MySQL Verification Team
Hi Mr. Yang,

Thank you for your bug report.

We were able to repeat it.

Thank you also for your patch. Although your patch is fine, there is an error in the comment. FORCE INDEX does NOT force table scan, as you wrote, but index scan. I have not found it documented in our manual.

Hence, it is quite possible that this is how it is supposed to work, in which case this bug might become a documentation bug.

Verified as reported.
[22 Oct 2021 14:18] Ze Yang
It's not a documentation bug.

The most important thing is:
There are index range conditions on the index. The optimizer choose the index but the access path is INDEX SCAN not index range. The index scan will have poor performance than index range.

When test_if_skip_sort_order and test_if_cheaper_ordering choose the index than table scan, the optimizer consider the range condition using quick records not index records.  test_if_skip_sort_order want to generate quick range.
[25 Oct 2021 12:20] MySQL Verification Team
Hi,

I agree with your comment. However, performance improvement repot can be feature request, depending on the amount of work needed. Hence, it will be decided by the team in charge of this category.