Bug #102193 Inaccurate execution plan selection
Submitted: 8 Jan 2021 12:21 Modified: 12 Jan 2021 8:14
Reporter: qichao zhou Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.22 OS:CentOS (CentOS Linux release 8.0.1905 (Core))
Assigned to: CPU Architecture:x86 (4.18.0-80.el8.x86_64 )

[8 Jan 2021 12:21] qichao zhou
Description:
  The cost value of that the execution plan for full index scan and the execution plan need return from table is close.The second scenario seems to ignore the cost of retrieving data from the clustered index.

mysql> show create table  zhouqch.test\G
*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id_bank` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT '本机机构号',
  `code_bank` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT '本机实际IFAR机构号',
  `name_bank` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构名称',
  `ename_bank` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构简称',
  `lvl_bank_cd` varchar(2) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构级别代码',
  `seqno_bank` varchar(4) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构序号',
  `stat_bank_cd` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构状态代码',
  `up_code_bank` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '上级机构号',
  `sum_ind` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '汇总标识',
  `enabe_dt` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '启用日期',
  `enable_ind` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '有效标识',
  `id_bank_0` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '最上层机构号',
  `name_bank_0` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '最上层机构名称',
  `id_bank_1` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第二层机构号',
  `name_bank_1` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第二层机构名称',
  `id_bank_2` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第三层机构号',
  `name_bank_2` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第三层机构名称',
  `id_bank_3` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第四层机构号',
  `name_bank_3` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第四层机构名称',
  `id_bank_4` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第五层机构号',
  `name_bank_4` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第五层机构名称',
  `id_bank_5` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第六层机构号',
  `name_bank_5` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第六层机构名称',
  `id_bank_6` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第七层机构号',
  `name_bank_6` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第七层机构名称',
  `id_bank_7` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第八层机构号',
  `name_bank_7` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第八层机构名称',
  `startdt` date DEFAULT NULL COMMENT '开始日期',
  `validdate` date DEFAULT NULL COMMENT '有效日期',
  `enddt` date DEFAULT NULL COMMENT '结束日期',
  `timestamp` datetime DEFAULT NULL COMMENT '处理的时间戳',
  PRIMARY KEY (`id_bank`),
  KEY `idx_org_tree_id_bank` (`id_bank`) USING BTREE,
  KEY `idx_org_tree_id_bank_3` (`id_bank_3`) USING BTREE,
  KEY `idx_org_tree_id_bank_2` (`id_bank_2`) USING BTREE,
  KEY `idx_org_tree_id_bank_1` (`id_bank_1`) USING BTREE,
  KEY `idx_org_tree_id_bank_4` (`id_bank_4`) USING BTREE,
  KEY `idx_org_tree_id_bank_5` (`id_bank_5`) USING BTREE,
  KEY `idx_org_tree_id_bank_6` (`id_bank_6`) USING BTREE,
  KEY `idx_org_tree_id_bank_7` (`id_bank_7`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci STATS_SAMPLE_PAGES=10000 ROW_FORMAT=DYNAMIC
1 row in set (0.01 sec)

mysql> 

mysql> explain analyze select * from zhouqch.test where id_bank_3='PCT60';
+------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                              |
+------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on test using idx_org_tree_id_bank_3 (id_bank_3='PCT60')  (cost=2461.55 rows=23408) (actual time=0.187..147.081 rows=28090 loops=1)
 |
+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.15 sec)

mysql> explain analyze select id_bank_3 from zhouqch.test where id_bank_3='PCT60';
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                            |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on test using idx_org_tree_id_bank_3 (id_bank_3='PCT60')  (cost=2447.45 rows=23408) (actual time=0.055..7.047 rows=28090 loops=1)
 |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

How to repeat:
mysql> show create table  zhouqch.test\G
*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id_bank` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT '本机机构号',
  `code_bank` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT '本机实际IFAR机构号',
  `name_bank` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构名称',
  `ename_bank` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构简称',
  `lvl_bank_cd` varchar(2) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构级别代码',
  `seqno_bank` varchar(4) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构序号',
  `stat_bank_cd` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '机构状态代码',
  `up_code_bank` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '上级机构号',
  `sum_ind` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '汇总标识',
  `enabe_dt` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '启用日期',
  `enable_ind` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '有效标识',
  `id_bank_0` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '最上层机构号',
  `name_bank_0` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '最上层机构名称',
  `id_bank_1` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第二层机构号',
  `name_bank_1` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第二层机构名称',
  `id_bank_2` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第三层机构号',
  `name_bank_2` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第三层机构名称',
  `id_bank_3` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第四层机构号',
  `name_bank_3` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第四层机构名称',
  `id_bank_4` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第五层机构号',
  `name_bank_4` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第五层机构名称',
  `id_bank_5` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第六层机构号',
  `name_bank_5` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第六层机构名称',
  `id_bank_6` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第七层机构号',
  `name_bank_6` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第七层机构名称',
  `id_bank_7` varchar(18) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第八层机构号',
  `name_bank_7` varchar(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '第八层机构名称',
  `startdt` date DEFAULT NULL COMMENT '开始日期',
  `validdate` date DEFAULT NULL COMMENT '有效日期',
  `enddt` date DEFAULT NULL COMMENT '结束日期',
  `timestamp` datetime DEFAULT NULL COMMENT '处理的时间戳',
  PRIMARY KEY (`id_bank`),
  KEY `idx_org_tree_id_bank` (`id_bank`) USING BTREE,
  KEY `idx_org_tree_id_bank_3` (`id_bank_3`) USING BTREE,
  KEY `idx_org_tree_id_bank_2` (`id_bank_2`) USING BTREE,
  KEY `idx_org_tree_id_bank_1` (`id_bank_1`) USING BTREE,
  KEY `idx_org_tree_id_bank_4` (`id_bank_4`) USING BTREE,
  KEY `idx_org_tree_id_bank_5` (`id_bank_5`) USING BTREE,
  KEY `idx_org_tree_id_bank_6` (`id_bank_6`) USING BTREE,
  KEY `idx_org_tree_id_bank_7` (`id_bank_7`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci STATS_SAMPLE_PAGES=10000 ROW_FORMAT=DYNAMIC
1 row in set (0.01 sec)

mysql> 

mysql> select id_bank_3,count(*) from zhouqch.test group by id_bank_3;
+-----------+----------+
| id_bank_3 | count(*) |
+-----------+----------+
| 0001265b  |    18718 |
| 04H414    |        8 |
| PCT60     |    28090 |
+-----------+----------+
3 rows in set (0.01 sec)

mysql> explain select * from zhouqch.test where id_bank_3='PCT60';
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------+
| id | select_type | table | partitions | type | possible_keys          | key                    | key_len | ref   | rows  | filtered | Extra |
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------+
|  1 | SIMPLE      | test  | NULL       | ref  | idx_org_tree_id_bank_3 | idx_org_tree_id_bank_3 | 75      | const | 23408 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> 
mysql> explain select id_bank_3 from zhouqch.test where id_bank_3='PCT60';
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys          | key                    | key_len | ref   | rows  | filtered | Extra       |
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------------+
|  1 | SIMPLE      | test  | NULL       | ref  | idx_org_tree_id_bank_3 | idx_org_tree_id_bank_3 | 75      | const | 23408 |   100.00 | Using index |
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> 

mysql> explain analyze select * from zhouqch.test where id_bank_3='PCT60';
+------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                              |
+------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on test using idx_org_tree_id_bank_3 (id_bank_3='PCT60')  (cost=2461.55 rows=23408) (actual time=0.187..147.081 rows=28090 loops=1)
 |
+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.15 sec)

mysql> explain analyze select id_bank_3 from zhouqch.test where id_bank_3='PCT60';
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                            |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on test using idx_org_tree_id_bank_3 (id_bank_3='PCT60')  (cost=2447.45 rows=23408) (actual time=0.055..7.047 rows=28090 loops=1)
 |
+----------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

Suggested fix:
  For the second scenario, the cost of retrieving the cluster index is added to the estimate of the execution plan, so that the correct execution plan is selected when most of the data in the table is scanned.
[11 Jan 2021 15:20] MySQL Verification Team
Hi Mr. zhou,

Thank you for your bug report.

However, we do not think that this is a bug.

Simply, as it is shown by your EXPLAIN ANALYZE, the cost for lookup of the values in the index, without accessing a row itself, is lower ...... Also, normal EXPLAIN clearly distinguishes between the two plan with the added comment of "Using index", which means that rows were not accessed at all !!!!!!

Not a bug.
[12 Jan 2021 8:14] qichao zhou
Hi,

I think, the cost for lookup of the values in the index which without accessing a row itself is lower but not enough。

Let's go to the other extreme Scenario. The PCT90 accounted for 99.9% in ID_BANK_3 field,but the execution plan still chooses to use indexes.

Let's go over the EXPLAIN ANALYZE for SELECT * FROM ZHOUQCH.TEST1 WHERE ID_BANK_3='PCT90', the ACTUAL TIME is 0.154,which is not accurate. The COST is 2461,it's only half as much as if the scenario didn't use the index. There seems to be something wrong with it.

mysql> select id_bank_3,count(*) from test1 group by id_bank_3;
+-----------+----------+
| id_bank_3 | count(*) |
+-----------+----------+
| 04H414    |        8 |
| PCT90     |    46808 |
+-----------+----------+
2 rows in set (0.01 sec)

mysql> select 46808/46816;
+-------------+
| 46808/46816 |
+-------------+
|      0.9998 |
+-------------+
1 row in set (0.00 sec)

mysql> explain select * from zhouqch.test1 where id_bank_3='PCT90';
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------+
| id | select_type | table | partitions | type | possible_keys          | key                    | key_len | ref   | rows  | filtered | Extra |
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------+
|  1 | SIMPLE      | test1 | NULL       | ref  | idx_org_tree_id_bank_3 | idx_org_tree_id_bank_3 | 75      | const | 23408 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------------+------------------------+---------+-------+-------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> explain analyze select * from zhouqch.test1 where id_bank_3='PCT90';
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                               |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on test1 using idx_org_tree_id_bank_3 (id_bank_3='PCT90')  (cost=2461.55 rows=23408) (actual time=0.154..311.541 rows=46808 loops=1)
 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.32 sec)

mysql> explain analyze select * from zhouqch.test1  ignore index(idx_org_tree_id_bank_3) where id_bank_3='PCT90';
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                  |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: (test1.id_bank_3 = 'PCT90')  (cost=4721.85 rows=23408) (actual time=0.129..248.786 rows=46808 loops=1)
    -> Table scan on test1  (cost=4721.85 rows=46816) (actual time=0.126..241.899 rows=46816 loops=1)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.25 sec)

mysql>
[12 Jan 2021 13:10] MySQL Verification Team
Hi,

There are two different algorithms applied when index is scanned and when a lookup is made which involves fetching the row.

That is why a difference, as it is, is correctly evaluated.
[14 Jan 2021 8:35] Yan Huang
Hi MySQL Verification Team,

From the first explain plan provided by Zhou, the plan cost of "select *" is bigger by ((2461.55-2447.45)/2447.45 =)0.5%, but the actual time is bigger by ((147.081-7.047)/7.047 =) 1987%.

I think the explain plan may miscalculate the cost of fetching rows from cluster index.
[14 Jan 2021 13:34] MySQL Verification Team
Hi,

We were not able to reproduce it. However, that situation is quite possible when cost of scanning the large index  is higher than the cost of fetching a row by index lookup.

So, yes, that is a realistic possibility.
[15 Jan 2021 3:03] Yan Huang
Hi MySQL Verification Team,
Thank you for your comment, I agree that the cost of scanning index may be higher than index lookup.

Let me reorganize my points:

1. "select id_bank_3 from zhouqch.test where id_bank_3='PCT60'": use a covering index, its cost = (cost of index scan). 
2. "select * from zhouqch.test where id_bank_3='PCT60'": its cost = (cost of index scan) + (cost of fetching row from cluster)
3. ((cost of step 2) - (cost of step 1))/(cost of step 1) = (cost of fetching row from cluster)/(cost of index scan)
4. In execution plan, the result of step 3 is 0.5%
5. In reality, the result of step 3 is 1987%
6. I think the plan is far away from the real situation, which may lead incorrect plan decision
7. I think the result of step 4 should be higher, in other words, (the cost of fetching row from cluster) in plan should be higher than it is
[15 Jan 2021 14:06] MySQL Verification Team
Hi,

This is, actually, one of the border cases where mistakes  are not costly.