Bug #108159 A query with user-defined variables always use the table scan.
Submitted: 17 Aug 2022 2:52 Modified: 18 Aug 2022 2:13
Reporter: biao li Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:MySQL 8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: Optimizer, table scan, User-Defined Variables

[17 Aug 2022 2:52] biao li
Description:
A query with user-defined variables maybe not use an index to scan records in MySQL 8.0(including MySQL 8.0.30), although there is a suitable index for the query condition.

Query in MySQL 8.0: select * from (SELECT @r AS _id, (SELECT @r := sup_local_code FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_local_code, (SELECT a.local_name FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_node_name, (SELECT a.lv FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_lv FROM (SELECT @r :='442000001000') as vars, p_locations  AS h) as t where t.sup_lv > -1;

The query always scans records using the table scan, which exhibits poor performance.

How to repeat:
CREATE TABLE `p_locations` (
  `id` varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '' COMMENT 'primary',
  `flag` varchar(6) DEFAULT NULL,
  `local_code` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL COMMENT 'code',
  `local_name` varchar(300) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL COMMENT 'name',
  `lv` int DEFAULT NULL COMMENT 'levle',
  `sup_local_code` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL COMMENT 'parent code',
  `url` varchar(60) DEFAULT NULL,
  `path` varchar(1000) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_p_locations_id` (`id`),
  KEY `index_p_locations_local_code` (`local_code`),
  KEY `index_p_locations_local_name` (`local_name`),
  KEY `index_p_locations_sup_local_code` (`sup_local_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8; 

In MySQL 8.0:
explain select * from (SELECT @r AS _id, (SELECT @r := sup_local_code FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_local_code, (SELECT a.local_name FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_node_name, (SELECT a.lv FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_lv FROM (SELECT @r :='442000001000') as vars, p_locations  AS h) as t where t.sup_lv > -1;

+----+--------------------+------------+------------+--------+------------------------------+------------------------------+---------+------+------+----------+----------------+
| id | select_type        | table      | partitions | type   | possible_keys                | key                          | key_len | ref  | rows | filtered | Extra          |
+----+--------------------+------------+------------+--------+------------------------------+------------------------------+---------+------+------+----------+----------------+
|  1 | PRIMARY            | <derived2> | NULL       | ALL    | NULL                         | NULL                         | NULL    | NULL |    2 |    50.00 | Using where    |
|  2 | DERIVED            | <derived6> | NULL       | system | NULL                         | NULL                         | NULL    | NULL |    1 |   100.00 | NULL           |
|  2 | DERIVED            | h          | NULL       | index  | NULL                         | index_p_locations_local_code | 93      | NULL |    1 |   100.00 | Using index    |
|  6 | DERIVED            | NULL       | NULL       | NULL   | NULL                         | NULL                         | NULL    | NULL | NULL |     NULL | No tables used |
|  5 | DEPENDENT SUBQUERY | a          | NULL       | ALL    | index_p_locations_local_code | NULL                         | NULL    | NULL |    1 |   100.00 | Using where    |
|  4 | DEPENDENT SUBQUERY | a          | NULL       | ALL    | index_p_locations_local_code | NULL                         | NULL    | NULL |    1 |   100.00 | Using where    |
|  3 | DEPENDENT SUBQUERY | a          | NULL       | ALL    | index_p_locations_local_code | NULL                         | NULL    | NULL |    1 |   100.00 | Using where    |
+----+--------------------+------------+------------+--------+------------------------------+------------------------------+---------+------+------+----------+----------------+

In MySQL 5.7:

mysql> explain select * from (SELECT @r AS _id, (SELECT @r := sup_local_code FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_local_code, (SELECT a.local_name FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_node_name, (SELECT a.lv FROM p_locations a WHERE a.local_code = _id limit 1) AS sup_lv FROM (SELECT @r:='442000001000') as vars, p_locations  AS h) as t where t.sup_lv > -1;
+----+--------------------+------------+------------+--------+------------------------------+------------------------------+---------+------+------+----------+-----------------------+
| id | select_type        | table      | partitions | type   | possible_keys                | key                          | key_len | ref  | rows | filtered | Extra                 |
+----+--------------------+------------+------------+--------+------------------------------+------------------------------+---------+------+------+----------+-----------------------+
|  1 | PRIMARY            | <derived2> | NULL       | ALL    | NULL                         | NULL                         | NULL    | NULL |    2 |    50.00 | Using where           |
|  2 | DERIVED            | <derived6> | NULL       | system | NULL                         | NULL                         | NULL    | NULL |    1 |   100.00 | NULL                  |
|  2 | DERIVED            | h          | NULL       | index  | NULL                         | index_p_locations_local_code | 93      | NULL |    1 |   100.00 | Using index           |
|  6 | DERIVED            | NULL       | NULL       | NULL   | NULL                         | NULL                         | NULL    | NULL | NULL |     NULL | No tables used        |
|  5 | DEPENDENT SUBQUERY | a          | NULL       | ref    | index_p_locations_local_code | index_p_locations_local_code | 93      | func |    1 |   100.00 | Using index condition |
|  4 | DEPENDENT SUBQUERY | a          | NULL       | ref    | index_p_locations_local_code | index_p_locations_local_code | 93      | func |    1 |   100.00 | Using index condition |
|  3 | DEPENDENT SUBQUERY | a          | NULL       | ref    | index_p_locations_local_code | index_p_locations_local_code | 93      | func |    1 |   100.00 | Using index condition |
+----+--------------------+------------+------------+--------+------------------------------+------------------------------+---------+------+------+----------+-----------------------+
[17 Aug 2022 2:53] biao li
optimze trace in MySQL 8.0

Attachment: trace.txt (text/plain), 41.40 KiB.

[17 Aug 2022 13:34] MySQL Verification Team
Hi Mr. li,

Thank you very much for your bug report.

However, it is not a bug.

When a table has a single row, it is much faster to scan table then to use index. That is a very well known fact. Actually, depending on the entity's definition, it is faster to scan 20 - 50 rows then to use index.

Not a bug.
[18 Aug 2022 2:13] biao li
Hi, in this commit, I only give table definitions to account for this.

In fact, in my release environment, I have hundreds of thousands of records in table p_locations.

For MySQL 5.7, the database only scans one record, but for MySQL 8.0, it scans one billion records.

Maybe later, I can give you a script to fill the records in the table.
[18 Aug 2022 11:44] MySQL Verification Team
Hi Mr. li,

No, there is no need for you to send us any script.

It would suffice that you send us the outputs from 5.7 and 8.0 with a commands:

EXPLAIN SELECT ....

EXPLAIN EXTENDED SELECT ....

EXPLAIN ANALYZE SELECT ....

Also, try to make a test case without user variable. Simply, MySQL does not guarantee when will user variable be evaluated, nor in which order it will be assigned. This caused problems in 5.7, which is why in 8.0 there were changes made in the optimiser.

Still, not a bug .....