Bug #116710 explain shows wrong cost value
Submitted: 19 Nov 2024 12:47 Modified: 20 Nov 2024 2:55
Reporter: Woson Wong Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: cost, EXPLAIN FORMAT=JSON, Optimizer

[19 Nov 2024 12:47] Woson Wong
Description:
When i use explain format=json to get the execution plan for sql:
explain format=json select * from test_ddl  where  addr="hangzhou" order by gmt_create  limit 1;

Result:
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "2.15"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "test_ddl",
        "access_type": "index",
        "possible_keys": [
          "idx_addr_name_name1"
        ],
        "key": "idx_gmt_create",
        "used_key_parts": [
          "gmt_create"
        ],
        "key_length": "5",
        "rows_examined_per_scan": 2,
        "rows_produced_per_join": 14,
        "filtered": "35.90",
        "cost_info": {
          "read_cost": "0.75",
          "eval_cost": "1.40",
          "prefix_cost": "2.15",
          "data_read_per_join": "1K"
        },
        "used_columns": [
          "id",
          "addr",
          "name1",
          "name",
          "gmt_create"
        ],
        "attached_condition": "(`changmodb`.`test_ddl`.`addr` = 'hangzhou')"
      }
    }
  }
}

And if i use force index to explain:
explain format=json select * from test_ddl force index (idx_gmt_create) where  addr="hangzhou" order by gmt_create  limit 1;
Result:
{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "13.90"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "test_ddl",
        "access_type": "index",
        "key": "idx_gmt_create",
        "used_key_parts": [
          "gmt_create"
        ],
        "key_length": "5",
        "rows_examined_per_scan": 1,
        "rows_produced_per_join": 19,
        "filtered": "10.00",
        "cost_info": {
          "read_cost": "11.95",
          "eval_cost": "1.95",
          "prefix_cost": "13.90",
          "data_read_per_join": "2K"
        },
        "used_columns": [
          "id",
          "addr",
          "name1",
          "name",
          "gmt_create"
        ],
        "attached_condition": "(`changmodb`.`test_ddl`.`addr` = 'hangzhou')"
      }
    }
  }
}

Both queries use the same index (idx_gmt_create), but the cost values is different.

By the way, the OPTIMIZER TRACE shows that the "query_cost: 2.15" value is from the other index(idx_addr_name_name1), at the "considered_execution_plans" part:

How to repeat:
Table info:
CREATE TABLE `test_ddl` (
  `id` int NOT NULL AUTO_INCREMENT,
  `addr` varchar(10) NOT NULL DEFAULT 'hangzhou',
  `name1` varchar(10) NOT NULL DEFAULT 'a',
  `name` varchar(10) DEFAULT NULL,
  `gmt_create` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `idx_addr_name_name1` (`addr`),
  KEY `idx_name1` (`name1`),
  KEY `idx_gmt_create` (`gmt_create`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8mb4;

Data:
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (1,'hang111','a','','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (2,'hangzhou','a','','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (3,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (4,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (5,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (6,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (7,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (8,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (9,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (10,'hangzhou','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (11,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (12,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (13,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (14,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (15,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (16,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (17,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (18,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (19,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (20,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (21,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (22,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (23,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (24,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (25,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (26,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (27,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (28,'hang111','b','a','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (29,'hang111','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (30,'hang111','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (31,'hang111','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (32,'hang111','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (33,'hang111','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (34,'hang111','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (35,'hangzhou','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (36,'hangzhou','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (37,'hangzhou','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (38,'hangzhou','b1','a1','2024-11-19 16:32:01');
INSERT INTO `test_ddl` (`id`,`addr`,`name1`,`name`,`gmt_create`) VALUES (39,'hangzhou','b1','a1','2024-11-19 16:32:01');

Suggested fix:
EXPLAIN command should show the cost value of the chosen index.
[19 Nov 2024 12:48] Woson Wong
OPTIMIZER_TRACE result

Attachment: OPTIMIZER_TRACE.txt (text/plain), 8.29 KiB.

[19 Nov 2024 14:36] MySQL Verification Team
Hi Mr. Wong,

Thank you for your bug report.

This is truly a bug , but only a bug with the EXPLAIN in JSON format.

Other forms of EXPLAIN work just fine:

EXPLAIN
-> Limit: 1 row(s)  (cost=0.0421 rows=0.718)\n    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0421 rows=0.718)\n        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0421 rows=2)\n
EXPLAIN
-> Limit: 1 row(s)  (cost=0.0613 rows=0.1)\n    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0613 rows=0.1)\n        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0613 rows=1)\n
EXPLAIN
-> Limit: 1 row(s)  (cost=0.0421 rows=0.718) (actual time=7.16..7.16 rows=1 loops=1)\n    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0421 rows=0.718) (actual time=0.929..0.929 rows=1 loops=1)\n        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0421 rows=2) (actual time=0.033..0.034 rows=2 loops=1)\n
EXPLAIN
-> Limit: 1 row(s)  (cost=0.0613 rows=0.1) (actual time=0.0202..0.0204 rows=1 loops=1)\n    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0613 rows=0.1) (actual time=0.0196..0.0196 rows=1 loops=1)\n        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0613 rows=1) (actual time=0.0138..0.0172 rows=2 loops=1)\n

Hence, this is a very low priority bug.

Verified for 8.0 and all higher and supported versions.
[20 Nov 2024 2:55] Woson Wong
I test "explain FORMAT=TREE" and "explain analyze", also get different cost in force index situation.
Both queries use the same index, but get different cost values in explain result.

1.1 explain analyze select * from test_ddl force index (idx_gmt_create) where  addr="hangzhou" order by gmt_create  limit 1;
Result:
| -> Limit: 1 row(s)  (cost=0.0613 rows=0.1) (actual time=0.383..0.383 rows=1 loops=1)
    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0613 rows=0.1) (actual time=0.365..0.365 rows=1 loops=1)
        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0613 rows=1) (actual time=0.327..0.341 rows=2 loops=1)
 |

1.2 explain analyze select * from test_ddl where  addr="hangzhou" order by gmt_create  limit 1;
Result:
| -> Limit: 1 row(s)  (cost=0.0421 rows=0.718) (actual time=0.441..0.441 rows=1 loops=1)
    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0421 rows=0.718) (actual time=0.357..0.357 rows=1 loops=1)
        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0421 rows=2) (actual time=0.312..0.313 rows=2 loops=1)
 |

2.1 explain FORMAT=TREE select * from test_ddl force index (idx_gmt_create) where  addr="hangzhou" order by gmt_create  limit 1;
Result:
| -> Limit: 1 row(s)  (cost=0.0613 rows=0.1)
    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0613 rows=0.1)
        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0613 rows=1)
 |

2.2 explain FORMAT=TREE select * from test_ddl  where  addr="hangzhou" order by gmt_create  limit 1;
Result:
| -> Limit: 1 row(s)  (cost=0.0421 rows=0.718)
    -> Filter: (test_ddl.addr = 'hangzhou')  (cost=0.0421 rows=0.718)
        -> Index scan on test_ddl using idx_gmt_create  (cost=0.0421 rows=2)
 |
[20 Nov 2024 11:02] MySQL Verification Team
Thank you, Mr. Wong.