Bug #91139 use index dives less often
Submitted: 4 Jun 2018 20:37 Modified: 12 Jan 2021 19:39
Reporter: Mark Callaghan Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.11 OS:Any
Assigned to: CPU Architecture:Any

[4 Jun 2018 20:37] Mark Callaghan
Description:
A larger value for eq_range_index_dive_limit can ruin QPS for in-memory workloads.

One of the sysbench tests that I run is oltp_inlist_select.lua for a database that fits in the InnoDB buffer pool. I run it twice -- first with a PK index (pk1) and then with a secondary index (pk0) on the id column. QPS for one connection is below. The reduction from pk1 to pk0 is much larger for 5.7.17 and 8.0.11

QPS ratios (QPS with secondary / QPS with PK) for 5.6.35 and 8.0.11
2555/4001
.638 --> 5.6.35

1206/3628
.332 --> 8.0.11

hot-points.pk1
qps   release
4001  5.6.35
3845  5.7.17
3628  8.0.11

hot-points.pk0
qps   release
2555  5.6.35
1206  5.7.17
1123  8.0.11

Repeats with my "random-points" test

QPS ratios (QPS with secondary / QPS with PK) for 5.6.35 and 8.0.11

1819 / 3101
.586 --> 5.6.35

760 / 2921
.260 --> 8.0.11

random-points.pk1
3101  5.6.35
3024  5.7.17
2921  8.0.11

random-points.pk0
1819  5.6.35
 784  5.7.17
 760  8.0.11

see http://smalldatum.blogspot.com/2017/02/using-modern-sysbench-to-compare.html for details on how I run sysbench

The root cause is change in the default for eq_range_index_dive_limit from 10 to 200 in 5.7

How to repeat:
Using sysbench from https://github.com/mdcallag/sysbench

sysbench --db-driver=mysql --mysql-user=root --mysql-password=pw --mysql-host=127.0.0.1 --mysql-db=test --mysql-storage-engine=innodb --range-size=100 --table-size=2000000 --tables=2 --threads=1 --events=0 --time=600 --rand-type=uniform --random-points=100 /sysbench10/share/sysbench/oltp_inlist_select.lua prepare

sysbench --db-driver=mysql --mysql-user=root --mysql-password=pw --mysql-host=127.0.0.1 --mysql-db=test --mysql-storage-engine=innodb --range-size=100 --table-size=2000000 --tables=2 --threads=1 --events=0 --time=600 --rand-type=uniform --random-points=100 /sysbench10/share/sysbench/oltp_inlist_select.lua run

Suggested fix:
Be careful about using so many index dives. In this case from SHOW INDEXES and SHOW TABLE STATUS I see (and the optimizer can also see) that the secondary index on the id column is (almost) unique so I don't understand the benefit of doing the index dives.

CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  KEY `xid` (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=2000001 DEFAULT CHARSET=latin1;

bin/mysql test -e 'explain SELECT c
             FROM sbtest1
             WHERE id IN (1000001, 1000002, 1000003, 1000004, 1000005, 1000006, 1000007, 1000008, 1000009, 1000010,
             1000011, 1000012, 1000013, 1000014, 1000015, 1000016, 1000017, 1000018, 1000019, 1000020, 
             1000021, 1000022, 1000023, 1000024, 1000025, 1000026, 1000027, 1000028, 1000029, 1000030,
             1000031, 1000032, 1000033, 1000034, 1000035, 1000036, 1000037, 1000038, 1000039, 1000040,
             1000041, 1000042, 1000043, 1000044, 1000045, 1000046, 1000047, 1000048, 1000049, 1000050,
             1000051, 1000052, 1000053, 1000054, 1000055, 1000056, 1000057, 1000058, 1000059, 1000060,
             1000061, 1000062, 1000063, 1000064, 1000065, 1000066, 1000067, 1000068, 1000069, 1000070,
             1000071, 1000072, 1000073, 1000074, 1000075, 1000076, 1000077, 1000078, 1000079, 1000080,
             1000081, 1000082, 1000083, 1000084, 1000085, 1000086, 1000087, 1000088, 1000089, 1000090,
             1000091, 1000092, 1000093, 1000094, 1000095, 1000096, 1000097, 1000098, 1000099, 1000100)'

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sbtest1
   partitions: NULL
         type: range
possible_keys: xid
          key: xid
      key_len: 4
          ref: NULL
         rows: 100
     filtered: 100.00
        Extra: Using index condition

bin/mysql test -e 'show indexes from sbtest1\G'
mysql: [Warning] Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
        Table: sbtest1
   Non_unique: 1
     Key_name: xid
 Seq_in_index: 1
  Column_name: id
    Collation: A
  Cardinality: 1921330
     Sub_part: NULL
       Packed: NULL
         Null: 
   Index_type: BTREE
      Comment: 
Index_comment: 
      Visible: YES

table status

*************************** 1. row ***************************
           Name: sbtest1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 1921330
 Avg_row_length: 240
    Data_length: 462372864
Max_data_length: 0
   Index_length: 32047104
      Data_free: 5242880
 Auto_increment: 2000001
    Create_time: 2018-06-04 11:37:20
    Update_time: 2018-06-04 11:37:20
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: 
        Comment:
[5 Jun 2018 13:08] MySQL Verification Team
Hello Mark,

Thank you for the report.
Verifying as a feature request. I built your sysbench's fork but noticed that the lua script(oltp_inlist_select.lua) used in your report is missing from the build. May I request you to please share it so that I can check this at my end? For now, I just created schema(sbest1 - with PK, with secondary index etc) and populated data  to capture explain SELECT..IN(), SHOW TABLE STATUS\G, SHOW INDEX FROM etc results: 

=============== 8.0.11
mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  KEY `k_1` (`k`),
  KEY `id_index` (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=2000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql> explain SELECT c              FROM sbtest1              WHERE id IN (1000001, 1000002, 1000003, 1000004, 1000005, 1000006, 1000007, 1000008, 1000009, 1000010,              1000011, 1000012, 1000013, 1000014, 1000015, 1000016, 1000017, 1000018, 1000019, 1000020,               1000021, 1000022, 1000023, 1000024, 1000025, 1000026, 1000027, 1000028, 1000029, 1000030,              1000031, 1000032, 1000033, 1000034, 1000035, 1000036, 1000037, 1000038, 1000039, 1000040,              1000041, 1000042, 1000043, 1000044, 1000045, 1000046, 1000047, 1000048, 1000049, 1000050,              1000051, 1000052, 1000053, 1000054, 1000055, 1000056, 1000057, 1000058, 1000059, 1000060,              1000061, 1000062, 1000063, 1000064, 1000065, 1000066, 1000067, 1000068, 1000069, 1000070,              1000071, 1000072, 1000073, 1000074, 1000075, 1000076, 1000077, 1000078, 1000079, 1000080,              1000081, 1000082, 1000083, 1000084, 1000085, 1000086, 1000087, 1000088, 1000089, 1000090,              1000091, 1000092, 1000093, 1000094, 1000095, 1000096, 1000097, 1000098, 1000099, 1000100);
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
| id | select_type | table   | partitions | type  | possible_keys | key      | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | sbtest1 | NULL       | range | id_index      | id_index | 4       | NULL |  100 |   100.00 | Using index condition |
+----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+

mysql> show table status\G
*************************** 1. row ***************************
           Name: sbtest1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 1922110
 Avg_row_length: 233
    Data_length: 449724416
Max_data_length: 0
   Index_length: 0
      Data_free: 5242880
 Auto_increment: 2000001
    Create_time: 2018-06-05 09:48:05
    Update_time: 2018-06-05 09:42:26
     Check_time: NULL
      Collation: utf8mb4_0900_ai_ci
       Checksum: NULL
 Create_options:
        Comment:

		mysql> show indexes from sbtest1\G
.

        Table: sbtest1
   Non_unique: 1
     Key_name: id_index
 Seq_in_index: 1
  Column_name: id
    Collation: A
  Cardinality: 1921601
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
      Visible: YES

Regards,
Umesh
[5 Jun 2018 14:00] Mark Callaghan
Sorry for the confusion. It is in my 1.0 branch, not master... https://github.com/mdcallag/sysbench/tree/1.0/src/lua
[5 Jun 2018 14:01] Mark Callaghan
In this case there is only one possible index to use. The optimizer doesn't need to evaluate the query to figure out how to evaluate the query.
[12 Jan 2021 19:39] Mark Callaghan
Forgot that I filed this and then I rediscovered it. See:
http://smalldatum.blogspot.com/2021/01/sysbench-in-memory-innodb-mysql-56-57.html

My Lua scripts are here:
https://github.com/mdcallag/sysbench/tree/1.0/src/lua

And I filed a bug for it in December:
https://bugs.mysql.com/bug.php?id=102037
[13 Jan 2021 11:59] Øystein Grøvlen
My proposal https://bugs.mysql.com/bug.php?id=99996 would help here.

If histograms are used instead of index statistics when eq_range_index_dive_limit is exceeded, one could lower eq_range_index_dive_limit without the risk of basing the optimization on bad statistics for longer IN lists.