Bug #98055 MySQL Optimizer Bug not picking right index
Submitted: 23 Dec 2019 11:37 Modified: 23 Jan 2020 18:35
Reporter: sudheer gadipathi Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.7 OS:Red Hat (amezon RDS)
Assigned to: CPU Architecture:Any

[23 Dec 2019 11:37] sudheer gadipathi
Description:

Table structure:

 CREATE TABLE `ht` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `rid` bigint(20) DEFAULT NULL,
  `sm` tinyint(1) DEFAULT '0',
  `dd` tinyint(1) DEFAULT '0',
  `aid` bigint(20) NOT NULL DEFAULT '0',
  `did` bigint(20) DEFAULT NULL,
  `gid` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`,`aid`),
  UNIQUE KEY `index_ht_aid_and_did` (`aid`,`did`),
  KEY `index_ht_on_aid_and_gid` (`aid`,`gid`)
) ENGINE=InnoDB AUTO_INCREMENT=13129695630 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
/*!50100 PARTITION BY HASH (`aid`)
PARTITIONS 128 */;

Query:
SELECT   COUNT(*) FROM   ht f WHERE       ht.`aid` = 209058   AND (gid         IN  (0,6000200468))   and gid         IN  ('6000200468')   AND ht.sm = 0   AND ht.dd = 0   AND ht.sm = 0   AND ht.dd = 0 )\G

Problem : 

There is (aid,did) unique key and it is picking even there is no did column in the where clause. (aid,gid) columns are in where clause and there is an index with them but not picking it even we run analyze and optimize . 

Selecting wring index even there is no did in there where clause: 

mysql> explain
    -> SELECT
    ->   COUNT(*)
    -> FROM
    ->   ht
    -> WHERE
    ->   ht.`aid` = 209058
    ->   AND (gid
    ->         IN  (0,6000200468))
    ->   AND ( ht.gid
    ->         IN  ('6000200468')
    ->   AND ht.sm = 0
    ->   AND ht.dd = 0
    ->   AND ht.sm = 0
    ->   AND ht.dd = 0 )\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ht
   partitions: p34
         type: ref
possible_keys: index_ht_aid_and_did,index_ht_on_aid_and_gid
          key: index_ht_aid_and_did
      key_len: 8
          ref: const
         rows: 5641273
     filtered: 0.10
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

mysql> SELECT
    ->   COUNT(*)
    -> FROM
    ->   ht
    -> WHERE
    ->   ht.`aid` = 209058
    ->   AND (gid
    ->         IN  (0,6000200468))
    ->   AND ( ht.gid
    ->         IN  ('6000200468')
    ->   AND ht.sm = 0
    ->   AND ht.dd = 0
    ->   AND ht.sm = 0
    ->   AND ht.dd = 0 )\G
*************************** 1. row ***************************
COUNT(*): 5056319
1 row in set (30.59 sec)

With right index: 

mysql> explain
    -> SELECT
    ->   COUNT(*)
    -> FROM
    ->   ht force index (index_helpdesk_tickets_on_aid_and_gid)
    -> WHERE
    ->   ht.`aid` = 209058
    ->   AND (gid
    ->         IN  (0,6000200468))
    ->   AND ( ht.gid
    ->         IN  ('6000200468')
    ->   AND ht.sm = 0
    ->   AND ht.dd = 0
    ->   AND ht.sm = 0
    ->   AND ht.dd = 0 )\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ht
   partitions: p34
         type: ref
possible_keys: index_ht_on_aid_and_gid
          key: index_ht_on_aid_and_gid
      key_len: 17
          ref: const,const
         rows: 5641273
     filtered: 1.00
        Extra: Using index condition; Using where
1 row in set, 1 warning (0.01 sec)

mysql>  SELECT  COUNT(*) FROM   ht force index (index_ht_on_aid_and_gid) WHERE       ht.`aid` = 209058   AND (gid         IN  (0,6000200468))   Agid         IN  ('6000200468')   AND ht.sm = 0   AND ht.dd = 0   AND ht.sm = 0   AND ht.dd = 0 )\G
*************************** 1. row ***************************
COUNT(*): 5056319

1 row in set (9.77 sec)

How to repeat:
you can reproduce it with random values. 

Suggested fix:
It looks like there needs to fix this optimizer bug for this usecase.
[13 Jan 2020 13:19] MySQL Verification Team
Hi Mr. gadipathi,

Thank you for your bug report.

This looks like an optimiser imperfection in 5.7.

Before you send us data, could you check whether this bug was fixed in 8.0. You can use our latest version 8.0.19. There are so many optimiser bugs fixed between 5.7 and latest 8.0 that we do not feel like hunting for the already fixed bugs.

However, if you find that 8.0 performs equally bad, then, please do upload us the data. You can use "Files" tab on this page. If your data is too large for that, you will find precise instructions on how to upload your data to our SFTP site.

Our SFTP site is visible and accessible only to Oracle employees, so your data will be totally safe.

Thanks in advance.
[22 Jan 2020 11:12] sudheer gadipathi
Even same issue on MySQL 8.0 version 

mysql> SELECT   COUNT(*) FROM   ht WHERE       ht.`aid` = 209058   AND (gid        IN  (0,6000200468))   and gid         IN  ('6000200468')   AND ht.sm = 0   AND ht.dd = 0   AND ht.sm = 0   AND ht.dd = 0\G
*************************** 1. row ***************************
COUNT(*): 5056319
1 row in set (1 min 5.93 sec)

mysql> explain SELECT   COUNT(*) FROM   ht WHERE       ht.`aid` = 209058   AND (gid        IN  (0,6000200468))   and gid         IN  ('6000200468')   AND ht.sm = 0   AND ht.dd
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ht
   partitions: p34
         type: ref
possible_keys: index_ht_aid_and_did,index_helpdesk_tickets_on_aid_and_gid
          key: index_ht_aid_and_did
      key_len: 8
          ref: const
         rows: 5321153
     filtered: 0.10
        Extra: Using where
1 row in set, 1 warning (0.04 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.19    |
+-----------+
1 row in set (0.00 sec)
[22 Jan 2020 14:14] MySQL Verification Team
Hello Mr. gadipathi,

Seems that problem persists in 8.0. This could be a bug in the partitions handling. Probably without partitions, the right index would be chosen.

Anyway, would you be so kind to provide us with data for that table. Please, upload the data by using "Files" tab of this page. If data is too large, you can upload your compressed data to our SFTP site, as described in the "Files" tab.

Let us know when it is done.
[23 Jan 2020 6:55] sudheer gadipathi
uploaded the same data. 

file location  /support/incoming/ht_209058.sql.gz
[23 Jan 2020 13:33] MySQL Verification Team
Thank you.

I am analysing your test case now.
[23 Jan 2020 16:52] MySQL Verification Team
Hi Mr. gadipathi,

I have ran your queries 3 times each and took the average value.

Without FORCE INDEX a time is :	19m49.843s

With FORCE INDEX a time is: 9m21.780s

This is actually a small imperfection in our optimiser, but still it makes a bug.

Verified as reported.
[23 Jan 2020 18:35] sudheer gadipathi
Thanks for update. Please let us know once it is fixed in any patch
[23 Jan 2020 18:35] sudheer gadipathi
Thanks for the update. Please let us know once it is fixed in any patch
[24 Jan 2020 13:03] MySQL Verification Team
Just couple of additional notes.

The results are same regardless of whether table is partitioned or not. This is why testing took so much time.

Last, when this bug is fixed, this page will be updated by our Development, which means that you will be notified.