MySQL Bugs: #85771: The optimizer should ignore distinct for queries on the PK column(s)

Bug #85771	The optimizer should ignore distinct for queries on the PK column(s)
Submitted:	3 Apr 2017 21:47	Modified:	4 Apr 2017 15:12
Reporter:	monty solomon	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S4 (Feature request)
Version:		OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	5.7.12

Description:
The optimizer does not understand that the PK is unique and therefore the distinct is a no-op and can be ignored.

How to repeat:
CREATE TABLE `domain_crawled_times` (
  `domain` varchar(255) NOT NULL,
  `lastCrawledTimestamp` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`domain`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8

mysql> explain select count(distinct domain) from domain_crawled_times\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: domain_crawled_times
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 767
          ref: NULL
         rows: 24860758
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.01 sec)

mysql> explain select count(domain) from domain_crawled_times\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Select tables optimized away
1 row in set, 1 warning (0.00 sec)

Suggested fix:
Update optimizer code

Hi!

I fully agree with you and find it a very welcome feature request for our optimizer.

Verified.