| Bug #41994 | Optimizer do full table scan when use distinct with group by in subquery | ||
|---|---|---|---|
| Submitted: | 9 Jan 2009 11:15 | Modified: | 9 Mar 2015 18:58 |
| Reporter: | Ben Li | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.1.30, 5.1.31, 6.0.8 | OS: | Any (test on linux/windows) |
| Assigned to: | Evgeny Potemkin | CPU Architecture: | Any |
| Tags: | distinct subquery groupby | ||
[9 Jan 2009 11:16]
Ben Li
mysqldump file
Attachment: distinct_bug.sql (text/plain), 326.06 KiB.
[9 Jan 2009 16:46]
Valeriy Kravchuk
Thank you for a problem report. Verified just as described on your test data, also - with 6.0.8 on Windows:
C:\Program Files\MySQL\MySQL Server 6.0\bin>mysql -uroot -proot -P3311 test
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 28
Server version: 6.0.8-alpha-community MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> explain select ipid, count(distinct uid) from distinct_bug where defid =
3 and day
-> between '2008-10-01' and '2008-10-31' group by ipid \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: distinct_bug
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 7
ref: NULL
rows: 235
Extra: Using where; Using filesort
1 row in set (0.14 sec)
mysql> explain select * from (
-> select ipid, count(distinct uid) from distinct_bug where defid = 3 and da
y
-> between '2008-10-01' and '2008-10-31' group by ipid) tmp\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 198
Extra:
*************************** 2. row ***************************
id: 2
select_type: DERIVED
table: distinct_bug
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 6982
Extra: Using filesort
2 rows in set (0.06 sec)
Table's engine does NOT matter.
[22 Aug 2014 20:11]
Justin Swanhart
Not reproducible on 5.6.17
[22 Aug 2014 20:16]
Justin Swanhart
Not reproducible on 5.5.37 either
[9 Mar 2015 18:58]
Roy Lyseng
Bug is no longer reproducible.

Description: A simple table: CREATE TABLE `distinct_bug` ( `defid` int(10) unsigned NOT NULL, `day` date NOT NULL, `uid` bigint(20) unsigned NOT NULL, `start_min` smallint(5) unsigned NOT NULL, `ipid` int(10) unsigned NOT NULL, PRIMARY KEY (`defid`,`day`,`uid`,`start_min`) ) ENGINE=InnoDB; When do a query like: select count(distinct uid) from distinct_bug where defid = 3 and day between between '2008-10-01' and '2008-10-31'; MySQL use index correctlly, and it's a ranged scan. But when put the query int a subquery, MySQL do full table scan. select * from (select count(distinct uid) from distinct_bug where defid = 3 and day between between '2008-10-01' and '2008-10-31') tmp; This bug exists in innodb/myisam/falcon, in 5.1.30/5.1.31/6.0.5-windows. How to repeat: load the attachment distinct_bug.sql into test database mysql> select count(1) from distinct_bug \G count(1): 6474 mysql> select count(1) from distinct_bug where defid = 3 and day between '2008-10-01' and '2008-10-31' \G count(1): 235 Only 235 line in range mysql> desc select ipid, count(distinct uid) from distinct_bug where defid = 3 and day between '2008-10-01' and '2008-10-31' group by ipid \G id: 1 select_type: SIMPLE table: distinct_bug type: range possible_keys: PRIMARY key: PRIMARY key_len: 7 ref: NULL rows: 235 Extra: Using where; Using filesort mysql> desc select * from ( select ipid, count(distinct uid) from distinct_bug where defid = 3 and day between '2008-10-01' and '2008-10-31' group by ipid ) tmp \G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 198 Extra: *************************** 2. row *************************** id: 2 select_type: DERIVED table: distinct_bug type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 6931 Extra: Using filesort 2 rows in set (0.00 sec) It seems MySQL do a full table scan! mysql> desc select * from ( select ipid, count(uid) from distinct_bug where defid = 3 and day between '2008-10-01' and '2008-10-31' group by ipid ) tmp \G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 198 Extra: *************************** 2. row *************************** id: 2 select_type: DERIVED table: distinct_bug type: range possible_keys: PRIMARY key: PRIMARY key_len: 7 ref: NULL rows: 235 Extra: Using where; Using temporary; Using filesort 2 rows in set (0.00 sec) Without distinct, it's all right. I've to use sql below as a workaround desc select * from ( select ipid, count(uid) as uv from ( select ipid, uid from distinct_bug where defid = 3 and day between '2008-10-01' and '2008-10-31' group by uid ) uidg group by ipid having uv > 1 ) tmp \G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 13 Extra: *************************** 2. row *************************** id: 2 select_type: DERIVED table: <derived3> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 193 Extra: Using temporary; Using filesort *************************** 3. row *************************** id: 3 select_type: DERIVED table: distinct_bug type: range possible_keys: PRIMARY key: PRIMARY key_len: 7 ref: NULL rows: 235 Extra: Using where; Using temporary; Using filesort 3 rows in set (0.00 sec) At least it don't do a full table scan.