Bug #15131 On-disk temporary tables created for GROUP BY should use indexes
Submitted: 22 Nov 2005 11:44 Modified: 31 Mar 2009 17:03
Reporter: Domas Mituzas Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:4.1,5.0 OS:Any
Assigned to: CPU Architecture:Any

[22 Nov 2005 11:44] Domas Mituzas
Description:
Lack of indexes for on-disk temporary tables created by GROUP BY provide serious performance degradation. 

SELECT a,b,SUM(c),SUM(d) FROM e GROUP BY a,b

if it hits disk, becomes especially slow, as it has to do table scans for updates. 

Creating a temporary table with UNIQUE constraint on (a,b) and using

INSERT INTO tmp SELECT a,b,c,d FROM e 
ON DUPLICATE KEY UPDATE c=c+VALUES(c),d=d+VALUES(d)

might provide a performance boost even over simple GROUP BY.

How to repeat:
Use any big table, decrease tmp_table_size, apply GROUP BY. 

Suggested fix:
Use indexing for on-disk temporary GROUP BY tables
[18 Aug 2007 0:42] Igor Babaev
- There is no test case for this bug entry.
- It's not clear what the reporter complains of.

By the above reasons I move the case to 'Open'
[31 Mar 2009 17:03] Valeriy Kravchuk
This is not a bug, but a feature request:

"Use indexing for on-disk temporary GROUP BY tables"

That is, if index is not used for queries like:

select a, sum(b) from t group by a;

and temporary table used does not fit into memory, on-disk temporary table should be created with index on a.