| Bug #4040 | Extremely Slow "Where in (Select... )" Subqueries GROUP BY / HAVING related | ||
|---|---|---|---|
| Submitted: | 7 Jun 2004 15:54 | Modified: | 8 Jun 2004 15:37 |
| Reporter: | Andrew Spencer | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server | Severity: | S2 (Serious) |
| Version: | 4.1.1a-alpha-nt | OS: | Windows (Windows 2003 Server) |
| Assigned to: | Oleksandr Byelkin | CPU Architecture: | Any |
[8 Jun 2004 11:55]
Alexander Keremidarski
Thank you for taking the time to report a problem. Unfortunately you are not using a current version of the product your reported a problem with -- the problem might already be fixed. Please download a new version from http://www.mysql.com/downloads/ If you are able to reproduce the bug with one of the latest versions, please change the version on this bug report to the version you tested and change the status back to "Open". Again, thank you for your continued support of MySQL. Additional info: Subqueries are still under development. Some of them are optimized in 4.1.2, but not all.
[8 Jun 2004 15:37]
Oleksandr Byelkin
This query is slow because we rewrite IN via EXISTS like subquery and it become dependent (if you issue EXPLAIN EXTENDEND <your select> and SHOW WARNING then you will see internal structure of query after optimisation). You can make it a bit faster by removing DISTINCT. in 5.0/5.1 we plan make hash/temporary table optimization for such queries and it will become fast enough.
[3 Jun 2005 0:24]
ypirc ypirc
I experience this same exact problem on 5.0.2-alpha.
[12 May 2008 14:23]
Jeremy Pointer
Anyone else coming across this problem there is a way around it by using a named subquery as below:
Select count(key_id)
from patients
where key_id in
(
SELECT evsub.Patient_ID
FROM (
SELECT COUNT(Evals.Patient_ID) as cnt, Evals.Patient_ID
FROM Evals INNER JOIN Patients ON Patients.Key_ID = Evals.Patient_ID
WHERE
(Evals.Location_ID = 1)
AND (Patients.Center_ID = 1)
AND (Patients.key_id< 820)
GROUP BY Evals.Patient_ID
) as evsub
where evsub.cnt>1
)

Description: Subqueries are unusably slow. Identical query in MSSQL runs in .04 seconds. In MySQl it takes 82 seconds on the same machine. (Appears to have something to do with the GROUP BY / HAVING Statement). Query is simply using a Subquery to populate a WHERE statement and pull separate details from another table. *Performance improves (.06 seconds instead of 80 seconds) when HAVING statement in subquery removed. *Performance also improves separately if INNER JOIN in in subquery removed (7.61 seconds instead of 80 seconds) How to repeat: Tables: 1: "Patients" (1200 records) Single record per Patient Key_ID (primary Key: autoincrement) Center_ID (int) 2: "Evals" (3600 records) Multiple records per Patient Eval_ID (primary Key: autoincrement) Location_id (int) Patient_ID (int) Query Syntax: Select count(key_id) from patients where key_id in ( SELECT DISTINCT Evals.Patient_ID FROM Evals INNER JOIN Patients ON Patients.Key_ID = Evals.Patient_ID WHERE (Evals.Location_ID = 3) AND (Patients.Center_ID = 3) and (Patients.key_id < 820) GROUP BY Evals.Patient_ID HAVING (COUNT(Evals.Patient_ID) > 0) )