Bug #25130 Extremely Slow "Where in (Select... )" Subqueries GROUP BY / HAVING related
Submitted: 18 Dec 2006 5:46 Modified: 26 Mar 2012 19:41
Reporter: Ashleigh Gordon Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.0.34-BK, 5.0.27, 5.1, 5.5.0 OS:Windows (Windows, Sparc & Linux)
Assigned to: CPU Architecture:Any

[18 Dec 2006 5:46] Ashleigh Gordon
Description:
Similar to bug 4040, performance with subqueries with group by/having clauses is very slow. 

How to repeat:
Please see attached files for example tables and query, taken from bug 4040.
[18 Dec 2006 5:47] Ashleigh Gordon
Contains 3 files. Two to recreate tables and data, the other is an example of the problem query.

Attachment: Sample_SQL.zip (application/x-zip-compressed, text), 14.39 KiB.

[18 Dec 2006 22:46] Valeriy Kravchuk
Thank you for a problem report. Verified just as described with 5.0.34-BK on Linux:

mysql> explain extended Select count(key_id) from Patients where key_id in  ( S
ELECT DISTINCT Evals.Patient_ID FROM Evals INNER JOIN Patients ON Patients.Key_
ID = Evals.Patient_ID WHERE (Evals.Location_ID = 1) AND (Patients.Center_ID = 1
) and (Patients.key_id < 820)         GROUP BY Evals.Patient_ID         HAVING
     (COUNT(Evals.Patient_ID) > 0) )\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: Patients
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 8
          ref: NULL
         rows: 1200
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: Evals
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3600
        Extra: Using where; Using temporary; Using filesort
*************************** 3. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: Patients
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: test.Evals.Patient_ID
         rows: 1
        Extra: Using where
3 rows in set, 1 warning (0.00 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select count(`test`.`Patients`.`Key_ID`) AS `count(key_id)` from `test`.`Patients` where <in_optimizer>(`test`.`Patients`.`Key_ID`,<exists>
(select distinct `test`.`Evals`.`Patient_ID` AS `Patient_ID` from `test`.`Evals` join `test`.`Patients` where ((`test`.`Patients`.`Center_ID` = 1) and (`test`.`Evals`.`Location_id` = 1) and (`test`.`Patients`.`Key_ID` < 820) and (`test`.`Patients`.`Key_ID` = `test`.`Evals`.`Patient_ID`))
 group by `test`.`Evals`.`Patient_ID` having ((count(`test`.`Evals`.`Patient_ID`) > 0) and (<cache>(`test`.`Patients`.`Key_ID`) = <ref_null_helper>(
`test`.`Evals`.`Patient_ID`)))))
1 row in set (0.00 sec)

mysql> Select count(key_id) from Patients where key_id in  ( SELECT DISTINCT Ev
als.Patient_ID FROM Evals INNER JOIN Patients ON Patients.Key_ID = Evals.Patien
t_ID WHERE (Evals.Location_ID = 1) AND (Patients.Center_ID = 1) and (Patients.k
ey_id < 820)         GROUP BY Evals.Patient_ID         HAVING      (COUNT(Evals
.Patient_ID) > 0) )\G
*************************** 1. row ***************************
count(key_id): 819
1 row in set (50.03 sec)

mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.0.34-debug |
+--------------+
1 row in set (0.00 sec)

But, please, do not expect this problem to be fixed soon. Wait for MySQL 5.2. Work in progress.
[12 May 2008 14:21] Jeremy Pointer
Anyone else coming across this problem there is a way around it by using a named subquery as below:

Select count(key_id)
from patients
where key_id in 
(
SELECT evsub.Patient_ID
FROM (
SELECT COUNT(Evals.Patient_ID) as cnt, Evals.Patient_ID
 FROM Evals INNER JOIN	Patients ON Patients.Key_ID = Evals.Patient_ID
WHERE 
	(Evals.Location_ID = 1) 
	AND (Patients.Center_ID = 1) 
	AND (Patients.key_id< 820)
        GROUP BY Evals.Patient_ID
) as evsub
where evsub.cnt>1
)
[30 Nov 2009 19:12] Valeriy Kravchuk
This is fixed in 6.0.14:

77-52-1-11:6.0-codebase openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 6.0.14-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> Select count(key_id) from Patients
    -> where key_id in 
    -> (SELECT DISTINCT Evals.Patient_ID
    -> FROM Evals INNER JOIN
    -> Patients ON Patients.Key_ID = Evals.Patient_ID
    -> WHERE (Evals.Location_ID = 1) AND (Patients.Center_ID = 1) and (Patients.key_id< 820)
    -> GROUP BY Evals.Patient_ID
    -> HAVING COUNT(Evals.Patient_ID) > 0)
    -> ;
+---------------+
| count(key_id) |
+---------------+
|           819 |
+---------------+
1 row in set (0.10 sec)

mysql> explain Select count(key_id) from Patients
    -> where key_id in 
    -> (SELECT DISTINCT Evals.Patient_ID
    -> FROM Evals INNER JOIN
    -> Patients ON Patients.Key_ID = Evals.Patient_ID
    -> WHERE (Evals.Location_ID = 1) AND (Patients.Center_ID = 1) and (Patients.key_id< 820)
    -> GROUP BY Evals.Patient_ID
    -> HAVING COUNT(Evals.Patient_ID) > 0)
    -> ;
+----+-------------+----------+--------+---------------+---------+---------+-----------------------+------+----------------------------------------------+
| id | select_type | table    | type   | possible_keys | key     | key_len | ref                   | rows | Extra                                        |
+----+-------------+----------+--------+---------------+---------+---------+-----------------------+------+----------------------------------------------+
|  1 | PRIMARY     | Patients | index  | NULL          | PRIMARY | 8       | NULL                  | 1200 | Using where; Using index                     |
|  2 | SUBQUERY    | Evals    | ALL    | NULL          | NULL    | NULL    | NULL                  | 3600 | Using where; Using temporary; Using filesort |
|  2 | SUBQUERY    | Patients | eq_ref | PRIMARY       | PRIMARY | 8       | test.Evals.Patient_ID |    1 | Using index condition; Using where           |
+----+-------------+----------+--------+---------------+---------+---------+-----------------------+------+----------------------------------------------+
3 rows in set (0.00 sec) 

but NOT fixed in 5.5.0 or any other 5..x.y:

77-52-1-11:trunk openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.5.0-beta-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> Select count(key_id) from Patients
    -> where key_id in 
    -> (SELECT DISTINCT Evals.Patient_ID
    -> FROM Evals INNER JOIN
    -> Patients ON Patients.Key_ID = Evals.Patient_ID
    -> WHERE (Evals.Location_ID = 1) AND (Patients.Center_ID = 1) and (Patients.key_id< 820)
    -> GROUP BY Evals.Patient_ID
    -> HAVING COUNT(Evals.Patient_ID) > 0)
    -> ;
+---------------+
| count(key_id) |
+---------------+
|           819 |
+---------------+
1 row in set (1 min 6.71 sec)

mysql> explain Select count(key_id) from Patients
    -> where key_id in 
    -> (SELECT DISTINCT Evals.Patient_ID
    -> FROM Evals INNER JOIN
    -> Patients ON Patients.Key_ID = Evals.Patient_ID
    -> WHERE (Evals.Location_ID = 1) AND (Patients.Center_ID = 1) and (Patients.key_id< 820)
    -> GROUP BY Evals.Patient_ID
    -> HAVING COUNT(Evals.Patient_ID) > 0)
    -> ;
+----+--------------------+----------+--------+---------------+---------+---------+-----------------------+------+----------------------------------------------+
| id | select_type        | table    | type   | possible_keys | key     | key_len | ref                   | rows | Extra                                        |
+----+--------------------+----------+--------+---------------+---------+---------+-----------------------+------+----------------------------------------------+
|  1 | PRIMARY            | Patients | index  | NULL          | PRIMARY | 8       | NULL                  | 1200 | Using where; Using index                     |
|  2 | DEPENDENT SUBQUERY | Evals    | ALL    | NULL          | NULL    | NULL    | NULL                  | 3600 | Using where; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | Patients | eq_ref | PRIMARY       | PRIMARY | 8       | test.Evals.Patient_ID |    1 | Using where                                  |
+----+--------------------+----------+--------+---------------+---------+---------+-----------------------+------+----------------------------------------------+
3 rows in set (0.36 sec)
[26 Mar 2012 19:41] Paul DuBois
Noted in 5.6.5 changelog.

Several subquery performance issues were resolved through the
implementation of semi-join subquery optimization strategies.