Bug #120477 Index Not Used for HAVING Without GROUP BY or aggregate functions
Submitted: 14 May 15:50 Modified: 18 May 7:19
Reporter: jinhui lai Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:9.7.0 OS:Any
Assigned to: CPU Architecture:Any

[14 May 15:50] jinhui lai
Description:
Hi, MySQL developers. I found a missed optimization in MySQL.
When a SELECT statement uses a HAVING clause without any GROUP BY or aggregate functions, the condition is semantically equivalent to a WHERE clause. However, the optimizer does not transform the HAVING condition into a table-access filter that can exploit indexes. Instead, it performs a full table scan and applies the filter afterward, leading to orders‑of‑magnitude worse performance compared to using an equivalent WHERE.

How to repeat:
Steps to Reproduce:

DELIMITER //
CREATE PROCEDURE batch_insert_numbers()
BEGIN
    DECLARE i INT DEFAULT 1;
    DECLARE batch_size INT DEFAULT 1000;
    
    WHILE i <= 1000000 DO
        START TRANSACTION;
        WHILE i <= 1000000 AND batch_size > 0 DO
            INSERT INTO t0(c0) VALUES (i);
            SET i = i + 1;
            SET batch_size = batch_size - 1;
        END WHILE;
        COMMIT;
        SET batch_size = 1000;
    END WHILE;
END //
DELIMITER ;

CREATE TABLE t0(c0 INT PRIMARY KEY, c1 FLOAT);
CALL batch_insert_numbers(); -- insert into t0(c0) with values: 1-1000000

SELECT * FROM t0 WHERE t0.c0 = 1; -- result: {1|0}; time: 0.001 sec

EXPLAIN SELECT * FROM t0 WHERE t0.c0 = 1 \G
*************************** 1. row ***************************
EXPLAIN: -> Rows fetched before execution  (cost=0..0 rows=1)

SELECT * FROM t0 HAVING t0.c0 = 1;  -- result: {1|0}; time: 0.124 sec

EXPLAIN SELECT * FROM t0 HAVING t0.c0 = 1 \G
*************************** 1. row ***************************
EXPLAIN: -> Filter: (t0.c0 = 0)  (cost=100290 rows=998968)
    -> Table scan on t0  (cost=100290 rows=998968)

Suggested fix:
If a SELECT contains a HAVING clause but no GROUP BY clause and no aggregate functions in the HAVING condition, move all HAVING predicates to the WHERE clause. This is a safe, semantically correct transformation (SQL standard permits it; the result is unchanged). Once moved, the existing index selection logic will automatically choose the optimal access method (e.g., primary key lookup).
[18 May 7:19] Chaithra Marsur Gopala Reddy
Hi jinhui lai,

Thank you for the test case. Verified as described.
[18 May 10:26] Roy Lyseng
This is a minor issue because an observant developer should place all applicable filter predicates in the WHERE clause. The HAVING clause should only contain predicates that apply to aggregation performed as part of the grouping.