Description:
When performing the same aggregate operation (AVG) in a main query and a correlated subquery, then comparing the results, the comparison outcome is inconsistent with the expected logic. This inconsistency occurs because:
The aggregate result from the subquery (avg_price(2)) is stored in Item_field with reduced precision (e.g., 458.333333000 for Electronics category)
The aggregate result from the main query (avg_price(1)) is stored in Item_avg_field with higher precision (e.g., 458.333333333 for Electronics category)
The minor precision difference causes the equality comparison (avg_price(1) = avg_price(2)) to return false incorrectly, filtering out valid results.
This issue does not exist in version 9.x, but it is not a direct fix—version 9.x modified Item::split_sum_func2() to execute the subquery directly during result comparison (instead of precomputing avg_price(2) in the main query's aggregation phase), which bypassed the precision loss issue.
How to repeat:
DROP TABLE IF EXISTS test_having_alias;
CREATE TABLE test_having_alias (
id INT PRIMARY KEY,
category VARCHAR(50),
price DECIMAL(10,2)
);
INSERT INTO test_having_alias VALUES
(1, 'Electronics', 1500.00),
(2, 'Electronics', 50.00),
(3, 'Furniture', 300.00),
(4, 'Furniture', 800.00),
(5, 'Electronics', 80.00),
(6, 'Electronics', 400.00),
(7, 'Furniture', 200.00),
(8, 'Electronics', 600.00),
(9, 'Furniture', 1200.00),
(10, 'Electronics', 120.00);
-- Execute the query with correlated subquery comparison:
SELECT
category AS cat,
AVG(price) AS avg_price
FROM test_having_alias
GROUP BY cat
HAVING avg_price = (
SELECT AVG(price)
FROM test_having_alias
WHERE category = cat
)
ORDER BY cat ;
Observe the result:
Actual Result: Only 1 row (Furniture) is returned
plaintext
+-----------+------------+
| cat | avg_price |
+-----------+------------+
| Furniture | 625.000000 |
+-----------+------------+
Expected Result: 2 rows (Electronics + Furniture) (verified by running the base aggregate query without subquery comparison):
SELECT
category AS cat,
AVG(price) AS avg_price
FROM test_having_alias
GROUP BY cat;
+-------------+------------+
| cat | avg_price |
+-------------+------------+
| Electronics | 458.333333 |
| Furniture | 625.000000 |
+-------------+------------+
Suggested fix:
Modify the Arg_comparator::compare_decimal() function to compare decimal values only within the range of valid precision (ignore trailing zero differences beyond the effective precision of the decimal type) when performing equality checks on decimal numbers. This ensures that minor precision differences from different storage types (Item_field vs Item_avg_field) do not invalidate logically equal aggregate results.
Description: When performing the same aggregate operation (AVG) in a main query and a correlated subquery, then comparing the results, the comparison outcome is inconsistent with the expected logic. This inconsistency occurs because: The aggregate result from the subquery (avg_price(2)) is stored in Item_field with reduced precision (e.g., 458.333333000 for Electronics category) The aggregate result from the main query (avg_price(1)) is stored in Item_avg_field with higher precision (e.g., 458.333333333 for Electronics category) The minor precision difference causes the equality comparison (avg_price(1) = avg_price(2)) to return false incorrectly, filtering out valid results. This issue does not exist in version 9.x, but it is not a direct fix—version 9.x modified Item::split_sum_func2() to execute the subquery directly during result comparison (instead of precomputing avg_price(2) in the main query's aggregation phase), which bypassed the precision loss issue. How to repeat: DROP TABLE IF EXISTS test_having_alias; CREATE TABLE test_having_alias ( id INT PRIMARY KEY, category VARCHAR(50), price DECIMAL(10,2) ); INSERT INTO test_having_alias VALUES (1, 'Electronics', 1500.00), (2, 'Electronics', 50.00), (3, 'Furniture', 300.00), (4, 'Furniture', 800.00), (5, 'Electronics', 80.00), (6, 'Electronics', 400.00), (7, 'Furniture', 200.00), (8, 'Electronics', 600.00), (9, 'Furniture', 1200.00), (10, 'Electronics', 120.00); -- Execute the query with correlated subquery comparison: SELECT category AS cat, AVG(price) AS avg_price FROM test_having_alias GROUP BY cat HAVING avg_price = ( SELECT AVG(price) FROM test_having_alias WHERE category = cat ) ORDER BY cat ; Observe the result: Actual Result: Only 1 row (Furniture) is returned plaintext +-----------+------------+ | cat | avg_price | +-----------+------------+ | Furniture | 625.000000 | +-----------+------------+ Expected Result: 2 rows (Electronics + Furniture) (verified by running the base aggregate query without subquery comparison): SELECT category AS cat, AVG(price) AS avg_price FROM test_having_alias GROUP BY cat; +-------------+------------+ | cat | avg_price | +-------------+------------+ | Electronics | 458.333333 | | Furniture | 625.000000 | +-------------+------------+ Suggested fix: Modify the Arg_comparator::compare_decimal() function to compare decimal values only within the range of valid precision (ignore trailing zero differences beyond the effective precision of the decimal type) when performing equality checks on decimal numbers. This ensures that minor precision differences from different storage types (Item_field vs Item_avg_field) do not invalidate logically equal aggregate results.