| Bug #103869 | Duplicate computation of aggregate function when referred in HAVING clause | ||
|---|---|---|---|
| Submitted: | 31 May 2021 14:29 | Modified: | 2 Jun 2021 17:08 |
| Reporter: | Øystein Grøvlen | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 8.0.24, 8.0.25 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[1 Jun 2021 12:39]
MySQL Verification Team
Hi Mr. Øystein Grøvlen, Thank you very much for your bug report. It does sound like a very interesting performance improvement report, but we must be able to repeat that behaviour, with 8.0.25. Can you be so kind and send us a repeatable test case ??? Thanks a lot in advance.
[1 Jun 2021 12:57]
Øystein Grøvlen
AFAICT, the only difference between 8.0.24 and 8.0.25 is
Bug#32717969: Wrong result of execute prepare on 'smallint'
Do you expect that to have fixed this issue?
[1 Jun 2021 13:13]
MySQL Verification Team
Hi, It is very hard to say, so a test case is truly important.
[1 Jun 2021 14:28]
Øystein Grøvlen
I can confirm that the issue also exists in 8.0.25
[2 Jun 2021 12:19]
MySQL Verification Team
Hi Mr. Grovlen, We truly believe you, but we do require a reproducible test case.
[2 Jun 2021 15:14]
Øystein Grøvlen
So the original test case in "How-to-repeat" is not not sufficient? EXPLAIN shows that the sum function is computed twice.
[2 Jun 2021 17:07]
Steinar Gunderson
The test case is sufficient.

Description: If an aggregate function is referred both in the SELECT list and in the HAVING clause, it is computed twice: mysql [localhost:8024] {msandbox} (world) > explain format=tree select countrycode, sum(population) from city group by countrycode having sum(population) > 10000000\G *************************** 1. row *************************** EXPLAIN: -> Filter: (sum(city.Population) > 10000000) -> Group aggregate: sum(city.Population), sum(city.Population) -> Index scan on city using CountryCode (cost=414.15 rows=4079) Note the line form "Group aggregate". This does not happen for ORDER BY: mysql [localhost:8024] {msandbox} (world) > explain format=tree select countrycode, sum(population) from city group by countrycode order by sum(population)\G *************************** 1. row *************************** EXPLAIN: -> Sort: `sum(population)` -> Stream results -> Group aggregate: sum(city.Population) -> Index scan on city using CountryCode (cost=414.15 rows=4079) For the following variant of TPC-H Q18 (scale factor 10), this leads to a 20% increase in execution time: SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, l2.qty FROM customer, orders, ( SELECT l_orderkey, SUM(l_quantity) qty FROM lineitem GROUP BY l_orderkey HAVING SUM(l_quantity) > 312 ) l2 WHERE o_orderkey = l2.l_orderkey AND c_custkey = o_custkey GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice ORDER BY o_totalprice DESC, o_orderdate LIMIT 100; compared to SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, l2.qty FROM customer, orders, ( SELECT l_orderkey, SUM(l_quantity) qty FROM lineitem GROUP BY l_orderkey HAVING qty > 312 ) l2 WHERE o_orderkey = l2.l_orderkey AND c_custkey = o_custkey GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice ORDER BY o_totalprice DESC, o_orderdate LIMIT 100; How to repeat: explain format=tree select countrycode, sum(population) from city group by countrycode having sum(population) > 10000000\G explain format=tree select countrycode, sum(population) from city group by countrycode order by sum(population)\G