Bug #49360 | In some circumstances, GROUP BY WITH ROLLUP is faster then GROUP BY | ||
---|---|---|---|
Submitted: | 2 Dec 2009 17:53 | Modified: | 3 Apr 2011 0:21 |
Reporter: | Rene' Cannao' | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.0.66 , 5.0.84 , 5.0.84 , 5.1.41, 5.1.43 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[2 Dec 2009 17:53]
Rene' Cannao'
[2 Dec 2009 17:57]
Rene' Cannao'
Test case and some output
Attachment: test_case_49360.txt (text/plain), 8.44 KiB.
[3 Dec 2009 5:24]
Valeriy Kravchuk
I can confirm these results with recent 5.1.42 from bzr. Also, the speed of GROUP without ROLLUP is the same as of GROUP BY ... ORDER BY NULL (although filesort is NOT present in EXPLAIN results for it, unlike the plans for original two queries).
[4 Dec 2009 19:09]
Valeriy Kravchuk
With 5.1.43-debug on Mac OS X I've got: mysql> SELECT SQL_NO_CACHE COUNT(*) FROM tbl1 , tbl2 WHERE tbl1.str1 = tbl2.str1 GROUP BY -> tbl2.str3, tbl1.str2 WITH ROLLUP HAVING str2 IS NOT NULL AND str3 IS NOT NULL; ... 540000 rows in set (1 min 55.51 sec) mysql> explain SELECT SQL_NO_CACHE COUNT(*) FROM tbl1 , tbl2 WHERE tbl1.str1 = tbl2.str1 GROUP BY tbl2.str3, tbl1.str2 WITH ROLLUP HAVING str2 IS NOT NULL AND str3 IS NOT NULL; +----+-------------+-------+-------+---------------+------+---------+----------------+---------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+----------------+---------+----------------------------------------------+ | 1 | SIMPLE | tbl1 | index | str1 | str1 | 26 | NULL | 1048576 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | tbl2 | ref | str1 | str1 | 13 | test.tbl1.str1 | 4 | Using index | +----+-------------+-------+-------+---------------+------+---------+----------------+---------+----------------------------------------------+ 2 rows in set (0.00 sec) mysql> SELECT SQL_NO_CACHE COUNT(*) FROM tbl1 , tbl2 WHERE tbl1.str1 = tbl2.str1 GROUP BY -> tbl2.str3, tbl1.str2 HAVING str2 IS NOT NULL AND str3 IS NOT NULL; ... 540000 rows in set (4 min 4.71 sec) mysql> explain SELECT SQL_NO_CACHE COUNT(*) FROM tbl1 , tbl2 WHERE tbl1.str1 = tbl2.str1 GROUP BY tbl2.str3, tbl1.str2 HAVING str2 IS NOT NULL AND str3 IS NOT NULL; +----+-------------+-------+-------+---------------+------+---------+----------------+---------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+----------------+---------+----------------------------------------------+ | 1 | SIMPLE | tbl1 | index | str1 | str1 | 26 | NULL | 1048576 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | tbl2 | ref | str1 | str1 | 13 | test.tbl1.str1 | 4 | Using index | +----+-------------+-------+-------+---------------+------+---------+----------------+---------+----------------------------------------------+ 2 rows in set (0.00 sec) mysql> SELECT SQL_NO_CACHE COUNT(*) FROM tbl1 , tbl2 WHERE tbl1.str1 = tbl2.str1 GROUP BY tbl2.str3, tbl1.str2 HAVING str2 IS NOT NULL AND str3 IS NOT NULL ORDER BY NULL; ... 540000 rows in set (4 min 1.79 sec) mysql> explain SELECT SQL_NO_CACHE COUNT(*) FROM tbl1 , tbl2 WHERE tbl1.str1 = tbl2.str1 GROUP BY tbl2.str3, tbl1.str2 HAVING str2 IS NOT NULL AND str3 IS NOT NULL ORDER BY NULL; +----+-------------+-------+-------+---------------+------+---------+----------------+---------+------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+------+---------+----------------+---------+------------------------------+ | 1 | SIMPLE | tbl1 | index | str1 | str1 | 26 | NULL | 1048576 | Using index; Using temporary | | 1 | SIMPLE | tbl2 | ref | str1 | str1 | 13 | test.tbl1.str1 | 4 | Using index | +----+-------------+-------+-------+---------------+------+---------+----------------+---------+------------------------------+ 2 rows in set (0.00 sec)
[4 Dec 2009 19:32]
MySQL Verification Team
The mystery of faster ROLLUP is mostly resolved. The principal cause is that, when you have ROLLUP, temporary table is written sequentially, with no updates and without any searches of the grouped by column / expression values. The, all rows are read sequentially and grouped by on the fly. Without ROLLUP, aggregate queries have to assert distinctiveness and have to work with all summing functions, including MIN / MAX. Sometimes first approach is faster and sometimes second is faster, but GROUP BY, without WITH ROLLUP, has to maintain distinct values and work with all functions. It would be possible to create a case where aggregate queries with ROLLUP would perform worse.
[18 Dec 2009 19:42]
MySQL Verification Team
Some additional thoughts from me. The algorithm applied in the ROLLUP variant would be MUCH slower if there is a larger number of the average number of rows with same values of grouped by columns. Making optimization here seems to be very hard, as getting this average number would require the presence of one or several of suitable indices ... Even then, many other factors would have to be taken into account, like row width and maximum size for temporary MEMORY tables.
[4 Jan 2011 8:42]
MySQL Verification Team
show profiles all output for both queries
Attachment: bug49360_5.5.9_show_profiles.txt (text/plain), 18.16 KiB.
[29 Jun 2011 16:12]
MySQL Verification Team
the slower query spends all the time writing/updating records to disk. attached, cpu profiles.
Attachment: bug49360_5.5.13_cpu_hotspot_profile.txt (text/plain), 33.27 KiB.
[30 Oct 2011 5:59]
MySQL Verification Team
google perf tools for group by.
Attachment: 5.5.17_group_by.pdf (application/pdf, text), 22.74 KiB.
[30 Oct 2011 5:59]
MySQL Verification Team
google perf tools for group by with rollup.
Attachment: 5.5.17_group_by_with_rollup.pdf (application/pdf, text), 27.54 KiB.