Bug #12887 | distinct vs rollup, are applied in inconsistent ways | ||
---|---|---|---|
Submitted: | 30 Aug 2005 16:39 | Modified: | 15 Sep 2005 19:35 |
Reporter: | Martin Friebe (Gold Quality Contributor) (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 4.1.14, 5.0.11, 4.1.15-BK | OS: | Linux (Linux, freebsd, Windows) |
Assigned to: | Evgeny Potemkin | CPU Architecture: | Any |
[30 Aug 2005 16:39]
Martin Friebe
[30 Aug 2005 16:46]
Martin Friebe
if you change the 2nd example (distinct after rollup) to a differnt order "group by b desc", the result set changes. select distinct a, max(b) from t1 group by b desc with rollup; +------+--------+ | a | max(b) | +------+--------+ | NULL | 2 | | a | 1 | | a | 2 | +------+--------+ 3 rows in set (0.00 sec)
[31 Aug 2005 9:30]
Valeriy Kravchuk
Thank you for your bug report. Verified both on 4.1.14 and 5.0.11. Yes, it is not clear form the socumentation when distinct shoud apply - before grouping or after. Looks like it is applied after... But, in any case the following: mysql> select distinct a, max(b) from t1 group by b with rollup; +------+--------+ | a | max(b) | +------+--------+ | a | 1 | | NULL | 2 | +------+--------+ 2 rows in set (0.00 sec) mysql> select distinct a, max(b) from t1 group by b desc with rollup; +------+--------+ | a | max(b) | +------+--------+ | NULL | 2 | | a | 1 | | a | 2 | +------+--------+ 3 rows in set (0.00 sec) mysql> select version(); +----------------+ | version() | +----------------+ | 5.0.11-beta-nt | +----------------+ 1 row in set (0.00 sec) is a bug.
[31 Aug 2005 9:40]
Valeriy Kravchuk
Verified on the latest 4.1.14-BK build on Linux also.
[14 Sep 2005 16:11]
Evgeny Potemkin
1) As mentioned in http://dev.mysql.com/doc/mysql/en/select.html "DISTINCT and DISTINCTROW are synonyms and specify that duplicate rows in the result set should be removed.". WITH ROLLUP is modifier of GROUP BY. Thus DISTINCT should be applied after ROLLUP. This is documented, but may be should be noted explicitly somewhere. 2) Example#1 works ok - duplicate produced by ROLLUP is removed. Example #2 - DISTINCT isn't applied - is the bug. Have to be fixed. 3) "select distinct a, max(b) from t1 group by 'X' with rollup;" - field 'a' treated by server as hidden field (http://dev.mysql.com/doc/mysql/en/group-by-hidden-fields.html) and it's result is undefined, because 'X' is constant grouping performed over whole table and with ROLLUP it produces result set ('a',2),('a',2). DISTINCTremoves duplicate rows and thus the final result set is ('a',2). So - everything is ok. 4) Changing "group by b" to "group by b desc" changes result set - here column 'a' is hidden and result of it is undefined in this case (documented in group-by-hidden-fields). Adding 'desc' changes actual value of undefined 'a' and thus result sets to which DISTINCT is being applied are differs. Because of this the queries produces different final result sets.
[14 Sep 2005 16:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/29855
[15 Sep 2005 18:07]
Evgeny Potemkin
For queries with GROUP BY and without hidden GROUP BY fields DISTINCT is optimized away becuase such queries produce result set without duplicates. But ROLLUP can add rows which may be same to some rows and this fact was ignored. Fixed in 4.1.15, cset 1.2433.2.1
[15 Sep 2005 19:25]
Evgeny Potemkin
Fixed in 5.0.14
[15 Sep 2005 19:35]
Paul DuBois
Noted in 4.1.15, 5.0.14 changelogs.