Bug #27333 subquery grouped for aggregate of outer query / no aggregate of subquery
Submitted: 21 Mar 2007 14:29 Modified: 11 Jul 2007 22:21
Reporter: Martin Friebe (Gold Quality Contributor) (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.0.40/5.1 OS:Any (*)
Assigned to: Georgi Kodinov CPU Architecture:Any
Tags: aggregate, subquery

[21 Mar 2007 14:29] Martin Friebe
Description:
From my understanding of the SQL2003 draft: (section 6.9)

The aggregation query is the innermost qualifying query, for the columns referenced in the aggregate function.

In the query below, the aggregation query for "count(b)" is the outer query ("from t1").

The inner query ("from t2") is not an aggregation query.

Therefore the inner query should return "count(b)" for each row in t2. If t2 has more than 1 row, the whole query should fail with "subquery returned more than one row"

select a, (select count(b) from t2) from t1 group by a;

If you add "+0" to the "count(b)" it is returning the correct error.

How to repeat:
drop table if exists t1,t2;
create table t1 (a int, b int);
create table t2 (x int);
insert into t1 values (1,11), (2,22), (2,22);
insert into t2 values (1), (2);

# doesn't fail, but should
select a, count(b), (select count(b) from t2) from t1 group by a;
select a, count(b), (select count(b+0) from t2) from t1 group by a;

# fails as it should
select a, count(b), (select count(b)+0 from t2) from t1 group by a;

Suggested fix:
IMHO the subquery should not group inthe above case.

More important it should behave the same in the 2 cases given
[21 Mar 2007 15:16] MySQL Verification Team
Thank you for the bug report. Verified as described.
[29 Mar 2007 12:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/23290

ChangeSet@1.2422, 2007-03-29 15:15:11+03:00, gkodinov@magare.gmz +7 -0
  Bug#27333:
   The optimizer counts the aggregate functions that 
   appear as top level expressions (in all_fields) in 
   the current subquery. Later it makes a list of these
   that it uses to actually execute the aggregates in
   end_send_group().
   That count is used in several places as a flag whether
   there are aggregates functions.
   While collection the above info it must not consider
   aggregates that are not aggregated in the current 
   context. It must treat them as normal expressions 
   instead. Not doing that leads to incorrect data about
   the query, e.g. running a query that actually has no
   aggregate functions as if it has them (and hence is
   expected to return only one row).
   Fixed by ignoring the aggregates that are not aggregated
   in the current context. 
   One other smaller omission discovered and fixed in the 
   process : the place of aggregation was not calculated for
   user defined functions. Fixed by calling 
   Item_sum::init_sum_func_check() and 
   Item_sum::check_sum_func() as it's done for the rest of 
   the aggregate functions.
[25 Apr 2007 13:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25411

ChangeSet@1.2422, 2007-04-25 16:22:33+03:00, gkodinov@magare.gmz +7 -0
  Bug#27333:
   The optimizer counts the aggregate functions that 
   appear as top level expressions (in all_fields) in 
   the current subquery. Later it makes a list of these
   that it uses to actually execute the aggregates in
   end_send_group().
   That count is used in several places as a flag whether
   there are aggregates functions.
   While collecting the above info it must not consider
   aggregates that are not aggregated in the current 
   context. It must treat them as normal expressions 
   instead. Not doing that leads to incorrect data about
   the query, e.g. running a query that actually has no
   aggregate functions as if it has some (and hence is
   expected to return only one row).
   Fixed by ignoring the aggregates that are not aggregated
   in the current context. 
   One other smaller omission discovered and fixed in the 
   process : the place of aggregation was not calculated for
   user defined functions. Fixed by calling 
   Item_sum::init_sum_func_check() and 
   Item_sum::check_sum_func() as it's done for the rest of 
   the aggregate functions.
[18 May 2007 15:46] Anthony Willard
Here is another case that proves this bug.

drop table if exists t1;
create table t1( a int, b int );
insert into t1 values (1, 2), (1,3), (1,4), (2,1), (2,2);

-- returns no rows, when it should
select a1.a,
count(*)
from t1 a1
where a1.a = 1
and exists(
  select a2.a
  from t1 a2
  where a2.a = a1.a

)
group by a1.a;
[26 Jun 2007 14:53] Georgi Kodinov
Bug #29297 is a duplicate of this one
[28 Jun 2007 8:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29819

ChangeSet@1.2507, 2007-06-28 11:54:32+03:00, gkodinov@magare.gmz +7 -0
  Bug#27333: subquery grouped for aggregate of outer 
  query / no aggregate of subquery
   The optimizer counts the aggregate functions that 
   appear as top level expressions (in all_fields) in 
   the current subquery. Later it makes a list of these
   that it uses to actually execute the aggregates in
   end_send_group().
   That count is used in several places as a flag whether
   there are aggregates functions.
   While collecting the above info it must not consider
   aggregates that are not aggregated in the current 
   context. It must treat them as normal expressions 
   instead. Not doing that leads to incorrect data about
   the query, e.g. running a query that actually has no
   aggregate functions as if it has some (and hence is
   expected to return only one row).
   Fixed by ignoring the aggregates that are not aggregated
   in the current context. 
   One other smaller omission discovered and fixed in the 
   process : the place of aggregation was not calculated for
   user defined functions. Fixed by calling 
   Item_sum::init_sum_func_check() and 
   Item_sum::check_sum_func() as it's done for the rest of 
   the aggregate functions.
[29 Jun 2007 7:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29929

ChangeSet@1.2507, 2007-06-29 10:39:17+03:00, gkodinov@magare.gmz +7 -0
  Bug#27333: subquery grouped for aggregate of outer 
  query / no aggregate of subquery
   The optimizer counts the aggregate functions that 
   appear as top level expressions (in all_fields) in 
   the current subquery. Later it makes a list of these
   that it uses to actually execute the aggregates in
   end_send_group().
   That count is used in several places as a flag whether
   there are aggregates functions.
   While collecting the above info it must not consider
   aggregates that are not aggregated in the current 
   context. It must treat them as normal expressions 
   instead. Not doing that leads to incorrect data about
   the query, e.g. running a query that actually has no
   aggregate functions as if it has some (and hence is
   expected to return only one row).
   Fixed by ignoring the aggregates that are not aggregated
   in the current context. 
   One other smaller omission discovered and fixed in the 
   process : the place of aggregation was not calculated for
   user defined functions. Fixed by calling 
   Item_sum::init_sum_func_check() and 
   Item_sum::check_sum_func() as it's done for the rest of 
   the aggregate functions.
[1 Jul 2007 19:57] Bugs System
Pushed into 5.1.21-beta
[1 Jul 2007 20:01] Bugs System
Pushed into 5.0.46
[11 Jul 2007 22:21] Paul DuBois
Noted in 5.0.46, 5.1.21 changelogs.

Aggregations in subqueries that refer to outer query columns 
were not always correctly referenced to the proper outer query.