MySQL Bugs: #46518: Removing ORDER BY from aggregate query changes result set (produces extra rows)

Bug #46518	Removing ORDER BY from aggregate query changes result set (produces extra rows)
Submitted:	2 Aug 2009 22:02	Modified:	25 Sep 2009 0:26
Reporter:	Patrick Crews	Email Updates:
Status:	Duplicate	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	5.0.85, 5.1, 6.0	OS:	Any
Assigned to:	Assigned Account	CPU Architecture:	Any
Tags:	Aggregate Function, azalea, order by, regression

Description:
Removing the ORDER BY clause from a simple query using aggregates drastically changes the result set.
With the ORDER BY:

SELECT  MAX( `pk`  ) field2  
FROM D  
ORDER  BY field2  , `pk`   ; 

produces 100 rows.

Without the ORDER BY:
SELECT  MAX( `pk`  ) field2  
FROM D  
;

Produces a single row

How to repeat:
/*!50400 SET SESSION optimizer_switch = 'index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on' */;
/*!50400 SET SESSION optimizer_use_mrr = '' */;
/*!50400 SET SESSION engine_condition_pushdown = 'ON' */;
/*!50400 SET SESSION join_cache_level = '' */;

# Begin test case for query 0

--disable_warnings
DROP TABLE IF EXISTS D;
--enable_warnings

CREATE TABLE `D` (
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pk`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=latin1;
INSERT INTO `D` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100);

 
SELECT  MAX( `pk`  ) field2  
FROM D  
ORDER  BY field2  , `pk`   ;

DROP TABLE D;

# End of test case for query 0

# Begin test case for query 1

--disable_warnings
DROP TABLE IF EXISTS D;
--enable_warnings

CREATE TABLE `D` (
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pk`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=latin1;
INSERT INTO `D` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100);

 
SELECT  MAX( `pk`  ) field2  
FROM D   /* TRANSFORM_OUTCOME_UNORDERED_MATCH */ ;

DROP TABLE D;

# End of test case for query 1

Suggested fix:
Ensure that equivalent queries produce consistent result sets.

As described in BUG#46328, the correct behavior is to return 1 record when a group function is used without a group by clause.

This may be a duplicate of BUG#46328

Bug is also reproducible in 5.1

SELECT  count( `pk`  ) field2  
FROM D  
ORDER  BY field2  , `pk`   ;

and 

SELECT  count( * ) field2  
...

also returns 100 records (which is wrong), while 

SELECT sum( 'pk' ) field2
...

returns a single record (which is correct)

Queries that contain aggregate functions but does not have a GROUP BY clause shall return one (or zero) record only (http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html)

Furthermore, for non-aggregate fields "...the server is free to return any value from the group, so the results are indeterminate unless all values are the same." (http://dev.mysql.com/doc/refman/5.1/en/group-by-hidden-columns.html)

Hence, an ORDER BY clause has no value if
  1) The query contains one or more aggregate functions, and
  2) The query does not contain a GROUP BY clause.

Suggested fix for this bug: optimize away ORDER BY for queries satisfying the above conditions.

Duplicate of bug#47280

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/84648

3133 Jorgen Loland	2009-09-25
      Bug#47280 - strange results from count(*) with order by multiple 
                  columns without where/group
      Bug#46518 - Removing ORDER BY from aggregate query changes result 
                  set (produces extra rows)
      
      Simple SELECT with implicit grouping returns many rows if the 
      ORDER BY clause contains the aggregate function from the SELECT
      list. Queries with implicit grouping should only return a single
      record.
      
      Function opt_sum_query is called if a query performs grouping but
      has no fields in the group list. The function tries to substitute
      count(), min() and max() aggregate functions with constants. In
      the queries that failed, opt_sum_query found a non-aggregate
      field and gave up the optimization. 
      
      However, non-aggregate fields can be substituted with a const in
      this case: When the select list contains non-aggregate fields not
      in the group list, the server is free to return any value from
      the group, so the value in such fields is indeterminate unless
      all values are the same.
     @ mysql-test/r/func_group.result
        Test for BUG#47280
     @ mysql-test/t/func_group.test
        Test for BUG#47280
     @ sql/opt_sum.cc
        Function opt_sum_func gave up const optimization of min/max/count if a non-aggregate field was encountered. This was incorrect since this optimization is only done if the group list is empty (hence, only zero or one row is returned), and the server is free to choose any of the field values in the group.
     @ sql/sql_select.cc
        Improved comment for why min/max/count can be optimize away