Bug #46217 Wrong result on aggregate + JOIN + ORDER BY + LIMIT
Submitted: 16 Jul 2009 11:16
Reporter: Philip Stoev Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0,5.1,5.4 OS:Any
Assigned to: CPU Architecture:Any
Triage: Triaged: D2 (Serious)

[16 Jul 2009 11:16] Philip Stoev
Description:
Queries of the form:

SELECT  COUNT( OUTR .`datetime_key`  )
FROM C OUTR2  JOIN C OUTR  ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
ORDER  BY OUTR .`pk`
LIMIT 1;

Return a smaller value than expected. Note that the ORDER BY and LIMIT are totally redundant given that this is an aggregate query with no GROUP BY. Removing the ORDER BY and/or the LIMIT causes the query plan to change and the result to become correct.

How to repeat:
--disable_warnings
DROP TABLE IF EXISTS C;
--enable_warnings

CREATE TABLE `C` (
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  `time_key` time NOT NULL,
  `datetime_key` datetime NOT NULL,
  `datetime_nokey` datetime NOT NULL,
  PRIMARY KEY (`pk`),
  KEY `time_key` (`time_key`),
  KEY `datetime_key` (`datetime_key`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=latin1;
INSERT INTO `C` VALUES (1,'20:53:30','0000-00-00 00:00:00','0000-00-00 00:00:00'),(2,'00:00:00','2008-04-19 07:51:37','2008-04-19 07:51:37'),(3,'12:37:08','2006-06-03 00:00:00','2006-06-03 00:00:00'),(4,'00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00'),(5,'15:59:14','2000-10-03 15:17:43','2000-10-03 15:17:43'),(6,'04:43:51','2009-04-25 16:10:46','2009-04-25 16:10:46'),(7,'00:00:00','2005-01-11 03:31:23','2005-01-11 03:31:23'),(8,'09:51:25','0000-00-00 00:00:00','0000-00-00 00:00:00'),(9,'00:00:00','2000-03-07 00:00:00','2000-03-07 00:00:00'),(10,'00:00:00','2001-06-14 20:33:16','2001-06-14 20:33:16'),(11,'00:00:00','2005-03-06 05:45:38','2005-03-06 05:45:38'),(12,'18:50:55','0000-00-00 00:00:00','0000-00-00 00:00:00'),(13,'00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00'),(14,'08:22:36','2002-02-13 21:59:10','2002-02-13 21:59:10'),(15,'00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00'),(16,'09:37:21','0000-00-00 00:00:00','0000-00-00 00:00:00'),(17,'05:15:03','2001-12-23 00:00:00','2001-12-23 00:00:00'),(18,'20:11:43','2004-05-16 00:00:00','2004-05-16 00:00:00'),(19,'00:00:00','2004-04-02 00:00:00','2004-04-02 00:00:00'),(20,'00:00:00','2003-11-26 00:00:00','2003-11-26 00:00:00');

SELECT  COUNT( OUTR .`datetime_key`  )
FROM C OUTR2  JOIN C OUTR  ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
ORDER  BY OUTR .`pk`
LIMIT 1;

Suggested fix:
The optimizer needs to prune ORDER BY and LIMIT clauses as soon as possible if the query is an aggregate with no GROUP BY. This will prevent various overzealous optimizations which are a gray area, and are known to harbor bugs.
[16 Jul 2013 10:55] Hartmut Holzgraefe
Still reproducible with 5.6.12, requires that the date_time values in the test case are moved into the future due to the change in how TIME<->DATETIME comparisons work (by now using the current date as DATE part for TIME) introduced with 5.6.4

This is also only reproducible with an index on the TIME column and with the 'right' join order:

DROP TABLE IF EXISTS C;

CREATE TABLE `C` (
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  `time_key` time NOT NULL,
  `datetime_nokey` datetime NOT NULL,
  PRIMARY KEY (`pk`),
  KEY `time_key` (`time_key`)
);

INSERT INTO `C` VALUES ( 1,'15:59:14','2020-10-03 15:17:43');
INSERT INTO `C` VALUES ( 2,'04:43:51','2029-04-25 16:10:46');
INSERT INTO `C` VALUES ( 3,'08:22:36','2022-02-13 21:59:10');
INSERT INTO `C` VALUES ( 4,'09:22:36','2022-02-14 21:59:10');

-- see that we get 16 result rows 
SELECT  OUTR .`datetime_nokey`  
FROM C OUTR2  JOIN C OUTR  
ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
;

-- now use count -> still 16
SELECT  COUNT( OUTR .`datetime_nokey`  )
FROM C OUTR2  JOIN C OUTR  
ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
;

-- now add redundant ORDER BY and LIMIT -> 15
SELECT  COUNT( OUTR .`datetime_nokey`  )
FROM C OUTR2  JOIN  C OUTR  
ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
ORDER  BY OUTR .`pk`
LIMIT 1
;

-- the problem is with the join order, if we 
-- force it either way we get either 15 or 16

SELECT  COUNT( OUTR .`datetime_nokey`  )
FROM C OUTR2  STRAIGHT_JOIN  C OUTR  
ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
ORDER  BY OUTR .`pk`
LIMIT 1
;
SELECT  COUNT( OUTR .`datetime_nokey`  )
FROM C OUTR  STRAIGHT_JOIN  C OUTR2
ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
ORDER  BY OUTR .`pk`
LIMIT 1
;

-- now lets drop the index on time_key

ALTER TABLE C DROP KEY time_key;

-- we will now get a count of 16 in both cases

SELECT  COUNT( OUTR .`datetime_nokey`  )
FROM C OUTR2  STRAIGHT_JOIN  C OUTR  
ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
ORDER  BY OUTR .`pk`
LIMIT 1
;
SELECT  COUNT( OUTR .`datetime_nokey`  )
FROM C OUTR  STRAIGHT_JOIN  C OUTR2
ON OUTR2 .`time_key`  <= OUTR .`datetime_nokey`
ORDER  BY OUTR .`pk`
LIMIT 1
;
[16 Jul 2013 11:00] Hartmut Holzgraefe
Hmno, i minimized it too much, right now it works on 5.6.12 due to different execution plan chosen ... will redo the test case once more ...
[16 Jul 2013 11:10] Hartmut Holzgraefe
Ok, ignore me, it is reproducible on 5.1 and 5.5 but not on 5.6.12 anymore ... sorry for the noise ...