Bug #33087 DISTINCT with ORDER BY and LIMIT returns incorrect results
Submitted: 8 Dec 2007 17:22 Modified: 19 Dec 2007 21:02
Reporter: Jason Roth Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.0.45, 5.0.52, 5.1.22 OS:Any
Assigned to: Evgeny Potemkin CPU Architecture:Any
Tags: distinct, limit, orderby

[8 Dec 2007 17:22] Jason Roth
Description:
When selecting a distinct row, ordering by a different row, and limiting the result, mysql will "distinct" before ordering and limiting returning incorrect results.

How to repeat:
Consider a table of timestamped searches
|search|timestamp|
|foo   |000000001|
|bar   |000000002|
|you   |000000003|
|foo   |000000004|

The query to return the last 2 distinct searches is
SELECT DISTINCT `search`
FROM `searches`
ORDER BY `timestamp`
LIMIT 2

Which should return
|search|
|foo   |
|you   |

Instead it returns
|search|
|foo   |
|bar   |

since it eliminates duplicate rows before ordering and limiting.

Suggested fix:
A temporary workaround is to create a btree index on the column you intend to order by.  This will cause mysql to return the correct result.
[8 Dec 2007 18:01] MySQL Verification Team
Thank you for the bug report. Which exactly release version are you using?
if prior to 5.0.45 please try with this version otherwise please provide
the complete test script (create table plus insert commands). Thanks in
advance.
[8 Dec 2007 18:16] Jason Roth
I'm using v5.0.45.  Here is a test script

CREATE DATABASE `test`;
USE `test`;

CREATE TABLE `searches` (
   `search`       VARCHAR(45),
   `created_on`   TIMESTAMP
);

INSERT INTO `searches` VALUES ('foo', '2007-12-08 12:11:40');
INSERT INTO `searches` VALUES ('bar', '2007-12-08 12:11:41');
INSERT INTO `searches` VALUES ('you', '2007-12-08 12:11:42');
INSERT INTO `searches` VALUES ('foo', '2007-12-08 12:11:43');

/* returns an incorrect result */
SELECT DISTINCT `search`
FROM `searches`
ORDER BY `created_on` DESC
LIMIT 2;

/* add an index */
ALTER TABLE `searches` ADD INDEX `created_on_index` USING BTREE(`created_on`);

/* now it returns the correct result */
SELECT DISTINCT `search`
FROM `searches`
ORDER BY `created_on` DESC
LIMIT 2;
[8 Dec 2007 18:37] Sveta Smirnova
Thank you for the report.

Verified as described.
[9 Dec 2007 16:40] Peter Laursen
On my system (5.0.45, Win Vista32) the wrong result is another wrong result!

DROP table  if exists searches;

CREATE TABLE `searches` (
   `search`       VARCHAR(45),
   `created_on`   TIMESTAMP
);

INSERT INTO `searches` VALUES ('foo', '2007-12-08 12:11:40');
INSERT INTO `searches` VALUES ('bar', '2007-12-08 12:11:41');
INSERT INTO `searches` VALUES ('you', '2007-12-08 12:11:42');
INSERT INTO `searches` VALUES ('foo', '2007-12-08 12:11:43');

/* returns an incorrect result */
SELECT DISTINCT `search`
FROM `searches`
ORDER BY `created_on` DESC
LIMIT 2;
/*search
------
you   
bar  */

/* add an index */
ALTER TABLE `searches` ADD INDEX `created_on_index` USING BTREE(`created_on`);

/* now it returns the correct result */
SELECT DISTINCT `search`
FROM `searches`
ORDER BY `created_on` DESC
LIMIT 2;
/*search
------
foo   
you  */
[19 Dec 2007 19:33] Evgeny Potemkin
The correct result is:
+--------+
| search |
+--------+
| you    | 
| bar    | 
+--------+
The manual says that order by is evaluated last(except limit). So your query works as follows:
DISTINCT gathers foo/12:11:40, bar/12:11:41,you/12:11:42. When it tries to add foo/12:11:43 it sees that 'foo' is already there and skips it. After this ORDER BY sorts data set gathered by DISTINCT and then it is limited to 2 records.
To get latest records you should use this query:
SELECT DISTINCT `search` FROM (select * from `searches` ORDER BY `created_on` DESC) a LIMIT 2;
or:
SELECT search, max(created_on) FROM searches GROUP BY 1 ORDER BY 2 DESC LIMIT 2;
[19 Dec 2007 21:02] Jason Roth
I think the real issue is that adding an index causes mysql to return different results for the same query.
[20 Dec 2007 7:00] Evgeny Potemkin
Your query depends on the order of data in the table thus you get different results with and without index. To see that you can exchange 'foo' rows by inserting them in reversed order and you will see that your query will start to return the result you're expecting. Queries which I provided describe more exactly what should be done and in which order, thus they always return the result you're expect.
[31 Dec 2017 0:12] MySQL Verification Team
https://bugs.mysql.com/bug.php?id=88814 marked as duplicate of this ojne.