Bug #117348 Query with DISTINCT in combination with big_tables returns wrong sortorder
Submitted: 31 Jan 10:06 Modified: 31 Jan 13:52
Reporter: Kenny Johansson Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.4, 8.0.41, 8.4.4, 9.2.0 OS:Debian
Assigned to: CPU Architecture:x86

[31 Jan 10:06] Kenny Johansson
Description:
The following query returns the results in an unexpected order when the big_tables variable is ON.
It is also appears that both of the joined tables need to have a unique combined index for the bug to appear.

SELECT DISTINCT foo.id, foo.pubdate
FROM foo
INNER JOIN bar ON bar.foo_id = foo.id
WHERE pid = 1
ORDER BY foo.pubdate DESC;

Result:
+----+------------+
| id | pubdate    |
+----+------------+
|  1 | 2010-10-10 |
|  2 | 2010-10-11 |
+----+------------+

Expected result:
+----+------------+
| id | pubdate    |
+----+------------+
|  2 | 2010-10-11 |
|  1 | 2010-10-10 |
+----+------------+

How to repeat:
SET big_tables=ON;

CREATE TABLE `foo` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `pid` int unsigned NOT NULL,
  `pubdate` date DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_pubdate` (`pid`,`pubdate`)
) ENGINE=InnoDB;

CREATE TABLE `bar` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `foo_id` int unsigned NOT NULL,
  `version` int NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_version` (`foo_id`,`version`)
) ENGINE=InnoDB;

INSERT INTO foo (id, pid, pubdate) values (1, 1, '2010-10-10');
INSERT INTO foo (id, pid, pubdate) values (2, 1, '2010-10-11');
INSERT INTO bar (id, foo_id, version) VALUES (1, 1, 1);
INSERT INTO bar (id, foo_id, version) VALUES (2, 2, 2);

SELECT DISTINCT foo.id, foo.pubdate
FROM foo
INNER JOIN bar ON bar.foo_id = foo.id
WHERE pid = 1
ORDER BY foo.pubdate DESC;

Suggested fix:
Workarounds are possible, either:

* Turn OFF big_tables
* Use GROUP BY instead of DISTINCT
[31 Jan 13:52] MySQL Verification Team
Hello Kenny Johansson,

Thank you for the report and test case.

regards,
Umesh