Bug #111060 Subquery ORDER BY is ignored
Submitted: 17 May 2023 15:57 Modified: 24 May 2023 19:08
Reporter: Steve Spencer Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[17 May 2023 15:57] Steve Spencer
Description:
I have 3 levels of subquery.

Outer selects 2 columns
Next tier is ordering by 2 other columns
Innermost is an opaque complex subquery.

The ordering is lost. Executing the next tier standalone returns the correct order.

Regression noticed on a GCE cloud VM on 10 May, and thus it is expected to arise from automatic installation of 8.0.33 sometime after it was released a couple of weeks before.

Outer results actual:
+--------+------------+
| c1r__a | c1s__b__5_ |
+--------+------------+
| NULL   |       NULL |
| Cat1   |        0.0 |
| Cat2   |        0.0 |
| Cat2   |        5.0 |
| Cat1   |        5.0 |
+--------+------------+

Outer results expected:
+--------+------------+
| c1r__a | c1s__b__5_ |
+--------+------------+
| Cat1   |        0.0 |
| Cat1   |        5.0 |
| Cat2   |        0.0 |
| Cat2   |        5.0 |
| NULL   |       NULL |
+--------+------------+

Next tier results (if executed standalone, showing expected ordering):
+--------+------------+------------------------------+------------------------------+
| c1m__a | c1n__b__5_ | c1o___if__0_axis_0_sort_0_1_ | c1p___if__1_axis_0_sort_1_2_ |
+--------+------------+------------------------------+------------------------------+
| Cat1   |        0.0 |                            5 |                            3 |
| Cat1   |        5.0 |                            5 |                            2 |
| Cat2   |        0.0 |                            4 |                            2 |
| Cat2   |        5.0 |                            4 |                            2 |
| NULL   |       NULL |                            1 |                            1 |
+--------+------------+------------------------------+------------------------------+

How to repeat:
CREATE TABLE `somebug` (`A` longtext, `B` bigint DEFAULT NULL, `C` double DEFAULT NULL, `D` datetime DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO `somebug` VALUES (NULL,NULL,NULL,NULL),('Cat1',0,0.5,'2013-06-10 11:10:10'),('Cat2',1,1.5,'2013-06-11 12:11:11'),('Cat1',2,2.5,'2013-06-12 13:12:12'),('Cat2',3,3.5,'2013-06-13 14:13:13'),('Cat1',4,4.5,'2013-06-14 15:14:14'),('Cat2',5,5.5,'2013-06-15 16:15:15'),('Cat1',6,6.5,'2013-06-16 17:16:16'),('Cat2',7,7.5,'2013-06-17 18:17:17'),('Cat1',8,8.5,'2013-06-18 19:18:18');

Here is the full query:
select `c1m__a` as `c1r__a`, `c1n__b__5_` as `c1s__b__5_` from (
    select `c1g__a` as `c1m__a`, `c1h__b__5_` as `c1n__b__5_`, `c1i_axis_0_sort_0_1` as `c1o___if__0_axis_0_sort_0_1_`, `c1j_axis_0_sort_1_2` as `c1p___if__1_axis_0_sort_1_2_` from (
        select `t1k`.`c1b__a` as `c1g__a`, `t1k`.`c1c__b__5_` as `c1h__b__5_`, `t1k`.`c1d_axis_0_sort_0_1` as `c1i_axis_0_sort_0_1`, `t1l`.`c15_axis_0_sort_1_2` as `c1j_axis_0_sort_1_2` from (select `t1e`.`c19__a` as `c1b__a`, `t1e`.`c1a__b__5_` as `c1c__b__5_`, `t1f`.`cz_axis_0_sort_0_1` as `c1d_axis_0_sort_0_1` from (select `t18`.`c16_a` as `c19__a`, `t18`.`c17_h__b__5_` as `c1a__b__5_` from (select `A` as `c16_a`, (floor((`B` / 5.0)) * 5.0) as `c17_h__b__5_` from `somebug` as `inneralias`) as `t18` group by `t18`.`c16_a`, `t18`.`c17_h__b__5_`) as `t1e` left outer join (select `tx`.`A` as `cy__a`, count(*) as `cz_axis_0_sort_0_1` from (select `inneralias`.`A`, `inneralias`.`B`, `inneralias`.`C`, `inneralias`.`D` from `unitTestDontDelete` as `inneralias`) as `tx` group by `tx`.`A`) as `t1f` on (`t1e`.`c19__a` = `t1f`.`cy__a` or (`t1e`.`c19__a` is null and `t1f`.`cy__a` is null))) as `t1k` left outer join (select `t12`.`c10_a` as `c13__a`, `t12`.`c11_h__b__5_` as `c14__b__5_`, count(*) as `c15_axis_0_sort_1_2` from (select `A` as `c10_a`, (floor((`B` / 5.0)) * 5.0) as `c11_h__b__5_` from `unitTestDontDelete` as `inneralias`) as `t12` group by `t12`.`c10_a`, `t12`.`c11_h__b__5_`) as `t1l` on ((`t1k`.`c1b__a` = `t1l`.`c13__a` or (`t1k`.`c1b__a` is null and `t1l`.`c13__a` is null)) and (`t1k`.`c1c__b__5_` = `t1l`.`c14__b__5_` or (`t1k`.`c1c__b__5_` is null and `t1l`.`c14__b__5_` is null)))    
    ) as `t1q` order by `c1o___if__0_axis_0_sort_0_1_` desc, `c1m__a` asc, `c1p___if__1_axis_0_sort_1_2_` desc, `c1n__b__5_` asc
) as `t1t`;

It breaks down as...

Outer query (remove this to see the correct ordering):

select `c1m__a` as `c1r__a`, `c1n__b__5_` as `c1s__b__5_` from (
    ...
) as `t1t`

Next tier (has the ordering that is being lost):
select `c1g__a` as `c1m__a`, `c1h__b__5_` as `c1n__b__5_`, `c1i_axis_0_sort_0_1` as `c1o___if__0_axis_0_sort_0_1_`, `c1j_axis_0_sort_1_2` as `c1p___if__1_axis_0_sort_1_2_` from (
    ...
) as `t1q` order by `c1o___if__0_axis_0_sort_0_1_` desc, `c1m__a` asc, `c1p___if__1_axis_0_sort_1_2_` desc, `c1n__b__5_` asc

Inner complex subquery (I am considering this as opaque, but it is playing a part):

select `t1k`.`c1b__a` as `c1g__a`, `t1k`.`c1c__b__5_` as `c1h__b__5_`, `t1k`.`c1d_axis_0_sort_0_1` as `c1i_axis_0_sort_0_1`, `t1l`.`c15_axis_0_sort_1_2` as `c1j_axis_0_sort_1_2` from (select `t1e`.`c19__a` as `c1b__a`, `t1e`.`c1a__b__5_` as `c1c__b__5_`, `t1f`.`cz_axis_0_sort_0_1` as `c1d_axis_0_sort_0_1` from (select `t18`.`c16_a` as `c19__a`, `t18`.`c17_h__b__5_` as `c1a__b__5_` from (select `A` as `c16_a`, (floor((`B` / 5.0)) * 5.0) as `c17_h__b__5_` from `somebug` as `inneralias`) as `t18` group by `t18`.`c16_a`, `t18`.`c17_h__b__5_`) as `t1e` left outer join (select `tx`.`A` as `cy__a`, count(*) as `cz_axis_0_sort_0_1` from (select `inneralias`.`A`, `inneralias`.`B`, `inneralias`.`C`, `inneralias`.`D` from `unitTestDontDelete` as `inneralias`) as `tx` group by `tx`.`A`) as `t1f` on (`t1e`.`c19__a` = `t1f`.`cy__a` or (`t1e`.`c19__a` is null and `t1f`.`cy__a` is null))) as `t1k` left outer join (select `t12`.`c10_a` as `c13__a`, `t12`.`c11_h__b__5_` as `c14__b__5_`, count(*) as `c15_axis_0_sort_1_2` from (select `A` as `c10_a`, (floor((`B` / 5.0)) * 5.0) as `c11_h__b__5_` from `unitTestDontDelete` as `inneralias`) as `t12` group by `t12`.`c10_a`, `t12`.`c11_h__b__5_`) as `t1l` on ((`t1k`.`c1b__a` = `t1l`.`c13__a` or (`t1k`.`c1b__a` is null and `t1l`.`c13__a` is null)) and (`t1k`.`c1c__b__5_` = `t1l`.`c14__b__5_` or (`t1k`.`c1c__b__5_` is null and `t1l`.`c14__b__5_` is null)))    

Suggested fix:
I don't have a precise workaround, but messing with the innermost query can make it behave correctly.
[18 May 2023 12:16] MySQL Verification Team
Hi Mr. Spencer,

Thank you for your bug report.

However, we can not repeat your test case.

First of all, there is a table named "unitTestDontDelete", which we do not know how to make.

Second, sorting of the results of the nested query makes sense at all. Unless it is accompanied by LIMIT and then only in some versions of the nested query. Hence, explain to us why should we sort nested queries and how would that change the result ???

Can't repeat.
[19 May 2023 8:00] Steve Spencer
Apologies, I made a mistake when I was preparing the report.

I've simplified it further and fixed the reference to another table. Here's the full updated test case.

CREATE TABLE `somebug` (`A` longtext, `B` bigint DEFAULT NULL, `C` double DEFAULT NULL, `D` datetime DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO `somebug` VALUES (NULL,NULL,NULL,NULL),('Cat1',0,0.5,'2013-06-10 11:10:10'),('Cat2',1,1.5,'2013-06-11 12:11:11'),('Cat1',2,2.5,'2013-06-12 13:12:12'),('Cat2',3,3.5,'2013-06-13 14:13:13'),('Cat1',4,4.5,'2013-06-14 15:14:14'),('Cat2',5,5.5,'2013-06-15 16:15:15'),('Cat1',6,6.5,'2013-06-16 17:16:16'),('Cat2',7,7.5,'2013-06-17 18:17:17'),('Cat1',8,8.5,'2013-06-18 19:18:18');

SELECT nextcol1 AS outercol1, nextcol2 AS outercol2 FROM (
    SELECT c1ga AS nextcol1, c1hb5 AS nextcol2, c1iaxis0sort01 AS nextcol3, c1jaxis0sort12 AS nextcol4 FROM (
        SELECT p.c1ba AS c1ga, p.c1cb5 AS c1hb5, p.c1daxis0sort01 AS c1iaxis0sort01, t1l.c15axis0sort12 AS c1jaxis0sort12 FROM (SELECT t1e.c19a AS c1ba, t1e.c1ab5 AS c1cb5, t1f.czaxis0sort01 AS c1daxis0sort01 FROM (SELECT t18.c16a AS c19a, t18.c17hb5 AS c1ab5 FROM (SELECT A AS c16a, (FLOOR((B / 5.0)) * 5.0) AS c17hb5 FROM somebug AS z) AS t18 GROUP BY t18.c16a, t18.c17hb5) AS t1e LEFT OUTER JOIN (SELECT tx.A AS cya, count(*) AS czaxis0sort01 FROM (SELECT z.A, z.B, z.C, z.D FROM somebug AS z) AS tx GROUP BY tx.A) AS t1f on (t1e.c19a = t1f.cya OR (t1e.c19a IS NULL AND t1f.cya IS NULL))) AS p LEFT OUTER JOIN (SELECT t12.c10a AS c13a, t12.c11hb5 AS c14b5, count(*) AS c15axis0sort12 FROM (SELECT A AS c10a, (FLOOR((B / 5.0)) * 5.0) AS c11hb5 FROM somebug AS z) AS t12 GROUP BY t12.c10a, t12.c11hb5) AS t1l on ((p.c1ba = t1l.c13a OR (p.c1ba IS NULL AND t1l.c13a IS NULL)) AND (p.c1cb5 = t1l.c14b5 OR (p.c1cb5 IS NULL AND t1l.c14b5 IS NULL)))    
    ) AS next ORDER BY nextcol3 desc, nextcol1 asc, nextcol4 desc, nextcol2 asc
) AS outermost;

Actual results:
+-----------+-----------+
| outercol1 | outercol2 |
+-----------+-----------+
| NULL      |      NULL |
| Cat1      |       0.0 |
| Cat2      |       0.0 |
| Cat2      |       5.0 |
| Cat1      |       5.0 |
+-----------+-----------+

Expected results:
+--------+------------+
| outercol1 | outercol2 |
+--------+------------+
| Cat1   |        0.0 |
| Cat1   |        5.0 |
| Cat2   |        0.0 |
| Cat2   |        5.0 |
| NULL   |       NULL |
+--------+------------+

Since the update to 8.0.33 the results changed from expected results to actual. Earlier versions of Mysql, and other database vendors, produce the expected results.

Strip the outer query (which is there only to drop 2 columns), and you get the correct order:

SELECT c1ga AS nextcol1, c1hb5 AS nextcol2, c1iaxis0sort01 AS nextcol3, c1jaxis0sort12 AS nextcol4 FROM (
    SELECT p.c1ba AS c1ga, p.c1cb5 AS c1hb5, p.c1daxis0sort01 AS c1iaxis0sort01, t1l.c15axis0sort12 AS c1jaxis0sort12 FROM (SELECT t1e.c19a AS c1ba, t1e.c1ab5 AS c1cb5, t1f.czaxis0sort01 AS c1daxis0sort01 FROM (SELECT t18.c16a AS c19a, t18.c17hb5 AS c1ab5 FROM (SELECT A AS c16a, (FLOOR((B / 5.0)) * 5.0) AS c17hb5 FROM somebug AS z) AS t18 GROUP BY t18.c16a, t18.c17hb5) AS t1e LEFT OUTER JOIN (SELECT tx.A AS cya, count(*) AS czaxis0sort01 FROM (SELECT z.A, z.B, z.C, z.D FROM somebug AS z) AS tx GROUP BY tx.A) AS t1f on (t1e.c19a = t1f.cya OR (t1e.c19a IS NULL AND t1f.cya IS NULL))) AS p LEFT OUTER JOIN (SELECT t12.c10a AS c13a, t12.c11hb5 AS c14b5, count(*) AS c15axis0sort12 FROM (SELECT A AS c10a, (FLOOR((B / 5.0)) * 5.0) AS c11hb5 FROM somebug AS z) AS t12 GROUP BY t12.c10a, t12.c11hb5) AS t1l on ((p.c1ba = t1l.c13a OR (p.c1ba IS NULL AND t1l.c13a IS NULL)) AND (p.c1cb5 = t1l.c14b5 OR (p.c1cb5 IS NULL AND t1l.c14b5 IS NULL)))    
) AS next ORDER BY nextcol3 desc, nextcol1 asc, nextcol4 desc, nextcol2 asc;

+----------+----------+----------+----------+
| nextcol1 | nextcol2 | nextcol3 | nextcol4 |
+----------+----------+----------+----------+
| Cat1     |      0.0 |        5 |        3 |
| Cat1     |      5.0 |        5 |        2 |
| Cat2     |      0.0 |        4 |        2 |
| Cat2     |      5.0 |        4 |        2 |
| NULL     |     NULL |        1 |        1 |
+----------+----------+----------+----------+

---

I don't understand your comment about sorting the results making no sense. The entire query is based on an auto generated one that builds up a query from another query language representing a chain of transformations. 

The penultimate step (the 2nd tier query) is to order the rows. The last step (the outermost query) is to a schema manipulation, to drop and rename columns, and I do not expect the order to be affected by this. Earlier steps (the 3rd and deeper queries) are an opaque query that produces an intermediate table which, for the purpose of validating this bug, does not need to be looked at (although clearly it has an influence on the behaviour). The SQL could indeed be simplified, but I am reporting a regression that came up in our own app tests where the only change was the mysql version.
[19 May 2023 12:39] MySQL Verification Team
Hi Mr. Spencer,

With a small rearrangement of the columns, we were able to repeat the behaviour that you reported.

These are our results:

+----------+-----------+----------+-----------+
| nextcol3 | outercol1 | nextcol4 | outercol2 |
+----------+-----------+----------+-----------+
|        1 | NULL      |        1 |      NULL |
|        5 | Cat1      |        3 |       0.0 |
|        4 | Cat2      |        2 |       0.0 |
|        4 | Cat2      |        2 |       5.0 |
|        5 | Cat1      |        2 |       5.0 |
+----------+-----------+----------+-----------+
+----------+----------+----------+----------+
| nextcol1 | nextcol2 | nextcol3 | nextcol4 |
+----------+----------+----------+----------+
| Cat1     |      0.0 |        5 |        3 |
| Cat1     |      5.0 |        5 |        2 |
| Cat2     |      0.0 |        4 |        2 |
| Cat2     |      5.0 |        4 |        2 |
| NULL     |     NULL |        1 |        1 |
+----------+----------+----------+----------+

The bug exists only in 8.0.33.

Verified as reported.
[23 May 2023 8:06] Chaithra Marsur Gopala Reddy
Posted by developer:
 
As a workaround you can set derived merge to off like this:

set optimizer_switch="derived_merge=off";
[25 May 2023 11:55] MySQL Verification Team
Thank you, Jon.
[8 Jun 2023 0:11] Jon Stephens
Documented fix as follows in the MySQL 8.0.34 and 8.2.0 changelogs:

    A fix in MySQL 8.0.33 made a change for ORDER BY items already
    resolved so as not to resolve them again (as is usually the case
    when a derived table is merged), but this did not handle the
    case in which an ORDER BY item was itself a reference.

Closed.