Bug #103873 Wrong result when order by on a window function which not exist in select fields
Submitted: 1 Jun 2021 10:42 Modified: 8 Jun 2021 9:47
Reporter: zetao wei Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: DML Severity:S1 (Critical)
Version:8.0.24 OS:CentOS
Assigned to: CPU Architecture:Any

[1 Jun 2021 10:42] zetao wei
Description:
Producer wrong results when order by on a window function which not exist in select fields.

mysql> SELECT 1+SUM(t1.i) OVER W FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY SUM(t1.i) OVER W;
+--------------------+
| 1+SUM(t1.i) OVER W |
+--------------------+
|                  2 |
|                  4 |
|                 12 |
|                 32 |
|                 64 |
|                 64 |
|                124 |
|                 92 |
|                 92 |
|                214 |
+--------------------+
10 rows in set (0.00 sec)

mysql> explain SELECT SUM(t1.i) OVER W FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY SUM(t1.i) OVER W;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                      |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    8 |   100.00 | Using temporary; Using filesort            |
|  1 | SIMPLE      | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    8 |    12.50 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
2 rows in set, 2 warnings (0.01 sec)

mysql> explain format=tree SELECT SUM(t1.i) OVER W FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY SUM(t1.i) OVER W;
+--------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
+----------------------------------------------+
| -> Sort: <temporary>.tmp_field_2
    -> Table scan on <temporary>
        -> Temporary table
            -> Window aggregate with buffering
                -> Sort: <temporary>.j, <temporary>.i
                    -> Stream results
                        -> Inner hash join (t2.a = t1.i)  (cost=7.70 rows=8)
                            -> Table scan on t2  (cost=0.04 rows=8)
                            -> Hash
                                -> Table scan on t1  (cost=1.05 rows=8)
+----------------------------------------------+

I think the cause of this problem is the item of `SUM(t1.i) OVER W` which in order by does not move SUM items out from item tree and replace with reference. 

How to repeat:
CREATE TABLE t1(i INT, j INT, k INT);
INSERT INTO t1 VALUES (1,1,1),(3,3,3);
INSERT INTO t1 SELECT 10*i,j,5*j FROM t1 UNION SELECT 20*i,j,5*j FROM t1
UNION SELECT 30*i,j,5*j FROM t1;
CREATE TABLE t2 (a INT, B INT);
insert into t2 select i,j from t1;
ANALYZE TABLE t1,t2;

SELECT 1+SUM(t1.i) OVER W FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY SUM(t1.i) OVER W;
[2 Jun 2021 12:38] MySQL Verification Team
Hi Mr. wei,

Thank you for your bug report.

However, in order to further process your report, could you try writing your query, exactly as our Reference Manual specifies.

You should use parenthesis in all the right places, so the last ORDER would refer to the entire query and that it does not get ignored.

After you try that, and get the same results, would you be so kind as to let us know what would have been the right results.
[2 Jun 2021 13:02] zetao wei
The result is not sorted correctly, `124` was wrongly ordered before `92`.

mysql> SELECT 1+SUM(t1.i) OVER W FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY SUM(t1.i) OVER W;
+--------------------+
| 1+SUM(t1.i) OVER W |
+--------------------+
|                  2 |
|                  4 |
|                 12 |
|                 32 |
|                 64 |
|                 64 |
|                124 |
|                 92 |
|                 92 |
|                214 |
+--------------------+

I think the right results like this:
+--------------------+
| 1+SUM(t1.i) OVER W |
+--------------------+
|                  2 |
|                  4 |
|                 12 |
|                 32 |
|                 64 |
|                 64 |
|                 92 |
|                 92 |
|                124 |
|                214 |
+--------------------+
[3 Jun 2021 12:51] MySQL Verification Team
Hi Mr. wei,

We have tried and rewritten your query, according to our Manual, and it worked correctly.

Simply, your query is not written in a manner that would produce the desired result.

Not a bug.
[4 Jun 2021 3:58] zetao wei
Hi, thanks for your reply.

Manual: https://dev.mysql.com/doc/refman/8.0/en/window-functions-usage.html

I read the manual again, still don't understand what's wrong with my query, could you show me the correct query, thanks very much.

```
Window functions are permitted only in the select list and ORDER BY clause.
```

Order by on an alias of the window function, the result is correct.
SELECT SUM(t1.i) OVER W as sum_i FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY sum_i;

Order by on an expression of window function, the result is incorrect.
SELECT SUM(t1.i) OVER W FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY SUM(t1.i) OVER W;

I found some patterns that the aggregate window function in ORDER BY cause but it not in select list, the result may be incorrect.

mysql> SELECT row_number() over w FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY SUM(t1.i) OVER W;
+---------------------+
| row_number() over w |
+---------------------+
|                   1 |
|                   1 |
|                   2 |
|                   3 |
|                   2 |
|                   3 |
|                   4 |
|                   4 |
|                   5 |  <--- wrong
|                   5 |
+---------------------+

mysql> SELECT row_number() over w, sum(t1.i) over w as sumi FROM t1 join t2 on t1.i=t2.a  WINDOW w AS (PARTITION BY t1.j ORDER BY t1.i)   ORDER BY sumi;
+---------------------+------+
| row_number() over w | sumi |
+---------------------+------+
|                   1 |    1 |
|                   1 |    3 |
|                   2 |   11 |
|                   3 |   31 |
|                   2 |   63 |
|                   3 |   63 |
|                   4 |   91 |
|                   5 |   91 |
|                   4 |  123 |  <---- right
|                   5 |  213 |
+---------------------+------+

Suggest fix:
diff --git a/sql/sql_resolver.cc b/sql/sql_resolver.cc
index b7f91bbee46..b743b559508 100644
--- a/sql/sql_resolver.cc
+++ b/sql/sql_resolver.cc
@@ -4820,7 +4820,7 @@ bool SELECT_LEX::setup_order_final(THD *thd) {
     if (is_grouped_aggregate) continue;

     if (item->has_aggregation() ||
-        (!item->m_is_window_function && item->has_wf())) {
+        (item->m_is_window_function ? !ord->in_field_list : item->has_wf())) {
       item->split_sum_func(thd, base_ref_items, all_fields);
       if (thd->is_error()) return true; /* purecov: inspected */
     }
[8 Jun 2021 9:42] Steinar Gunderson
Verified as reported.
[8 Jun 2021 9:47] Steinar Gunderson
Fixed by

commit f2727148ea54dbc5a6acca4d0ed7beb72d9ad56a
Author: Steinar H. Gunderson <steinar.gunderson@oracle.com>
Date:   Tue Apr 20 15:23:32 2021 +0200

    Bug #32644631: PRELIMINARY FIXES FOR WL #14419 [aggressive walk, noclose]