Bug #101691 Use substring_index and rank() in a sql cause wrong result
Submitted: 20 Nov 2020 7:25 Modified: 20 Nov 2020 7:45
Reporter: Baolin Huang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.22, 8.0.17, 8.0.18 OS:Any
Assigned to: CPU Architecture:Any
Tags: regression

[20 Nov 2020 7:25] Baolin Huang
Description:
See a sql with group by/order by/rank: `name2` returns a wrong value '15';

In addition, when test with 8.0.16, there is no such problem. 

mysql> select
    -> group_concat(name order by value2) as name1,
    -> substring_index(group_concat(name order by value2),',',1) as name2,
    -> rank() over (order by sum(value1) desc, max(value2) asc ) as `rank`
    -> from tt2  group by id1,id2 order by sum(value1) desc, max(value2) asc;
+-------+-------+------+
| name1 | name2 | rank |
+-------+-------+------+
| 24,14 | 15    |    1 |
| 23,13 | 15    |    2 |
| 22,12 | 15    |    3 |
| 21,11 | 15    |    4 |
| 10,20 | 15    |    5 |
| 19    | 15    |    6 |
| 18    | 15    |    7 |
| 17    | 15    |    8 |
| 16    | 15    |    9 |
| 15    | 15    |   10 |
+-------+-------+------+

How to repeat:
```
create table tt2 (id1 int, id2 int, value1 int, value2 int, name varchar(32));
insert into tt2 values
(30,00, 10, 10, '10'),
(60,30, 13, 97, '11'),
(90,60, 16, 94, '12'),
(20,90, 19, 91, '13'),
(50,20, 22, 88, '14'),
(80,50, 25, 85, '15'),
(10,80, 28, 82, '16'),
(40,10, 31, 79, '17'),
(70,40, 34, 76, '18'),
(00,70, 37, 73, '19'),
(30,00, 40, 70, '20'),
(60,30, 43, 67, '21'),
(90,60, 46, 64, '22'),
(20,90, 49, 61, '23'),
(50,20, 52, 58, '24');

select id1,id2, sum(value1), max(value2), substring_index(group_concat(name order by value2),',',1), rank() over (order by sum(value1) desc, max(value2) asc ) as `rank` from tt2  group by id1,id2 order by sum(value1) desc, max(value2) asc;

```

Suggested fix:
Is this a bug?
Or whether there is any rule when using rank() function.
[20 Nov 2020 7:45] MySQL Verification Team
Hello Baolin Huang,

Thank you for the report and feedback.

regards,
Umesh
[5 Nov 2021 13:48] Steinar Gunderson
Posted by developer:
 
Fixed by

commit ba373ed2e3b5d9b0465e1d1e6240db84c2e237fe
Author: Dag Wanvik <dag.wanvik@oracle.com>
Date:   Tue Sep 28 13:21:53 2021 +0200

    Bug#33399696 window function returns wrong result
    
    This is a regression introduced in 8.0.17 (by 88539ef3e67, part of WL#12788:
    Iterator executor for window functions).
    
    In the WindowIterator::Read() function, resetting of active slice is
    not done when we reach the end of the input result set for windowing.
    
    This causes reading wrong values when we reach the sorting of ORDER
    BY: active slice is still set to slice 1 which points to the items
    being read from the input table, whereas the ORDER BY sorting stage
    needs to read the values after computation of the window
    function(s). For this, it needs the active slice to be that of the
    last window's output table, slice 4 in this case.
    
    Fix: move the resetting of the slice to the output slice immediately
    after read, so that it is set correctly also when we return due to no
    more rows seen in the input set and move on to the ordering.
    
    Patch contributed by casazhang@tencent.com
    
    Change-Id: I7393075ebdbb37fb1966eeff72223caf38316457