Bug #90599 Window function with RAND() is not random with GROUP BY
Submitted: 24 Apr 2018 9:41 Modified: 9 Nov 2018 12:01
Reporter: Daniël van Eeden (OCA) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.11 OS:Any
Assigned to: CPU Architecture:Any

[24 Apr 2018 9:41] Daniël van Eeden
Description:
Using "partition by <column> order by rand()" as window works.
But with a GROUP BY the value is not random anymore.
Putting it in a subquery "fixes" it.

How to repeat:
CREATE TABLE t1 (c1 int not null, c2 int not null, primary key (c1, c2));
INSERT INTO t1 VALUES (1,10),(1,20),(1,30),(1,40),(1,50),(2,10),(2,20),(2,30),(2,40),(2,50);

-- Shows different value for random_c2 when ran multiple times
SELECT c1, first_value(c2) over w 'random_c2' from t1 window w as (partition by c1 order by rand());

set session sql_mode='';

-- Shows the same row when ran multiple times. 
SELECT c1, first_value(c2) over w 'random_c2' from t1 group by c1 window w as (partition by c1 order by rand());

-- This does show a random value for random_c2
SELECT * FROM (SELECT c1, first_value(c2) over w 'random_c2' from t1 window w as (partition by c1 order by rand())) a GROUP BY c1;
[24 Apr 2018 9:55] MySQL Verification Team
Hello Daniël,

Thank you for the report.

Thanks,
Umesh
[1 Nov 2018 13:33] Dag Wanvik
I think this is not a bug. The query which the reporter objects to is:

SELECT c1, first_value(c2) over w 'random_c2' from t1 group by c1 window w as
(partition by c1 order by rand());

Since we group by c1, there will be only one line per partition for
the window w as defined, since the partition key is c1.  That means
there is only one line to sort for the window specification "ORDER BY
RAND()".

However, in the group by itself, the column c2 is not functionally
dependent on the grouping column (the query is rejected with
SQL_MODE=ONLY_FULL_GROUP_BY), so the value of c2 will be
non-deterministic. Grouping may use sorting for its implementation,
but that is not the sorting(order by) we have specified rand() for.