Bug #87760 DENSE_RANK wrong with buffered processing and partitions
Submitted: 14 Sep 2017 16:14 Modified: 1 Dec 2017 14:22
Reporter: Dag Wanvik Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.3 OS:Any
Assigned to: CPU Architecture:Any

[14 Sep 2017 16:14] Dag Wanvik
Description:
Results are one too high:

mysql> SELECT department_id, salary,
    ->        RANK() OVER w AS rnk,
    ->        DENSE_RANK() OVER w AS dense,
    ->        NTILE(4) over w AS ntil,
    ->        CUME_DIST() OVER w AS cume,
    ->        PERCENT_RANK() over w AS `%rnk`
    ->   FROM employee
    ->   WINDOW w AS (PARTITION BY department_id
    ->                ORDER BY salary DESC)
    ->   ORDER BY department_id, salary DESC;
+---------------+--------+-----+-------+------+--------------------+------+
| department_id | salary | rnk | dense | ntil | cume               | %rnk |
+---------------+--------+-----+-------+------+--------------------+------+
|          NULL |  75000 |   1 |     2 |    1 |                  1 |    0 |
|            10 | 100000 |   1 |     2 |    1 |                0.2 |    0 |
|            10 |  70000 |   2 |     3 |    1 |                0.4 | 0.25 |
|            10 |  60000 |   3 |     4 |    2 |                0.8 |  0.5 |
|            10 |  60000 |   3 |     4 |    3 |                0.8 |  0.5 |
|            10 |   NULL |   5 |     5 |    4 |                  1 |    1 |
|            20 |  80000 |   1 |     2 |    1 | 0.3333333333333333 |    0 |
|            20 |  65000 |   2 |     3 |    2 |                  1 |  0.5 |
|            20 |  65000 |   2 |     3 |    3 |                  1 |  0.5 |
|            30 | 300000 |   1 |     2 |    1 |                0.5 |    0 |
|            30 |  70000 |   2 |     3 |    2 |                  1 |    1 |
+---------------+--------+-----+-------+------+--------------------+------+

How to repeat:
CREATE TABLE employee(department_id INT, salary INT);

INSERT INTO employee VALUES (10, NULL),
                            (10, 100000),
                            (10, 60000),
                            (10, 60000),
                            (10, 70000),
                            (20, 80000),
                            (20, 65000),
                            (20, 65000),
                            (30, 300000),
                            (30, 70000),
                            (NULL, 75000);
SELECT department_id, salary,
       RANK() OVER w AS rnk,
       DENSE_RANK() OVER w AS dense,
       NTILE(4) over w AS ntil,
       CUME_DIST() OVER w AS cume,
       PERCENT_RANK() over w AS `%rnk`
  FROM employee
  WINDOW w AS (PARTITION BY department_id
               ORDER BY salary DESC)
  ORDER BY department_id, salary DESC;
[14 Sep 2017 17:06] Dag Wanvik
Posted by developer:
 
It's wrong even with the default partition; changed the title. It's only dependent on whether we use a frame buffer or not.
[1 Dec 2017 14:22] Jon Stephens
Documented fix in the MySQL 8.0.4 changelog as follows:

    DENSE_RANK() did not work correctly for the first row in a partition
    when buffering was in use, due to premature initialization of the cache
    comparator for ORDER BY.

Closed.