Bug #117013 | MAX() as a Window Function Blindly Returns NULL for the First Row under Specific Conditions | ||
---|---|---|---|
Submitted: | 18 Dec 2024 14:19 | Modified: | 20 Mar 15:14 |
Reporter: | Team QueryHouse | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | Logic bug, Optimizer bug, Window functions |
[18 Dec 2024 14:19]
Team QueryHouse
[18 Dec 2024 14:50]
MySQL Verification Team
Hi Team QueryHouse, Thank you for your bug report. We get very consistent results with latest 8.0, 8.4 and 9.1. Regardless if we use MAX(c0) or MAX(c0 + 0) we always get NULL in the second column, with latest releases of all supported versions. We also consider that this is a correct result, because that is how your table and query are designed. We see no reason why should the results be otherwise, unless you come up with a very compelling explanation of why is that result wrong. Not a bug.
[19 Dec 2024 4:19]
Team QueryHouse
Thank you for your response. However, the actual result is indeed wrong, as the second column of the first row should **NOT** return `NULL`. > Regardless if we use MAX(c0) or MAX(c0 + 0) we **always get NULL** in the second column, with latest releases of all supported versions. `MAX()` returning `NULL` despite having one or more rows to evaluate is evidently a bug. The expected result for the first row (with `c0` of value `2`) is `1`, not `NULL`, since its window frame consists of one row, with the value `1`. To clarify, by using `c0 + 0` to force the correct result, we meant `MAX(c0 + 0)` for the `MAX()` function, while keeping `c0` for the `ORDER BY` clause. This ensures the issue is not reproduced and returns the correct results, as it avoids satisfying the third condition: > 3. The window frame is ordered by the same expression with the argument of max(), differing only in syntax (e.g., aliases or table references). For example, `MAX(c0 + 0) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)` returns the correct result `1`, whereas `MAX(c0 + 0) OVER (ORDER BY c0 + 0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)` does **not**, satisfying all 4 conditions we've outlined to reproduce the issue. You can verify this in other ways, e.g., by checking the number of rows within window frames for each row: ```sql mysql> SELECT c0, COUNT(*) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS cnt FROM t0; +------+-----+ | c0 | cnt | +------+-----+ | 2 | 1 | | 1 | 0 | +------+-----+ 2 rows in set (0.00 sec) ``` Or by using `MIN()` instead of `MAX()` (only in this case, since there is only one row is in the frame): ```sql mysql> SELECT c0, MIN(c0) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS min FROM t 0; +------+------+ | c0 | min | +------+------+ | 2 | 1 | | 1 | NULL | +------+------+ 2 rows in set (0.00 sec) ``` Or by using other window functions to confirm the expected behavior: ```sql mysql> SELECT c0, AVG(c0) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS avg FROM t 0; +------+--------+ | c0 | avg | +------+--------+ | 2 | 1.0000 | | 1 | NULL | +------+--------+ 2 rows in set (0.01 sec) mysql> SELECT c0, SUM(c0) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS sum FROM t 0; +------+------+ | c0 | sum | +------+------+ | 2 | 1 | | 1 | NULL | +------+------+ 2 rows in set (0.00 sec) mysql> SELECT c0, FIRST_VALUE(c0) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS firstvalue FROM t0; +------+------------+ | c0 | firstvalue | +------+------------+ | 2 | 1 | | 1 | NULL | +------+------------+ 2 rows in set (0.00 sec) mysql> SELECT c0, LAST_VALUE(c0) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS lastvalue FROM t0; +------+-----------+ | c0 | lastvalue | +------+-----------+ | 2 | 1 | | 1 | NULL | +------+-----------+ 2 rows in set (0.00 sec) mysql> SELECT c0, NTH_VALUE(c0,1) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS nth FROM t0; +------+------+ | c0 | nth | +------+------+ | 2 | 1 | | 1 | NULL | +------+------+ 2 rows in set (0.00 sec) ``` Please investigate this issue. We look forward to your response. Best regards,
[10 Jan 11:13]
MySQL Verification Team
Hi Team QueryHouse, We have re-run your query and it is truly a bug. Verified as reported. Thank you for your contribution.
[20 Mar 15:14]
Jon Stephens
Documented fix as follows in the MySQL 9.3.0 changelog: When using MAX() as a window function, it returned NULL for the first row within the window frame, despite data existing in the first row. This happened when the start of the window frame was defined using N FOLLOWING, and the frame was ordered by the same expression as the argument of MAX(), possibly differing only in syntax such as aliases or table references, in descending order. A sequence of statements demonstrating the issue is shown here: CREATE TABLE t0 (c0 INT); INSERT INTO t0 VALUES (1), (2); SELECT c0, MAX(c0) OVER (ORDER BY c0 DESC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) FROM t0; We fix this by making sure that the first row number in the frame is set in the appropriate place in the program logic. Closed.