Description:
Inconsistent window aggregate results with GROUP BY ... WITH ROLLUP:
SUM(<execution-time-constant expression>) OVER() does not match the row
multiplicity implied by COUNT(*) OVER() in the same SELECT for some grouping
forms (e.g. GROUP BY UTC_DATE() or GROUP BY literal 1), while other grouping
forms (e.g. GROUP BY a table column) yield self-consistent SUM(1) and COUNT(*).
How to repeat:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (f1 INT NOT NULL);
INSERT INTO t1 VALUES (1), (2);
SET timestamp = UNIX_TIMESTAMP('2009-04-16 12:46:00');
-- Query A
SELECT UTC_DATE() AS g,
SUM(UNIX_TIMESTAMP(UTC_DATE())) OVER () AS sum_ts,
COUNT(*) OVER () AS cnt
FROM t1
GROUP BY UTC_DATE() WITH ROLLUP;
-- Query B
SELECT 1 AS g,
SUM(UNIX_TIMESTAMP(UTC_DATE())) OVER () AS sum_ts,
COUNT(*) OVER () AS cnt
FROM t1
GROUP BY 1 WITH ROLLUP;
SET timestamp = DEFAULT;
-- Query C
SELECT 1 AS g,
SUM(1) OVER () AS sum_one,
COUNT(*) OVER () AS cnt
FROM t1
GROUP BY 1 WITH ROLLUP;
-- Query D
SELECT f1 AS g,
SUM(1) OVER () AS sum_one,
COUNT(*) OVER () AS cnt
FROM t1
GROUP BY f1 WITH ROLLUP;
ACTUAL RESULTS (as executed on 9.6.0 with the sql_mode above):
Query A — GROUP BY UTC_DATE() WITH ROLLUP:
+------------+------------+-----+
| g | sum_ts | cnt |
+------------+------------+-----+
| 2009-04-16 | 1239840000 | 2 |
| NULL | 1239840000 | 2 |
+------------+------------+-----+
Query B — GROUP BY 1 WITH ROLLUP (same window expression as A, only GROUP BY changed):
+------+------------+-----+
| g | sum_ts | cnt |
+------+------------+-----+
| 1 | 2479680000 | 2 |
| NULL | 2479680000 | 2 |
+------+------------+-----+
Note: 2479680000 = 2 * 1239840000. So Query B matches "sum once per row in
the window" for two rows, while Query A leaves sum_ts at a single-row value
though cnt is 2.
Query C — GROUP BY 1 WITH ROLLUP:
+------+---------+-----+
| g | sum_one | cnt |
+------+---------+-----+
| 1 | 1 | 2 |
| NULL | 1 | 2 |
+------+---------+-----+
Query D — GROUP BY f1 WITH ROLLUP:
+------+---------+-----+
| g | sum_one | cnt |
+------+---------+-----+
| 1 | 3 | 3 |
| 2 | 3 | 3 |
| NULL | 3 | 3 |
+------+---------+-----+
EXPECTED BEHAVIOR:
For a given window partition, SUM of a per-row constant should equal that
constant times the number of rows in the partition, and should be consistent
with COUNT(*) OVER() in the same way as in Query B and Query D. In particular,
for Query A, since cnt is 2 on each result row, sum_ts should be 2479680000
(2 * UNIX_TIMESTAMP(UTC_DATE()) for the fixed session timestamp), not 1239840000.
Similarly, for Query C, sum_one should be 2 when cnt is 2, not 1.
Description: Inconsistent window aggregate results with GROUP BY ... WITH ROLLUP: SUM(<execution-time-constant expression>) OVER() does not match the row multiplicity implied by COUNT(*) OVER() in the same SELECT for some grouping forms (e.g. GROUP BY UTC_DATE() or GROUP BY literal 1), while other grouping forms (e.g. GROUP BY a table column) yield self-consistent SUM(1) and COUNT(*). How to repeat: DROP TABLE IF EXISTS t1; CREATE TABLE t1 (f1 INT NOT NULL); INSERT INTO t1 VALUES (1), (2); SET timestamp = UNIX_TIMESTAMP('2009-04-16 12:46:00'); -- Query A SELECT UTC_DATE() AS g, SUM(UNIX_TIMESTAMP(UTC_DATE())) OVER () AS sum_ts, COUNT(*) OVER () AS cnt FROM t1 GROUP BY UTC_DATE() WITH ROLLUP; -- Query B SELECT 1 AS g, SUM(UNIX_TIMESTAMP(UTC_DATE())) OVER () AS sum_ts, COUNT(*) OVER () AS cnt FROM t1 GROUP BY 1 WITH ROLLUP; SET timestamp = DEFAULT; -- Query C SELECT 1 AS g, SUM(1) OVER () AS sum_one, COUNT(*) OVER () AS cnt FROM t1 GROUP BY 1 WITH ROLLUP; -- Query D SELECT f1 AS g, SUM(1) OVER () AS sum_one, COUNT(*) OVER () AS cnt FROM t1 GROUP BY f1 WITH ROLLUP; ACTUAL RESULTS (as executed on 9.6.0 with the sql_mode above): Query A — GROUP BY UTC_DATE() WITH ROLLUP: +------------+------------+-----+ | g | sum_ts | cnt | +------------+------------+-----+ | 2009-04-16 | 1239840000 | 2 | | NULL | 1239840000 | 2 | +------------+------------+-----+ Query B — GROUP BY 1 WITH ROLLUP (same window expression as A, only GROUP BY changed): +------+------------+-----+ | g | sum_ts | cnt | +------+------------+-----+ | 1 | 2479680000 | 2 | | NULL | 2479680000 | 2 | +------+------------+-----+ Note: 2479680000 = 2 * 1239840000. So Query B matches "sum once per row in the window" for two rows, while Query A leaves sum_ts at a single-row value though cnt is 2. Query C — GROUP BY 1 WITH ROLLUP: +------+---------+-----+ | g | sum_one | cnt | +------+---------+-----+ | 1 | 1 | 2 | | NULL | 1 | 2 | +------+---------+-----+ Query D — GROUP BY f1 WITH ROLLUP: +------+---------+-----+ | g | sum_one | cnt | +------+---------+-----+ | 1 | 3 | 3 | | 2 | 3 | 3 | | NULL | 3 | 3 | +------+---------+-----+ EXPECTED BEHAVIOR: For a given window partition, SUM of a per-row constant should equal that constant times the number of rows in the partition, and should be consistent with COUNT(*) OVER() in the same way as in Query B and Query D. In particular, for Query A, since cnt is 2 on each result row, sum_ts should be 2479680000 (2 * UNIX_TIMESTAMP(UTC_DATE()) for the fixed session timestamp), not 1239840000. Similarly, for Query C, sum_one should be 2 when cnt is 2, not 1.