Description:
Using multicolumn indexes might lead to rows not being fetched in a query with GROUP BY. This occurred in versions >= 5.7.
Even though this bug is similar to https://bugs.mysql.com/bug.php?id=117261, it is likely not a duplicate because:
- This bug is triggered in versions >= 5.7, but the bug in the other issue (#117261) is triggered in versions >= 8.0.
- According to the comment in the other issue (#117261), that bug is likely due to a GROUP BY with a NULL value in the MIN field, which is not the case here.
- The bug in the other issue (#117261) fails to fetch rows with null-values only, while the current bug also fails to fetch rows without null values at all. Furthermore, the current bug fails to fetch > 1 rows.
Hence, the bugs might have different root causes and could be treated separately.
How to repeat:
CREATE TABLE IF NOT EXISTS t0(c0 INT, c1 INT);
INSERT INTO t0(c0, c1) VALUES(1, 1);
INSERT INTO t0(c0, c1) VALUES(NULL, 2);
INSERT INTO t0(c0, c1) VALUES(NULL, 1);
INSERT INTO t0(c0, c1) VALUES(1, 2);
CREATE INDEX i0 ON t0(c0, c1); -- Line A
INSERT INTO t0(c0, c1) VALUES(1, 1);
(1) The table is as follows:
| c0 | c1 |
| ---- | ---- |
| null | 1 |
| null | 2 |
| 1 | 1 |
| 1 | 1 |
| 1 | 2 |
(2) In the following query, 0 rows are returned (unexpected):
SELECT
t0.c0 AS ref0,
MAX(t0.c1) AS ref1
FROM t0 WHERE (t0.c1 = 1) > 0
GROUP BY t0.c0;
| ref0 | ref1 |
| ---- | ---- |
This is unexpected because from the table in (1), the 1st, 3rd and 4th rows satisfy `(t0.c1 = 1) > 0`.
Therefore, the GROUP BY operation should be performed on those rows, hence should not return an empty result.
The expected result should be:
| ref0 | ref1 |
| ---- | ---- |
| null | 1 |
| 1 | 1 |
(3) When the index i0 is removed, 2 rows are returned (expected):
DROP INDEX i0 ON t0;
SELECT ALL
t0.c0 AS ref0,
MAX(t0.c1) AS ref1
FROM t0 WHERE (t0.c1 = 1) > 0
GROUP BY t0.c0;
| ref0 | ref1 |
| ---- | ---- |
| null | 1 |
| 1 | 1 |
Moreover, when the index i0 is created at any point before `Line A`, 2 rows are returned as expected.
According to the documentation (https://dev.mysql.com/doc/refman/8.4/en/mysql-indexes.html), indexes are used during query optimization to "find rows with specific column values quickly".
Thus, both the queries in (2) and (3) are semantically equivalent since the use of indexes should not change the result of the query execution.
Furthermore, in MySQL-compatible databases, such as TiDB (8.5), the same query produces the expected result.
| ref0 | ref1 |
| ---- | ---- |
| null | 1 |
| 1 | 1 |