Bug #107460 SKIP SCAN result wrong
Submitted: 2 Jun 2022 8:07 Modified: 29 Aug 2023 9:41
Reporter: Ze Yang (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[2 Jun 2022 8:07] Ze Yang
Description:
    When IndexSkipScanIterator::Read read one group rows, will set_end_range
    when ha_read_range_first.
    And when get next group to read, it call index_next_different to read the record
    next to last group.
    But the end_range is not reset, the end range is for last group.
    When InnoDB engine search for the next record, if
    it will check whether the record can be seen. If not, then look for next. And
    when InnoDB reach the supremum record, it will check whether out of end range.
    As the skip scan end range is the last group, InnoDB will return DB_RECORD_NOT_FOUND.

How to repeat:
SELECT COUNT(1) FROM t WHERE d = 1; only get 50. While the true result should be the sum of `SELECT b, COUNT(1) FROM t WHERE d = 1 GROUP BY b;`
 
skip_scan_bug.test
```
let $engine=innodb;
--source include/skip_scan_data.inc
alter table t drop primary key;

let $i=7;
while ($i)
{
  --eval INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1
  dec $i;
}
SET @@global.innodb_purge_stop_now = on;
EXPLAIN DELETE FROM t WHERE d = 1 and b = 2 limit 4096;
DELETE FROM t WHERE d = 1 and b = 2 limit 4096;
ANALYZE TABLE t;
EXPLAIN SELECT COUNT(1) FROM t WHERE d = 1;
SELECT COUNT(1) FROM t WHERE d = 1;
SELECT b, COUNT(1) FROM t WHERE d = 1 GROUP BY b;
SET @@global.innodb_purge_stop_now = default;
DROP TABLE t;
```

skip_scan_bug.result
```
CREATE TABLE t (a INT, b INT, c INT, d INT, e INT, PRIMARY KEY(a, b, c, d), KEY(b, d)) ENGINE=innodb;
# Since ANALYZE TABLE only reads a subset of the data, the statistics for
# table t depends on the row order. And since the INSERT INTO ... SELECT
# may be executed using different execution plans, we've added ORDER BY
# to ensure that we rows has the same order every time. If not, the
# estimated number of rows in EXPLAIN may change on different platforms.
ANALYZE TABLE t;
Table	Op	Msg_type	Msg_text
test.t	analyze	status	OK
alter table t drop primary key;
INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1;
INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1;
INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1;
INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1;
INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1;
INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1;
INSERT INTO t SELECT * FROM t WHERE b = 2 AND d = 1;
SET @@global.innodb_purge_stop_now = on;
EXPLAIN DELETE FROM t WHERE d = 1 and b = 2 limit 4096;
id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	DELETE	t	NULL	ALL	b	NULL	NULL	NULL	6350	100.00	Using where
Warnings:
Note	1003	delete from `test`.`t` where ((`test`.`t`.`b` = 2) and (`test`.`t`.`d` = 1)) limit 4096
DELETE FROM t WHERE d = 1 and b = 2 limit 4096;
ANALYZE TABLE t;
Table	Op	Msg_type	Msg_text
test.t	analyze	status	OK
EXPLAIN SELECT COUNT(1) FROM t WHERE d = 1;
id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	t	NULL	range	b	b	8	NULL	462	100.00	Using where; Using index for skip scan
Warnings:
Note	1003	/* select#1 */ select count(1) AS `COUNT(1)` from `test`.`t` where (`test`.`t`.`d` = 1)
SELECT COUNT(1) FROM t WHERE d = 1;
COUNT(1)
50
SELECT b, COUNT(1) FROM t WHERE d = 1 GROUP BY b;
b	COUNT(1)
1	50
2	2304
3	50
4	50
5	50
SET @@global.innodb_purge_stop_now = default;
DROP TABLE t;
```

Suggested fix:
Reset end_range before IndexSkipScanIterator::Read()-> index_next_different.

The fix code provide in private comment.
[2 Jun 2022 8:35] MySQL Verification Team
Hello Ze Yang,

Thank you for the report and feedback.

regards,
Umesh
[29 Aug 2023 9:41] MySQL Verification Team
Documented fix as follows in the MySQL 8.0.34 and 8.1.0 changelogs:
 
 
    For index skip scans, the first range read set an end-of-range
    value to indicate the end of the first range, but the next range
    read did not clear the stale end-of-range value and applies this
    stale value to the current range. Since the end-of-range value
    had already been crossed in the previous range read, this caused
    the reads to stop, causing multiple rows to be missed in the
    result.
 
    We fix this by making sure in such cases that the old
    end-of-range value is cleared.
 
 
Closed.