Bug #108851 Suspicious Estimated Rows by JOIN
Submitted: 22 Oct 2022 1:55 Modified: 22 Oct 2022 5:41
Reporter: JINSHENG BA Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.31 OS:Ubuntu (20.04)
Assigned to: CPU Architecture:x86
Tags: cardinality estimation

[22 Oct 2022 1:55] JINSHENG BA
Description:
See this test case. The second SELECT has an additional WHERE clause than the first SELECT, but returns more estimated rows than the first SELECT.

CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 VARCHAR(500), c1 TEXT);
INSERT INTO t0 VALUES(NULL), (1), (-2);
INSERT INTO t1 VALUES(3, NULL), (NULL, "4"), (5, ""), (NULL, "6");
ANALYZE TABLE t0 UPDATE HISTOGRAM ON c0;
ANALYZE TABLE t1 UPDATE HISTOGRAM ON c0, c1;

EXPLAIN SELECT DISTINCTROW t1.c0 FROM t0, t1; -- 3 rows
EXPLAIN SELECT DISTINCTROW t1.c0 FROM t0, t1 WHERE t1.c1 BETWEEN (SELECT 1 WHERE FALSE) AND (t1.c0); -- 4 rows

How to repeat:
$ sudo docker run -it -p 13306:3306 -e MYSQL_ROOT_PASSWORD=123456 mysql
$ mysql --host 127.0.0.1 -u root --password=123456 --port 13306

Then create a database and type the above test case.

Suggested fix:
I expect the second SELECT should return no more rows than the first SELECT.

If we remove the DISTINCTROW, the second SELECT returns no more rows than the first SELECT.
[22 Oct 2022 5:41] MySQL Verification Team
Hello Jinsheng Ba,

Thank you for the report and test case.

regards,
Umesh