Bug #113429 | Function call duplicated in CTE | ||
---|---|---|---|
Submitted: | 14 Dec 2023 21:13 | Modified: | 18 Dec 2023 11:42 |
Reporter: | Dale Ogilvie | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: DML | Severity: | S5 (Performance) |
Version: | 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | cte, FUNCTION |
[14 Dec 2023 21:13]
Dale Ogilvie
[15 Dec 2023 13:26]
MySQL Verification Team
Hi Mr. ogilvie, Thank you for your bug report. However, we are not able to repeat it. We do not have a test case, nor do we have a proof that function gets called for every row, regardless if it is filtered out or not. So, we need a full test case, including a proof that function is called always. Also, FYI, we only check bugs on the OS, not in VMs or containers ....... Can't repeat.
[18 Dec 2023 0:30]
Dale Ogilvie
Hi, I installed MySQL 8.0.35 on Windows 11, using the installer. I created a dogilvie user in the configured database. I logged in to the database as "dogilvie" using MySQL Workbench 8.0.34 I ran the following commands: CREATE DATABASE sandpit; create table sandpit.parents ( id int NOT NULL AUTO_INCREMENT, name varchar(255) NOT NULL, PRIMARY KEY (id) ); insert into sandpit.parents (name) values ("A"); insert into sandpit.parents (name) values ("B"); insert into sandpit.parents (name) values ("C"); create table sandpit.children ( id int NOT NULL AUTO_INCREMENT, name varchar(255) NOT NULL, parentId int NOT NULL, PRIMARY KEY (id), FOREIGN KEY (parentId) REFERENCES sandpit.parents(id) ); insert into sandpit.children (parentId, name) values (1, "1"); insert into sandpit.children (parentId, name) values (2, "1"); insert into sandpit.children (parentId, name) values (2, "2"); insert into sandpit.children (parentId, name) values (2, "3"); insert into sandpit.children (parentId, name) values (2, "4"); insert into sandpit.children (parentId, name) values (2, "5"); I created a function that takes one second to run: DELIMITER $$ DROP FUNCTION `expensive`$$ CREATE FUNCTION `expensive`( pParentId int ) RETURNS int DETERMINISTIC BEGIN DO SLEEP(1); RETURN 0; END$$ DELIMITER ; Then, if I run the following SQL, the query takes 5 seconds to run, indicating that the function is being called 5 times. set @pParentId = 2; WITH cte AS ( select id, sandpit.expensive(id) as result from sandpit.parents where id = @pParentId ) SELECT p.*, c.*, cte.result FROM sandpit.parents p LEFT OUTER JOIN sandpit.children c ON c.parentId = p.id JOIN cte ON cte.id = p.id WHERE p.id = @pParentId; If I add DISTINCT to the cte, the query takes only one second: WITH cte AS ( select distinct id, sandpit.expensive(id) as result from sandpit.parents where id = @pParentId ) SELECT p.*, c.*, cte.result FROM sandpit.parents p LEFT OUTER JOIN sandpit.children c ON c.parentId = p.id JOIN cte ON cte.id = p.id WHERE p.id = @pParentId; Expected result is that the first CTE query without the distinct should only be run once, not once for every row in the result set.
[18 Dec 2023 0:41]
Dale Ogilvie
The following documentation is relevant: https://dev.mysql.com/doc/refman/8.0/en/derived-table-optimization.html "If a CTE is materialized by a query, it is materialized once for the query, even if the query references it several times."
[18 Dec 2023 11:38]
MySQL Verification Team
Hi, If you read our documentation, you will find out that for each table expression a function has to be rerun, no matter what are its contents. Optimiser does not optimise stored routines created by users themselves.
[18 Dec 2023 11:42]
MySQL Verification Team
Hi Mr. Ogilvie, We have analysed your test case further and concluded that this needs to be verified. Do note that it is possible that this code bug could turn out to be a documentation bug. This is now a verified bug report.