Bug #113429 Function call duplicated in CTE
Submitted: 14 Dec 2023 21:13 Modified: 18 Dec 2023 11:42
Reporter: Dale Ogilvie Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S5 (Performance)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: cte, FUNCTION

[14 Dec 2023 21:13] Dale Ogilvie
Description:
MySQL is calling a function once for every row in the query result set, even when the function is called in a CTE which only returns one row.

The expected behaviour is for MySQL to call the function once only.

How to repeat:
See stack overflow question:

https://stackoverflow.com/questions/77581941/mysql-cte-function-call-duplicated

Suggested fix:
Do not duplicate function calls unnecessarily.
[15 Dec 2023 13:26] MySQL Verification Team
Hi Mr. ogilvie,

Thank you for your bug report.

However, we are not able to repeat it.

We do not have a test case, nor do we have a proof that function gets called for every row, regardless if it is filtered out or not.

So, we need a full test case, including a proof that function is called always.

Also, FYI, we only check bugs on the OS, not in VMs or containers .......

Can't repeat.
[18 Dec 2023 0:30] Dale Ogilvie
Hi,

I installed MySQL 8.0.35 on Windows 11, using the installer. I created a dogilvie user in the configured database.

I logged in to the database as "dogilvie" using MySQL Workbench 8.0.34

I ran the following commands:

CREATE DATABASE sandpit;

create table sandpit.parents (
    id int NOT NULL AUTO_INCREMENT,
    name varchar(255) NOT NULL,
    PRIMARY KEY (id)
);

insert into sandpit.parents (name) values ("A");
insert into sandpit.parents (name) values ("B");
insert into sandpit.parents (name) values ("C");

create table sandpit.children (
    id int NOT NULL AUTO_INCREMENT,
    name varchar(255) NOT NULL,
    parentId int NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (parentId) REFERENCES sandpit.parents(id)
);

insert into sandpit.children (parentId, name) values (1, "1");
insert into sandpit.children (parentId, name) values (2, "1");
insert into sandpit.children (parentId, name) values (2, "2");
insert into sandpit.children (parentId, name) values (2, "3");
insert into sandpit.children (parentId, name) values (2, "4");
insert into sandpit.children (parentId, name) values (2, "5");

I created a function that takes one second to run:

DELIMITER $$
DROP FUNCTION `expensive`$$
CREATE FUNCTION `expensive`(
  pParentId                int 
) RETURNS int DETERMINISTIC
BEGIN

  DO SLEEP(1);
  
  RETURN 0;
  
END$$
DELIMITER ;

Then, if I run the following SQL, the query takes 5 seconds to run, indicating that the function is being called 5 times.

set @pParentId = 2;

WITH cte AS
(
   select id, sandpit.expensive(id) as result from sandpit.parents where id = @pParentId
)
SELECT p.*, c.*, cte.result
FROM sandpit.parents p
LEFT OUTER JOIN sandpit.children c ON c.parentId = p.id
JOIN cte ON cte.id = p.id
WHERE p.id = @pParentId;

If I add DISTINCT to the cte, the query takes only one second:

WITH cte AS
(
   select distinct id, sandpit.expensive(id) as result from sandpit.parents where id = @pParentId
)
SELECT p.*, c.*, cte.result
FROM sandpit.parents p
LEFT OUTER JOIN sandpit.children c ON c.parentId = p.id
JOIN cte ON cte.id = p.id
WHERE p.id = @pParentId;

Expected result is that the first CTE query without the distinct should only be run once, not once for every row in the result set.
[18 Dec 2023 0:41] Dale Ogilvie
The following documentation is relevant:

https://dev.mysql.com/doc/refman/8.0/en/derived-table-optimization.html

"If a CTE is materialized by a query, it is materialized once for the query, even if the query references it several times."
[18 Dec 2023 11:38] MySQL Verification Team
Hi,

If you read our documentation, you will find out that for each table expression a function has to be rerun, no matter what are its contents.

Optimiser does not optimise stored routines created by users themselves.
[18 Dec 2023 11:42] MySQL Verification Team
Hi Mr. Ogilvie,

We have analysed your test case further and concluded that this needs to be verified.

Do note that it is possible that this code bug could turn out to be a documentation bug.

This is now a verified bug report.