Bug #102584 | join on result of function | ||
---|---|---|---|
Submitted: | 12 Feb 2021 15:19 | Modified: | 15 Feb 2021 15:31 |
Reporter: | Dave Pullin (Basic Quality Contributor) | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.7.31 | OS: | Any (CentOS Linux release 7.6.1810 (Core)) |
Assigned to: | CPU Architecture: | Any |
[12 Feb 2021 15:19]
Dave Pullin
[12 Feb 2021 16:09]
Dave Pullin
the select eventually completed Empty set, 65535 warnings (5 min 14.99 sec) (the warnings were Data truncated for column 'in_date' at row 1656 which is a bug in my testcase and irrevlant to the reported bug) so it is not a loop, just a terrible strategy
[15 Feb 2021 14:37]
MySQL Verification Team
Hi Mr. Pullin, Thank you, very much for your bug report. However, this is not a bug, at least not in versions that are older than 8.0. The essential problem here is that join can not be resolved by the usage of the index on the common column, since the executioner has actually to perform 1655 * 13430 comparisons, which makes 22.226.650 comparisons, which takes lots of time. There are other problems with your query ...... - Your function takes in a date type, but you feed it with integers; so it will throw error on first evaluation. - The column that you want to lookup on is not part of an index, so there is no way that you can get an index lookup. - Even in the case that there is an index, selectivity would be 100%, as the column contains the same value (NULL) for all rows. However, MySQL 8.0 actually executes this pretty efficiently without an index, due to hash join. Hash joins are not implemented in 5.7. Not a bug.
[15 Feb 2021 15:10]
Dave Pullin
I appreciate that failure to do an optimization is not a actually a bug, and I'm sorry that I introduced errors in trimming down the real life instantiation to a small test case, which did indeed make the optimization impossible. Below is the test case fixed. Note that adding 'limit 1000000000' changes the optimization and the select runs in 0.12 secs. Without it, it takes 1 min 44.36 sec. ( I don't see why the optimization is possible with the limit but not without it). begin create database if not exists bug; drop function if exists bug.func; create function bug.func(x int) returns int return 1; drop table if exists bug.table1; create table bug.table1 (pkey int auto_increment primary key,dummy int); insert into bug.table1 (dummy) select 1 from mysql.help_topic as a,mysql.help_topic as b,mysql.help_topic as c limit 13430; drop table if exists bug.table2; create table bug.table2 (item int auto_increment primary key,dummy int); insert into bug.table2 (dummy) select 1 from mysql.help_topic as a,mysql.help_topic as b,mysql.help_topic as c limit 1655; select * from (select bug.func(pkey) as item from bug.table1 limit 1000000000) as data join bug.table2 using(item); select * from (select bug.func(pkey) as item from bug.table1 ) as data join bug.table2 using(item);
[15 Feb 2021 15:17]
MySQL Verification Team
Hi Mr. Pullin, That is simply how our 5.7 optimiser works. Vast majority of these shortcomings have been fixed in 8.0 by new features.
[15 Feb 2021 15:31]
Dave Pullin
Thanks. I'm happy to have found a work around with the LIMIT to a billion rows. Unfortunately the migration to V8 is a big deal for me due to many incompatibilities.
[16 Feb 2021 12:46]
MySQL Verification Team
Hi Mr. Pullin, You are welcome ......