Bug #103010 Subquery dependent on const tables gets evaluated for unnecessary multiple times
Submitted: 17 Mar 2021 7:33 Modified: 17 Mar 2021 11:12
Reporter: Hope Lee (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.23 OS:Any
Assigned to: CPU Architecture:Any

[17 Mar 2021 7:33] Hope Lee
Description:
If a subquery is dependent on const tables, the optimizer will evaluate it in make_join_select(). For this kind of subquery, its results are stable and cacheable. 

But the server still keeps the subquery's SELECT_LEX_UNIT::uncacheable because it's a correlated subquery. And the server will re-evaluate it again at the execution stage, which is definitely unnecessary and bring performance loss.

How to repeat:
CREATE TABLE t1 (
  id INT PRIMARY KEY
);

CREATE TABLE t2 (
  a INT,
  b INT
);

CREATE TABLE t3 (
  c INT
);

INSERT INTO t1 VALUES (1), (2), (3), (4);
INSERT INTO t2 VALUES (4, 39), (7, 21), (0, 32), (4, 31);
INSERT INTO t3 VALUES (4), (5), (6), (7);

ANALYZE TABLE t1, t2, t3;

Set a breakpoint at Item_exists_subselect::val_bool.

(gdb) b Item_exists_subselect::val_bool

SELECT id FROM t1, t3 WHERE id = 4 AND EXISTS(SELECT SUM(b) FROM t2 WHERE a = id);

Executing the above SQL, we will hit the breakpoint the first time in make_join_select():
(gdb) bt
#0  Item_exists_subselect::val_bool (this=0x7fff380f83b0) at /mysql-server/sql/item_subselect.cc:1556
#1  0x0000000003612773 in Item_exists_subselect::val_int (this=0x7fff380f83b0) at /mysql-server/sql/item_subselect.cc:1519
#2  0x0000000003129b9a in make_join_select (join=0x7fff380f8af0, cond=(Item_cond_and *) 0x7fff38ad5ab8) at /mysql-server/sql/sql_optimizer.cc:9264
#3  0x00000000031116b4 in JOIN::optimize (this=0x7fff380f8af0) at /mysql-server/sql/sql_optimizer.cc:619
#4  0x00000000031bcceb in SELECT_LEX::optimize (this=0x7fff38ab3618, thd=0x7fff38000da0) at /mysql-server/sql/sql_select.cc:1815
#5  0x00000000032602cf in SELECT_LEX_UNIT::optimize (this=0x7fff38ab2ec8, thd=0x7fff38000da0, materialize_destination=0x0, create_iterators=true) at /mysql-server/sql/sql_union.cc:681
#6  0x00000000031ba990 in Sql_cmd_dml::execute_inner (this=0x7fff380f8500, thd=0x7fff38000da0) at /mysql-server/sql/sql_select.cc:811
#7  0x00000000031b9f8f in Sql_cmd_dml::execute (this=0x7fff380f8500, thd=0x7fff38000da0) at /mysql-server/sql/sql_select.cc:612
#8  0x000000000314328e in mysql_execute_command (thd=0x7fff38000da0, first_level=true) at /mysql-server/sql/sql_parse.cc:4407
#9  0x000000000314515e in dispatch_sql_command (thd=0x7fff38000da0, parser_state=0x7fffe292daa0) at /mysql-server/sql/sql_parse.cc:4988
#10 0x000000000313b7b0 in dispatch_command (thd=0x7fff38000da0, com_data=0x7fffe292eb40, command=COM_QUERY) at /mysql-server/sql/sql_parse.cc:1836
#11 0x0000000003139beb in do_command (thd=0x7fff38000da0) at /mysql-server/sql/sql_parse.cc:1320
#12 0x000000000330e089 in handle_connection (arg=0x865ea10) at /mysql-server/sql/conn_handler/connection_handler_per_thread.cc:301
#13 0x0000000004f479f4 in pfs_spawn_thread (arg=0xa64f8a0) at /mysql-server/storage/perfschema/pfs.cc:2900
#14 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff61b4f1d in clone () from /lib64/libc.so.6

And then will hit the breakpoint multiple times at:
(gdb) bt
#0  Item_exists_subselect::val_bool (this=0x7fff380f83b0) at /mysql-server/sql/item_subselect.cc:1556
#1  0x0000000003612773 in Item_exists_subselect::val_int (this=0x7fff380f83b0) at /mysql-server/sql/item_subselect.cc:1519
#2  0x000000000381472d in FilterIterator::Read (this=0x7fff38c79800) at /mysql-server/sql/composite_iterators.cc:86
#3  0x00000000032620b7 in SELECT_LEX_UNIT::ExecuteIteratorQuery (this=0x7fff38ab2ec8, thd=0x7fff38000da0) at /mysql-server/sql/sql_union.cc:1228
#4  0x00000000032623e3 in SELECT_LEX_UNIT::execute (this=0x7fff38ab2ec8, thd=0x7fff38000da0) at /mysql-server/sql/sql_union.cc:1281
#5  0x00000000031baa2e in Sql_cmd_dml::execute_inner (this=0x7fff380f8500, thd=0x7fff38000da0) at /mysql-server/sql/sql_select.cc:827
#6  0x00000000031b9f8f in Sql_cmd_dml::execute (this=0x7fff380f8500, thd=0x7fff38000da0) at /mysql-server/sql/sql_select.cc:612
#7  0x000000000314328e in mysql_execute_command (thd=0x7fff38000da0, first_level=true) at /mysql-server/sql/sql_parse.cc:4407
#8  0x000000000314515e in dispatch_sql_command (thd=0x7fff38000da0, parser_state=0x7fffe292daa0) at /mysql-server/sql/sql_parse.cc:4988
#9  0x000000000313b7b0 in dispatch_command (thd=0x7fff38000da0, com_data=0x7fffe292eb40, command=COM_QUERY) at /mysql-server/sql/sql_parse.cc:1836
#10 0x0000000003139beb in do_command (thd=0x7fff38000da0) at /mysql-server/sql/sql_parse.cc:1320
#11 0x000000000330e089 in handle_connection (arg=0x865ea10) at /mysql-server/sql/conn_handler/connection_handler_per_thread.cc:301
#12 0x0000000004f479f4 in pfs_spawn_thread (arg=0xa64f8a0) at /mysql-server/storage/perfschema/pfs.cc:2900
#13 0x00007ffff7bc6e25 in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff61b4f1d in clone () from /lib64/libc.so.6

And the server will actually evaluate the subquery multiple times since it doesn't satisfy `(is_executed() && !uncacheable)` in SELECT_LEX_UNIT::execute(), which is unnecessary since the server has recognized the subquery is dependent on the const tables and evaluated it in make_join_select(). If the execution of subquery is time-consuming, this will bring huge performance loss.

Suggested fix:
For subqueries dependent on const tables, their results are stable and cacheable. We can mark such subqueries cacheable in make_join_select() to avoid executing them multiple times.
[17 Mar 2021 7:34] Hope Lee
The patch to optimize the server in this scenario.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-Feature-Avoid-multiple-executions-of-subquery-depend.patch (application/octet-stream, text), 3.61 KiB.

[17 Mar 2021 11:12] MySQL Verification Team
Hello Lee,

Thank you for the report and contribution.

regards,
Umesh