| Bug #116241 | Potential Performance Improvement on TPC-H | ||
|---|---|---|---|
| Submitted: | 26 Sep 2024 10:05 | Modified: | 25 Oct 2024 11:01 |
| Reporter: | JINSHENG BA | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 596f0d23 (9.0.0), 9.1.0 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[27 Sep 2024 2:42]
tianfeng li
I think these two plans are the same. The only difference is whether the subselect is to be executed in optimization phase. For the first plan, subselect is executed in execution phase, thus be calculated in EXPLAIN ANALYZE, while the second plan's subselect is executed in optimization phase and its execution time was not shown in EXPLAIN ANALYZE. Try a normal execution maybe ^.^
[29 Sep 2024 8:39]
JINSHENG BA
Thanks for your reply! I tried a normal execution with a 1GB of data of TPC-H. However, the execution time fluctuates over executions. Sometimes, the original version of MYSQL is one second slower than the patched one, while I also observed similar execution time in some executions. I am not very sure, so is there any way to obtain a stable execution time?
[16 Oct 2024 6:01]
MySQL Verification Team
Hello Jinsheng Ba, Thank you for the report and feedback. My apologies for the delay. Could you please share exact make options used for the build, MySQL server configurations file used? If it is same as Bug #116309 then please ignore this request. Thank you. regards, Umesh
[25 Oct 2024 11:01]
MySQL Verification Team
Thank you for being patient with me on this. I'm not sure how much this patch fixes the issue but observed the behaviour and hope that development team would take a look at this and take a call further on the suggested patch. I'll be joining the verification results file shortly. Thank you. regards, Umesh
[25 Oct 2024 11:02]
MySQL Verification Team
9.1.0 - test results (with and without the suggested patch)
Attachment: 116241.results.txt (text/plain), 7.64 KiB.

Description: Here is a potential opportunity to improve the performance of evaluating MySQL on the TPC-H benchmark. For query 22 of the TPC-H benchmark, the performance is as follows: -> Sort: custsale.cntrycode (actual time=137..137 rows=4 loops=1) -> Table scan on <temporary> (actual time=137..137 rows=4 loops=1) -> Aggregate using temporary table (actual time=137..137 rows=4 loops=1) -> Nested loop antijoin (cost=91569 rows=734360) (actual time=50.5..134 rows=3674 loops=1) -> Filter: ((substr(CUSTOMER.C_PHONE,1,2) in ('20','40','22','30','39','42','21')) and (CUSTOMER.C_ACCTBAL > (select #3))) (cost=5417 rows=49604) (actual time=50.5..105 rows=10834 loops=1) -> Table scan on CUSTOMER (cost=5417 rows=148828) (actual time=0.0587..33.5 rows=150000 loops=1) -> Select #3 (subquery in condition; run only once) -> Aggregate: avg(CUSTOMER.C_ACCTBAL) (cost=26769 rows=1) (actual time=50.4..50.4 rows=1 loops=1) -> Filter: ((CUSTOMER.C_ACCTBAL > 0.00) and (substr(CUSTOMER.C_PHONE,1,2) in ('20','40','22','30','39','42','21'))) (cost=15340 rows=49604) (actual time=0.0361..48.8 rows=21632 loops=1) -> Table scan on CUSTOMER (cost=15340 rows=148828) (actual time=0.0306..25.9 rows=150000 loops=1) -> Covering index lookup on ORDERS using ORDERS_FK1 (O_CUSTKEY=CUSTOMER.C_CUSTKEY) (cost=3.8 rows=14.8) (actual time=0.00264..0.00264 rows=0.661 loops=10834) If we remove the code block: https://github.com/mysql/mysql-server/blob/trunk/sql/sql_const_folding.cc#L1368-L1374, the output result is the same but we can see a significant performance improvement (actual time: 84.6 vs 137): -> Sort: custsale.cntrycode (actual time=84.6..84.6 rows=4 loops=1) -> Table scan on <temporary> (actual time=84.6..84.6 rows=4 loops=1) -> Aggregate using temporary table (actual time=84.6..84.6 rows=4 loops=1) -> Nested loop antijoin (cost=101509 rows=734360) (actual time=0.0373..82.5 rows=3674 loops=1) -> Filter: ((substr(CUSTOMER.C_PHONE,1,2) in ('20','40','22','30','39','42','21')) and (CUSTOMER.C_ACCTBAL > 5001.43)) (cost=15340 rows=49604) (actual time=0.0246..53 rows=10834 loops=1) -> Table scan on CUSTOMER (cost=15340 rows=148828) (actual time=0.0221..32.3 rows=150000 loops=1) -> Covering index lookup on ORDERS using ORDERS_FK1 (O_CUSTKEY=CUSTOMER.C_CUSTKEY) (cost=3.8 rows=14.8) (actual time=0.00263..0.00263 rows=0.661 loops=10834) We believe the performance gain would be even more significant if the data volume is larger. MySQL version: 596f0d23 (9.0.0) 22.sql ``` EXPLAIN ANALYZE select cntrycode, count(*) as numcust, sum(c_acctbal) as totacctbal from ( select substring( c_phone from 1 for 2 ) as cntrycode, c_acctbal from CUSTOMER where substring( c_phone from 1 for 2 ) in ('20', '40', '22', '30', '39', '42', '21') and c_acctbal > ( select avg(c_acctbal) from CUSTOMER where c_acctbal > 0.00 and substring( c_phone from 1 for 2 ) in ('20', '40', '22', '30', '39', '42', '21') ) and not exists ( select * from ORDERS where o_custkey = c_custkey ) ) as custsale group by cntrycode order by cntrycode; ``` How to repeat: Compile MySQL in two versions, one is original and the other is with the code block removed. I have attached the patch file to remove the code for your reference. diff --git a/sql/sql_const_folding.cc b/sql/sql_const_folding.cc index 6989167c6a9..e6cbf3ddce8 100644 --- a/sql/sql_const_folding.cc +++ b/sql/sql_const_folding.cc @@ -1365,13 +1365,6 @@ bool fold_condition(THD *thd, Item *cond, Item **retcond, } } - if (!(seen_field && seen_constant)) { - /* - This comparison function doesn't have the simple form required, so - instead, try to fold inside its arguments - */ - return fold_arguments(thd, func); - } const auto arg0_orig = args[0]; const auto arg1_orig = args[1]; Then we run the query 22.sql in both versions and compare the performance. ./bin/mysql -uroot -proot tpch < 22.sql