MySQL Bugs: #116241: Potential Performance Improvement on TPC-H

Bug #116241	Potential Performance Improvement on TPC-H
Submitted:	26 Sep 2024 10:05	Modified:	25 Oct 2024 11:01
Reporter:	JINSHENG BA	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S5 (Performance)
Version:	596f0d23 (9.0.0), 9.1.0	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
Here is a potential opportunity to improve the performance of evaluating MySQL on the TPC-H benchmark.

For query 22 of the TPC-H benchmark, the performance is as follows:

-> Sort: custsale.cntrycode  (actual time=137..137 rows=4 loops=1)
  -> Table scan on <temporary>  (actual time=137..137 rows=4 loops=1)
      -> Aggregate using temporary table  (actual time=137..137 rows=4 loops=1)
          -> Nested loop antijoin  (cost=91569 rows=734360) (actual time=50.5..134 rows=3674 loops=1)
              -> Filter: ((substr(CUSTOMER.C_PHONE,1,2) in ('20','40','22','30','39','42','21')) and (CUSTOMER.C_ACCTBAL > (select #3)))  (cost=5417 rows=49604) (actual time=50.5..105 rows=10834 loops=1)
                  -> Table scan on CUSTOMER  (cost=5417 rows=148828) (actual time=0.0587..33.5 rows=150000 loops=1)
                  -> Select #3 (subquery in condition; run only once)
                      -> Aggregate: avg(CUSTOMER.C_ACCTBAL)  (cost=26769 rows=1) (actual time=50.4..50.4 rows=1 loops=1)
                          -> Filter: ((CUSTOMER.C_ACCTBAL > 0.00) and (substr(CUSTOMER.C_PHONE,1,2) in ('20','40','22','30','39','42','21')))  (cost=15340 rows=49604) (actual time=0.0361..48.8 rows=21632 loops=1)
                              -> Table scan on CUSTOMER  (cost=15340 rows=148828) (actual time=0.0306..25.9 rows=150000 loops=1)
              -> Covering index lookup on ORDERS using ORDERS_FK1 (O_CUSTKEY=CUSTOMER.C_CUSTKEY)  (cost=3.8 rows=14.8) (actual time=0.00264..0.00264 rows=0.661 loops=10834)

If we remove the code block: https://github.com/mysql/mysql-server/blob/trunk/sql/sql_const_folding.cc#L1368-L1374, the output result is the same but we can see a significant performance improvement (actual time: 84.6 vs 137):

-> Sort: custsale.cntrycode  (actual time=84.6..84.6 rows=4 loops=1)
    -> Table scan on <temporary>  (actual time=84.6..84.6 rows=4 loops=1)
        -> Aggregate using temporary table  (actual time=84.6..84.6 rows=4 loops=1)
            -> Nested loop antijoin  (cost=101509 rows=734360) (actual time=0.0373..82.5 rows=3674 loops=1)
                -> Filter: ((substr(CUSTOMER.C_PHONE,1,2) in ('20','40','22','30','39','42','21')) and (CUSTOMER.C_ACCTBAL > 5001.43))  (cost=15340 rows=49604) (actual time=0.0246..53 rows=10834 loops=1)
                    -> Table scan on CUSTOMER  (cost=15340 rows=148828) (actual time=0.0221..32.3 rows=150000 loops=1)
                -> Covering index lookup on ORDERS using ORDERS_FK1 (O_CUSTKEY=CUSTOMER.C_CUSTKEY)  (cost=3.8 rows=14.8) (actual time=0.00263..0.00263 rows=0.661 loops=10834)

We believe the performance gain would be even more significant if the data volume is larger.

MySQL version: 596f0d23 (9.0.0)

22.sql
```
EXPLAIN ANALYZE 
select
    cntrycode,
    count(*) as numcust,
    sum(c_acctbal) as totacctbal
from
    (
        select
            substring(
                c_phone
                from
                    1 for 2
            ) as cntrycode,
            c_acctbal
        from
            CUSTOMER
        where
            substring(
                c_phone
                from
                    1 for 2
            ) in ('20', '40', '22', '30', '39', '42', '21')
            and c_acctbal > (
                select
                    avg(c_acctbal)
                from
                    CUSTOMER
                where
                    c_acctbal > 0.00
                    and substring(
                        c_phone
                        from
                            1 for 2
                    ) in ('20', '40', '22', '30', '39', '42', '21')
            )
            and not exists (
                select
                    *
                from
                    ORDERS
                where
                    o_custkey = c_custkey
            )
    ) as custsale
group by
    cntrycode
order by
    cntrycode;              
```

How to repeat:
Compile MySQL in two versions, one is original and the other is with the code block removed. I have attached the patch file to remove the code for your reference.

diff --git a/sql/sql_const_folding.cc b/sql/sql_const_folding.cc
index 6989167c6a9..e6cbf3ddce8 100644
--- a/sql/sql_const_folding.cc
+++ b/sql/sql_const_folding.cc
@@ -1365,13 +1365,6 @@ bool fold_condition(THD *thd, Item *cond, Item **retcond,
     }
   }

-  if (!(seen_field && seen_constant)) {
-    /*
-      This comparison function doesn't have the simple form required, so
-      instead, try to fold inside its arguments
-    */
-    return fold_arguments(thd, func);
-  }

   const auto arg0_orig = args[0];
   const auto arg1_orig = args[1];

Then we run the query 22.sql in both versions and compare the performance.

./bin/mysql -uroot -proot tpch < 22.sql

I think these two plans are the same.

The only difference is whether the subselect is to be executed in optimization phase. For the first plan, subselect is executed in execution phase, thus be calculated in EXPLAIN ANALYZE, while the second plan's subselect is executed in optimization phase and its execution time was not shown in EXPLAIN ANALYZE.

Try a normal execution maybe ^.^

Thanks for your reply!

I tried a normal execution with a 1GB of data of TPC-H. However, the execution time fluctuates over executions. Sometimes, the original version of MYSQL is one second slower than the patched one, while I also observed similar execution time in some executions.

I am not very sure, so is there any way to obtain a stable execution time?

Hello Jinsheng Ba,

Thank you for the report and feedback.
My apologies for the delay. Could you please share exact make options used for the build, MySQL server configurations file used? If it is same as Bug #116309 then please ignore this request. Thank you.

regards,
Umesh

Thank you for being patient with me on this.
I'm not sure how much this patch fixes the issue but observed the behaviour and hope that development team would take a look at this and take a call further on the suggested patch.  I'll be joining the verification results file shortly. Thank you.

regards,
Umesh

9.1.0 - test results (with and without the suggested patch)

Attachment: 116241.results.txt (text/plain), 7.64 KiB.