Bug #116456 Performance of TPC-H Query 8
Submitted: 23 Oct 2024 6:43 Modified: 31 Oct 2024 14:53
Reporter: JINSHENG BA Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:596f0d23 (9.0.0), 9.1.0 OS:Any
Assigned to: CPU Architecture:Any

[23 Oct 2024 6:43] JINSHENG BA
Description:
A suboptimal query plan is chosen for executing query 8 in TPC-H benchmark. 

$ mysql tpch < 8.sql
EXPLAIN
-> Sort: all_nations.o_year  (actual time=49028..49028 rows=2 loops=1)
    -> Table scan on <temporary>  (actual time=49027..49027 rows=2 loops=1)
        -> Aggregate using temporary table  (actual time=49027..49027 rows=2 loops=1)
            -> Nested loop inner join  (cost=797653 rows=2808) (actual time=16.5..49003 rows=4838 loops=1)
                -> Nested loop inner join  (cost=699638 rows=28080) (actual time=1.53..29514 rows=731894 loops=1)
                    -> Nested loop inner join  (cost=633644 rows=28080) (actual time=1.53..28480 rows=731894 loops=1)
                        -> Nested loop inner join  (cost=567649 rows=28080) (actual time=1.52..26756 rows=731894 loops=1)
                            -> Nested loop inner join  (cost=500635 rows=7134) (actual time=1.09..15988 rows=182721 loops=1)
                                -> Nested loop inner join  (cost=416799 rows=35672) (actual time=1.08..14547 rows=911408 loops=1)
                                    -> Inner hash join (no condition)  (cost=313494 rows=35672) (actual time=1.06..2834 rows=911408 loops=1)
                                        -> Filter: (ORDERS.O_ORDERDATE between DATE'1995-01-01' and DATE'1996-12-31')  (cost=313492 rows=321078) (actual time=0.621..2631 rows=911408 loops=1)
                                            -> Table scan on ORDERS  (cost=313492 rows=2.89e+6) (actual time=0.619..2388 rows=3e+6 loops=1)
                                        -> Hash
                                            -> Filter: (REGION.R_NAME = 'ASIA')  (cost=1.5 rows=1) (actual time=0.43..0.433 rows=1 loops=1)
                                                -> Table scan on REGION  (cost=1.5 rows=5) (actual time=0.426..0.43 rows=5 loops=1)
                                    -> Single-row index lookup on CUSTOMER using PRIMARY (C_CUSTKEY=ORDERS.O_CUSTKEY)  (cost=0.311 rows=1) (actual time=0.0126..0.0126 rows=1 loops=911408)
                                -> Filter: (n1.N_REGIONKEY = REGION.R_REGIONKEY)  (cost=0.25 rows=0.2) (actual time=0.00141..0.00143 rows=0.2 loops=911408)
                                    -> Single-row index lookup on n1 using PRIMARY (N_NATIONKEY=CUSTOMER.C_NATIONKEY)  (cost=0.25 rows=1) (actual time=0.00114..0.00117 rows=1 loops=911408)
                            -> Index lookup on LINEITEM using PRIMARY (L_ORDERKEY=ORDERS.O_ORDERKEY)  (cost=1 rows=3.94) (actual time=0.0561..0.0585 rows=4.01 loops=182721)
                        -> Single-row index lookup on SUPPLIER using PRIMARY (S_SUPPKEY=LINEITEM.L_SUPPKEY)  (cost=0.25 rows=1) (actual time=0.00211..0.00215 rows=1 loops=731894)
                    -> Single-row index lookup on n2 using PRIMARY (N_NATIONKEY=SUPPLIER.S_NATIONKEY)  (cost=0.25 rows=1) (actual time=0.00118..0.00121 rows=1 loops=731894)
                -> Filter: (PART.P_TYPE = 'SMALL PLATED COPPER')  (cost=0.377 rows=0.1) (actual time=0.0265..0.0265 rows=0.00661 loops=731894)
                    -> Single-row index lookup on PART using PRIMARY (P_PARTKEY=LINEITEM.L_PARTKEY)  (cost=0.377 rows=1) (actual time=0.0261..0.0261 rows=1 loops=731894)

0.04s user 0.02s system 0% cpu 49.285 total

If we remove the IF in https://github.com/mysql/mysql-server/blob/trunk/sql/sql_planner.cc#L1870, we can see a performance improvement:
$ mysql tpch < 8.sql
EXPLAIN
-> Sort: all_nations.o_year  (actual time=44241..44241 rows=2 loops=1)
    -> Table scan on <temporary>  (actual time=44241..44241 rows=2 loops=1)
        -> Aggregate using temporary table  (actual time=44241..44241 rows=2 loops=1)
            -> Nested loop inner join  (cost=687806 rows=2808) (actual time=16.3..44218 rows=4838 loops=1)
                -> Nested loop inner join  (cost=681207 rows=2808) (actual time=16.3..44208 rows=4838 loops=1)
                    -> Nested loop inner join  (cost=658854 rows=2808) (actual time=16.3..43572 rows=4838 loops=1)
                        -> Nested loop inner join  (cost=563211 rows=28080) (actual time=1.68..25522 rows=731894 loops=1)
                            -> Nested loop inner join  (cost=496226 rows=7134) (actual time=1.19..14679 rows=182721 loops=1)
                                -> Nested loop inner join  (cost=412389 rows=35672) (actual time=1.17..13153 rows=911408 loops=1)
                                    -> Inner hash join (no condition)  (cost=313494 rows=35672) (actual time=1.15..2908 rows=911408 loops=1)
                                        -> Filter: (ORDERS.O_ORDERDATE between DATE'1995-01-01' and DATE'1996-12-31')  (cost=313492 rows=321078) (actual time=0.719..2732 rows=911408 loops=1)
                                            -> Table scan on ORDERS  (cost=313492 rows=2.89e+6) (actual time=0.717..2489 rows=3e+6 loops=1)
                                        -> Hash
                                            -> Filter: (REGION.R_NAME = 'ASIA')  (cost=1.5 rows=1) (actual time=0.418..0.421 rows=1 loops=1)
                                                -> Table scan on REGION  (cost=1.5 rows=5) (actual time=0.413..0.417 rows=5 loops=1)
                                    -> Single-row index lookup on CUSTOMER using PRIMARY (C_CUSTKEY=ORDERS.O_CUSTKEY)  (cost=0.297 rows=1) (actual time=0.011..0.011 rows=1 loops=911408)
                                -> Filter: (n1.N_REGIONKEY = REGION.R_REGIONKEY)  (cost=0.25 rows=0.2) (actual time=0.00149..0.00151 rows=0.2 loops=911408)
                                    -> Single-row index lookup on n1 using PRIMARY (N_NATIONKEY=CUSTOMER.C_NATIONKEY)  (cost=0.25 rows=1) (actual time=0.00119..0.00122 rows=1 loops=911408)
                            -> Index lookup on LINEITEM using PRIMARY (L_ORDERKEY=ORDERS.O_ORDERKEY)  (cost=0.999 rows=3.94) (actual time=0.0565..0.0588 rows=4.01 loops=182721)
                        -> Filter: (PART.P_TYPE = 'SMALL PLATED COPPER')  (cost=0.367 rows=0.1) (actual time=0.0245..0.0245 rows=0.00661 loops=731894)
                            -> Single-row index lookup on PART using PRIMARY (P_PARTKEY=LINEITEM.L_PARTKEY)  (cost=0.367 rows=1) (actual time=0.0241..0.0242 rows=1 loops=731894)
                    -> Single-row index lookup on SUPPLIER using PRIMARY (S_SUPPKEY=LINEITEM.L_SUPPKEY)  (cost=0.873 rows=1) (actual time=0.131..0.131 rows=1 loops=4838)
                -> Single-row index lookup on n2 using PRIMARY (N_NATIONKEY=SUPPLIER.S_NATIONKEY)  (cost=0.25 rows=1) (actual time=0.00175..0.00179 rows=1 loops=4838)

0.03s user 0.02s system 0% cpu 44.423 total

Both execution time and cost number are reduced by around 10%.
The only difference between two query plans is the joining order of table SUPPLIER and PART. I am wondering whether we can optimize the code to enable the more efficient joining order in default.

Query 8:
EXPLAIN ANALYZE
select
    o_year,
    sum(
        case
            when nation = 'INDIA' then volume
            else 0
        end
    ) / sum(volume) as mkt_share
from
    (
        select
            extract(
                year
                from
                    o_orderdate
            ) as o_year,
            l_extendedprice * (1 - l_discount) as volume,
            n2.n_name as nation
        from
            PART,
            SUPPLIER,
            LINEITEM,
            ORDERS,
            CUSTOMER,
            NATION n1,
            NATION n2,
            REGION
        where
            p_partkey = l_partkey
            and s_suppkey = l_suppkey
            and l_orderkey = o_orderkey
            and o_custkey = c_custkey
            and c_nationkey = n1.n_nationkey
            and n1.n_regionkey = r_regionkey
            and r_name = 'ASIA'
            and s_nationkey = n2.n_nationkey
            and o_orderdate between date '1995-01-01'
            and date '1996-12-31'
            and p_type = 'SMALL PLATED COPPER'
    ) as all_nations
group by
    o_year
order by
    o_year;

How to repeat:
TPC-H data scale: 2GB
The data is exactly the same as the one used in https://bugs.mysql.com/bug.php?id=116309

Compile MySQL two versions, one is original and the other is with the changed IF condition. I attach the patch file to remove the code for your reference.

Patch:
diff --git a/sql/sql_planner.cc b/sql/sql_planner.cc
index ad0c639abca..c415ad48b71 100644
--- a/sql/sql_planner.cc
+++ b/sql/sql_planner.cc
@@ -1867,7 +1867,6 @@ bool Join_tab_compare_default::operator()(const JOIN_TAB *jt1,
   if (jt1_keydep_jt2 && !jt2_keydep_jt1) return false;
   if (jt2_keydep_jt1 && !jt1_keydep_jt2) return true;
 
-  if (jt1->found_records > jt2->found_records) return false;
   if (jt1->found_records < jt2->found_records) return true;
 
   return jt1 < jt2;

I observed that the number of cost and execution time fluctuate across executions, so I typically run the query multiple times to get a stable result.
[31 Oct 2024 14:53] MySQL Verification Team
Hello Jinsheng Ba,

Thank you for the report and feedback.
I'm not sure how much this patch fixes the issue but observed the behaviour and hope that development team would take a look at this and take a call further on the suggested patch.  I'll be joining the verification results file shortly. Thank you.

regards,
Umesh