Bug #30363 TPC-D complex query slower by 2.6x to 6x times in 5.2 then in 5.1
Submitted: 10 Aug 2007 17:58 Modified: 20 Nov 2010 18:10
Reporter: Omer Barnir (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:5.2.4 OS:Any
Assigned to: Sergey Petrunya CPU Architecture:Any

[10 Aug 2007 17:58] Omer Barnir
Description:
While running comparison runs of TPC-D tests against 5.0.46, 5.1.20 and 5.2.4, the following query 
  SELECT A.YEAR, A.VOLUME/B.VOLUME AS MKT_SHARE 
  FROM SUM_NATION1 A, SUM_REGION1 B 
  WHERE A.YEAR = B.YEAR ORDER BY A.YEAR;

where the above views are defined as:

CREATE VIEW `ALL_NATIONS1` AS select year(`dss_order`.`o_orderdate`) AS `YEAR`,
(`dss_lineitem`.`l_extendedprice` * (1 - `dss_lineitem`.`l_discount`)) AS `VOLUME`,
`N2`.`n_name` AS `NATION` from (((((((`dss_part` join `dss_supplier`) join `dss_lineitem`) join `dss_order`) join `dss_customer`)
 join `dss_nation` `N1`) join `dss_nation` `N2`) join `dss_region`) where ((`dss_part`.`p_partkey` = `dss_lineitem`.`l_partkey`) 
and (`dss_supplier`.`s_suppkey` = `dss_lineitem`.`l_suppkey`) and (`dss_lineitem`.`l_orderkey` = `dss_order`.`o_orderkey`) 
and (`dss_order`.`o_custkey` = `dss_customer`.`c_custkey`) and (`dss_customer`.`c_nationkey` = `N1`.`n_nationkey`) 
and (`N1`.`n_regionkey` = `dss_region`.`r_regionkey`) and (`dss_region`.`r_name` = _latin1'AMERICA') 
and (`dss_supplier`.`s_nationkey` = `N2`.`n_nationkey`) and (`dss_order`.`o_orderdate` between _latin1'1995-01-01' and _latin1'1996-12-31') 
and (`dss_part`.`p_type` = _latin1'ECONOMY ANODIZED STEEL'));

CREATE VIEW `SUM_NATION1` AS select `ALL_NATIONS1`.`YEAR` AS `YEAR`,sum(`ALL_NATIONS1`.`VOLUME`) AS `VOLUME` 
from `ALL_NATIONS1` where (`ALL_NATIONS1`.`NATION` = _latin1'BRAZIL') group by `ALL_NATIONS1`.`YEAR`; 

CREATE VIEW `SUM_REGION1` AS select `ALL_NATIONS1`.`YEAR` AS `YEAR`,sum(`ALL_NATIONS1`.`VOLUME`) AS `VOLUME` 
from `ALL_NATIONS1` group by `ALL_NATIONS1`.`YEAR`;

was 6.0 times slower in 5.2.4 vs. 5.0.46 and 5.1.20 when running as part of the Quest Benchmark tool (client and server on different 'desktop' machines using ODBC connection).

When isolating the query and running it using the 'mysql' client against a local server using an 8-way CPU machine with 16MB of RAM, the query was 'only' 2.6 times slower returning (3 runs in each case):

In 5.0: (5.0.46)
================
+------+--------------------+
| YEAR | MKT_SHARE          |
+------+--------------------+
| 1995 | 0.0712700707258018 |
| 1996 | 0.0612846196646767 |
+------+--------------------+
2 rows in set (1 min 33.80 sec)
2 rows in set (1 min 33.78 sec)
2 rows in set (1 min 33.76 sec)

In 5.1: (5.1.20)
================
+------+--------------------+
| YEAR | MKT_SHARE          |
+------+--------------------+
| 1995 | 0.0712700707258018 |
| 1996 | 0.0612846196646767 |
+------+--------------------+
2 rows in set (1 min 34.77 sec)
2 rows in set (1 min 35.02 sec)
2 rows in set (1 min 35.03 sec)

In 5.2 (5.2.4)
==============
+------+--------------------+
| YEAR | MKT_SHARE          |
+------+--------------------+
| 1995 | 0.0712700707258018 |
| 1996 | 0.0612846196646767 |
+------+--------------------+
2 rows in set (4 min 5.41 sec)
2 rows in set (4 min 7.43 sec)
2 rows in set (4 min 7.45 sec)

How to repeat:
For each of the server releases:
 - Start the server using perl ./mysql-test-run.pl --start-and-exit
 - Load the TPC-D database using
   mysql --user=root --port=9306 --protocol=tcp --max_allowed_packet=128M 
         < [dump_file](*)
 - start the mysql client
 - In the mysql client:
   = use test;
   = show tables;
     >> you should see a list of about 10 dss_xxx tables
   = create the above three views
   = run the above select query

(*) the dump file is located on /nfstmp1/qauser/tpc-d_db/tpcd_db.gz

Suggested fix:
Resolve the regression slow down
[10 Aug 2007 17:58] Omer Barnir
Quest benchmarks were run by Robin Schumacher <rschumacher@mysql.com>.
If you have any questions please let Robin or me know
[24 Aug 2007 13:54] Martin Hansson
This is indeed an optimizer bug, all that extra time is spent compiling the query. The plans are indentical except that 5.1 uses the join buffer in 9 places, while 5.2 never uses it only 3 times.
[15 Sep 2007 15:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34315

ChangeSet@1.2599, 2007-09-15 19:05:18+04:00, sergefp@mysql.com +3 -0
  BUG#30363: complex query slower by 2.6x to 6x times in 5.2 then in 5.1
  The bugfix is a fix for the incorrect merge:
   - Make make_join_orderinfo() determine the last table that can use the 
     join buffering in exactly the same way as it is done in 
     make_join_readinfo() in current 5.1
   - Remove now unneeded code from make_join_readinfo()
[15 Dec 2007 8:30] Sergey Petrunya
The fix is available in 6.0.4 (no idea why push trigger didn't change the bug status)
[25 Jan 2008 19:20] Paul DuBois
Noted in 6.0.4 changelog.

Compared to MySQL 5.1, the 6.0 optimizer failed to use join buffering
for certain queries, resulting in slower performance for those queries.
[16 Aug 2010 6:37] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:13] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[20 Nov 2010 18:10] Paul DuBois
Noted in 5.6.1 changelog.