MySQL Bugs: #41740: MySQL 5.1 optimizer produces inefficient plan for complex queries

Bug #41740	MySQL 5.1 optimizer produces inefficient plan for complex queries
Submitted:	25 Dec 2008 8:51	Modified:	16 Aug 2011 1:21
Reporter:	Valeriy Kravchuk	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S5 (Performance)
Version:	5.1.25,5.1.30	OS:	Any
Assigned to:	Ole John Aske	CPU Architecture:	Any
Tags:	regression

Description:
For complex queries like this:

SELECT path_idx, path_idx, margin, length, path_delay, skew, paths.type_code,
CONCAT_WS(' ',ROUND(seg1.delay,3),CONCAT('A',CASE seg1.event_reference_clock_trans WHEN 'H2L' THEN 'F' WHEN 'Z2L' THEN 'F' WHEN 'L2H' THEN 'R' WHEN 'Z2H' THEN 'R' ELSE seg1.event_reference_clock_trans END),clk1.name),
CONCAT_WS(' ',ROUND(seg2.delay,3),CONCAT('A',CASE seg2.event_reference_clock_trans WHEN 'H2L' THEN 'F' WHEN 'Z2L' THEN 'F' WHEN 'L2H' THEN 'R' WHEN 'Z2H' THEN 'R' ELSE seg2.event_reference_clock_trans END),clk2.name),
CONCAT_WS(' ',ROUND(check_value - clk_delay,3),CONCAT('B',CASE clk_tran WHEN 'H2L' THEN 'F' WHEN 'Z2L' THEN 'F' WHEN 'L2H' THEN 'R' WHEN 'Z2H' THEN 'R' ELSE clk_tran END),clk3.name),
CASE IFNULL(last_clocked_pin_trans,start_pin_trans) WHEN 'H2L' THEN 'F' WHEN 'Z2L' THEN 'F' WHEN 'L2H' THEN 'R' WHEN 'Z2H' THEN 'R' ELSE IFNULL(last_clocked_pin_trans,start_pin_trans) END,
p1.name,
CASE end_pin_trans WHEN 'H2L' THEN 'F' WHEN 'Z2L' THEN 'F' WHEN 'L2H' THEN 'R' WHEN 'Z2H' THEN 'R' ELSE end_pin_trans END,
p2.name, IFNULL(n3.name,'evr'), IFNULL(t1.name,'evr'), IFNULL(t2.name,'evr')
FROM paths, pins as p1, pins as p2, nets as n1, nets as n2, 
path_segments as seg1, path_segments as seg2, clocks as clk1, clocks as clk2,
checks, clocks as clk3, timing_nodes, nets as n3, cells as c1,
cells as c2, cell_templates as t1, cell_templates as t2 
WHERE paths.mode='l' 
AND path_idx IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23) AND IFNULL(last_clocked_pin_idx,start_pin_idx)=p1.pin_idx 
AND end_pin_idx=p2.pin_idx 
AND IFNULL(last_clocked_pin_net_idx,start_pin_net_idx)=n1.net_idx 
AND end_pin_net_idx=n2.net_idx 
AND IFNULL(last_clocked_seg_idx,start_seg_idx)=seg1.seg_idx 
AND end_seg_idx=seg2.seg_idx 
AND paths.check_idx=checks.check_idx 
AND seg1.event_reference_clock_idx=clk1.clock_idx 
AND seg2.event_reference_clock_idx=clk2.clock_idx 
AND clk_idx=clk3.clock_idx 
AND ref_node_idx=timing_nodes.node_idx 
AND timing_nodes.net_idx=n3.net_idx 
AND p1.owner_cell_idx=c1.cell_idx 
AND c1.owner_lib_temp_idx=t1.template_idx 
AND p2.owner_cell_idx=c2.cell_idx 
AND c2.owner_lib_temp_idx=t2.template_idx 
ORDER BY (path_idx)

optimizer of MySQL 5.1 produce an inefficient execution plan:

1 SIMPLE clk1 ALL PRIMARY {} {} {} 2 {Using temporary; Using filesort}
1 SIMPLE clk2 ALL PRIMARY {} {} {} 2 {Using join buffer}
1 SIMPLE clk3 ALL PRIMARY {} {} {} 2 {Using join buffer}
1 SIMPLE paths ALL PRIMARY,check_idx,end_seg_idx,end_pin_idx,end_pin_net_idx {} {} {} 23 {Using where; Using join buffer}
1 SIMPLE n1 eq_ref PRIMARY PRIMARY 4 func 1 {Using where; Using index}
1 SIMPLE n2 eq_ref PRIMARY PRIMARY 4 timing_db.paths.end_pin_net_idx 1 {Using index}
1 SIMPLE checks eq_ref PRIMARY,clk_idx,FK_CHECKS_REF_NODE_IDX PRIMARY 4 timing_db.paths.check_idx 1 {Using where}
1 SIMPLE timing_nodes eq_ref PRIMARY,net_idx PRIMARY 4 timing_db.checks.ref_node_idx 1 {}
1 SIMPLE n3 eq_ref PRIMARY PRIMARY 4 timing_db.timing_nodes.net_idx 1 {}
1 SIMPLE p1 eq_ref PRIMARY,FK_PINS_OWNER_CELL_IDX PRIMARY 4 func 1 {Using where}
1 SIMPLE c1 eq_ref PRIMARY PRIMARY 4 timing_db.p1.owner_cell_idx 1 {}
1 SIMPLE t1 eq_ref PRIMARY PRIMARY 8 timing_db.c1.owner_lib_temp_idx 1 {Using where}
1 SIMPLE p2 eq_ref PRIMARY,FK_PINS_OWNER_CELL_IDX PRIMARY 4 timing_db.paths.end_pin_idx 1 {}
1 SIMPLE c2 eq_ref PRIMARY PRIMARY 4 timing_db.p2.owner_cell_idx 1 {}
1 SIMPLE t2 eq_ref PRIMARY PRIMARY 8 timing_db.c2.owner_lib_temp_idx 1 {Using where}
1 SIMPLE seg1 eq_ref PRIMARY,event_reference_clock_idx PRIMARY 4 func 1 {Using where}
1 SIMPLE seg2 eq_ref PRIMARY,event_reference_clock_idx PRIMARY 4 timing_db.paths.end_seg_idx 1 {Using where}

The problem is not only in accessing entire paths table 8 times instead of one. Query is executed dozens times slower than on 5.0.x with the following plan: 

1 SIMPLE paths ALL PRIMARY,check_idx,end_seg_idx,end_pin_idx,end_pin_net_idx {} {} {} 23 {Using where; Using filesort}
1 SIMPLE n1 eq_ref PRIMARY PRIMARY 4 func 1 {Using where; Using index}
1 SIMPLE n2 eq_ref PRIMARY PRIMARY 4 timing_db.paths.end_pin_net_idx 1 {Using index}
1 SIMPLE checks eq_ref PRIMARY,clk_idx,FK_CHECKS_REF_NODE_IDX PRIMARY 4 timing_db.paths.check_idx 1 {}
1 SIMPLE clk3 eq_ref PRIMARY PRIMARY 4 timing_db.checks.clk_idx 1 {}
1 SIMPLE timing_nodes eq_ref PRIMARY,net_idx PRIMARY 4 timing_db.checks.ref_node_idx 1 {}
1 SIMPLE n3 eq_ref PRIMARY PRIMARY 4 timing_db.timing_nodes.net_idx 1 {}
1 SIMPLE p1 eq_ref PRIMARY,FK_PINS_OWNER_CELL_IDX PRIMARY 4 func 1 {Using where}
1 SIMPLE c1 eq_ref PRIMARY PRIMARY 4 timing_db.p1.owner_cell_idx 1 {}
1 SIMPLE t1 eq_ref PRIMARY PRIMARY 8 timing_db.c1.owner_lib_temp_idx 1 {Using where}
1 SIMPLE p2 eq_ref PRIMARY,FK_PINS_OWNER_CELL_IDX PRIMARY 4 timing_db.paths.end_pin_idx 1 {}
1 SIMPLE c2 eq_ref PRIMARY PRIMARY 4 timing_db.p2.owner_cell_idx 1 {}
1 SIMPLE t2 eq_ref PRIMARY PRIMARY 8 timing_db.c2.owner_lib_temp_idx 1 {Using where}
1 SIMPLE seg1 eq_ref PRIMARY,event_reference_clock_idx PRIMARY 4 func 1 {Using where}
1 SIMPLE clk1 eq_ref PRIMARY PRIMARY 4 timing_db.seg1.event_reference_clock_idx 1 {}
1 SIMPLE seg2 eq_ref PRIMARY,event_reference_clock_idx PRIMARY 4 timing_db.paths.end_seg_idx 1 {}
1 SIMPLE clk2 eq_ref PRIMARY PRIMARY 4 timing_db.seg2.event_reference_clock_idx 1 {}

(6 sec vs. 0.13, for example), and most of that time is spent on optimization (as EXPLAIN alone works for 6 seconds). This is a serious performance degradation that prevents upgrade.

All tables are MyISAM. ANALYZE/OPTIMIZE TABLE for all of them does not help. Query returns only 23 rows.

The only way to force proper execution plan is to force exact join order with STRAIGHT_JOINs.

How to repeat:
Run the query on tables provided (see private comment).

Suggested fix:
Make optimizer in 5.1 to work in the same way as in 5.0. Actually, there is a simple workaround:

set session optimizer_search_depth=0;

With this setting exactly the same plan as in 5.0 is produced almost instantly, and the query is executed really fast. So, maybe 0 should be used by default, not 62.

Extracted and simplified test case

Attachment: 41740.test (application/octet-stream, text), 12.54 KiB.

Analysis of the problem:

This bug is a regression of bug #14940.

As a result of this regression EXPLAIN command/JOIN::optimize phase is few times slower than before. This is probably ok because of deeper query analysis, however it chooses an execution plan with a worse cost, for example:

before the regression: Last_query_cost 149.303370
after the regression:  Last_query_cost 528.292652

Fortunately a workaround exists:  

set session optimizer_search_depth=0;

Hi!

It's very strange that we do deeper query analysis (=consider a proper superset QEPs we used to consider before)  but end up choosing a plan with a worse cost! We need to find this out.

Gleb, could you please investigate further, until it's clear what's the problem?

Oops, missed the "This bug is a regression of bug #14940" part.

Bug #58225 is probably a duplicate of this one.

With 3 customers affected this is more like I2. Also, as

set session optimizer_search_depth=0;

is a partial workaround, please, consider making this (not 62) a default value (for 5.5 maybe). I'd ask for new E/R values for this change.

The proposed (one-liner) fix for Bug#59326 'Greedy optimizer produce stupid query execution plans.' will fix this problem.

Guess I have to explain a bit more here:

The bug will be 'fixed' in the sense that an optimim
execution plan is produced even with 
'set session optimizer_search_depth=62'.
However, producing this QEP still takes a long 
time (~9s w/ fix, compared with ~11s wo fix)

This was tested with the 'Extracted and simplified 
case, from '25 Feb 2009 16:14'

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130112

3552 Ole John Aske	2011-02-01
      Another addendum fix related to bug#59326, and particularly 
      its relatives/duplicates bug#41740 and bug#58225.
      
      Fixes the problem that complex queries containing a lot of EQ_REF'ed 
      joined tables takes 'forever' to produce a query plan.
      
      
      The fix depends on the fact that when a table is
      joined by an unique key there is a 1::1 relation
      between the rows being joined. Assuming we have multiple
      such 1::1 (star-)joined relations in a sequence, 
      without other join types inbetween. Then all of 
      these 'eq_ref-joins' will be calculated to return the excact 
      same #rows and having identical 'cost' (or 'read_time').
      
      This leads to that we can append such a contigous sequence
      of eq_ref-joins to a partial plan in any order without 
      affecting the total cost of the query plan. Exploring the
      different permutations of these eq_refs in the 'greedy' 
      optimizations will simply be a waste of precious CPU cycles.
      
      This chain of eq_refs can further be handled as a single
      entity wrt. the full 'greedy' exploration of the possible
      join plans. This will reduce the 'N' in the O(N!) complexity
      of the full greedy search.
      
      The function eq_ref_extension_by_limited_search() has been 
      implemented doing as described above. Once an EQ_REF joined 
      table is encountered, it is invoked to append any remaining 
      EQ_REFs to the join plan. This chain of EQ_REFs will then 
      act like a single entity in the further exploration of possible
      join plans.
      
      It should be noted that best_extension_by_limited_search()
      will still explore the other plans starting with non-EQ_REF 
      operations as possibly better alternatives to the EQ_REF-extended 
      query plan.
      
      Usage of eq_ref_extension_by_limited_search() is 
      controlled by 'optimizer_prune_level=1'. 
      
      This fix also introduce the function plan_is_complete() which
      contains common functionality between best_extension_by_limited_search()
      and eq_ref_extension_by_limited_search()
      
      greedy_optimizer.test has been extensively extended with new testcases 
      containing lots of EQ_REF's.
      
      No existing test/query plans are expected to change as result of this fix.
      It should only produce the 'best' plan (a lot!) faster. - Like the 
      testcase for bug#58225 which now completes the QEP in ~0.05s in my
      DEBUG compiled sandbox (With 'optimizer_search_depth= 62'). This
      previously took  63min+42.437s :-o

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130114

3553 Ole John Aske	2011-02-01
      Last addendum fix related to bug#59326, and particularly
      its relatives/duplicates bug#41740 and bug#58225. 
      
      If a filesort is required by the final result, we can include
      the cost for the filesort in the pruning of 'bad' plans, and
      possible prune the plan at an earlier stage than today.
      
      NOTE1: The 'sort_cost' is calculated with logic similar
             to the one in plan_is_complete() which adds the sort_cost
             to the best plan.
      
      NOTE2: The 'sort_cost' is calculated as '= current_record_count'.
             This is not entirely correct when used in the pruning logic
             as we do not yet #rows in the *final* result which is what 
             we have to sort. However, it should be a safe assumption that
             'final #rows' >= 'intermediat #rows'. So this prune-calculation
             should always be on the safe side.
      
      NOTE3: Calculating 'sort_cost= current_record_count' is a bad
             estimate as most sort algorithms are O(n*log(n)).
      
      This fix reduce the query compile time for the infamous bug#58225 problem
      further from 50ms to ~20ms

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130125

3584 Ole John Aske	2011-02-01
      Collection of all fixes for bug#59326 and its relatives/duplicates bug#41740 and bug#58225.
      
      Commited to the branch 'mysql-trunk-opt-bug59326' on request from QA.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130178

3587 Ole John Aske	2011-02-02
      Collection of all fixes for bug#59326 and its relatives/duplicates bug#41740 and bug#58225.
      
      Commited to the branch 'mysql-trunk-opt-bug59326' on request from QA.

Noted in 5.6.3 changelog. 

For queries with many eq_ref joins, the optimizer took excessive time 
to develop an execution plan.