MySQL Bugs: #59326: Greedy optimizer produce stupid query execution plans.

Bug #59326	Greedy optimizer produce stupid query execution plans.
Submitted:	6 Jan 2011 10:10	Modified:	30 Jun 2011 15:02
Reporter:	Ole John Aske	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	5.6.99, 5.6.2, 5.5, 5.1 ->	OS:	Any
Assigned to:	Ole John Aske	CPU Architecture:	Any

Description:
The 'greedy' query plan optimizer fails to consider the size of intermediate query result when calculating the 'cost' of a query. This may result in slowly executing queries when there are much faster QEP's available.

When there are multiple JOIN orders available, it would normally be optimal to join the tables producing the smallest intermediate results first. This means that any EQ_REF should be joined before REF's, which in turn should be joined prior to full table scans (ALL).

The greedy optimizer fails to follow this simple optimization rule. It looks like the textual order of the tables in the query takes precedence over proper cost calculations.

How to repeat:
CREATE TABLE t10(
  K INT NOT NULL AUTO_INCREMENT,
  I INT,
  PRIMARY KEY(K)
);
INSERT INTO t10(I) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(0);

CREATE TABLE t100 LIKE t10;
INSERT INTO t100(I)
SELECT X.I FROM t10 AS X,t10 AS Y;

CREATE TABLE t10000 LIKE t10;
INSERT INTO t10000(I)
SELECT X.I FROM t100 AS X, t100 AS Y;

Q1: (optimal QEP)
=================
EXPLAIN SELECT COUNT(*) FROM t10,t10000 X,t10000 Y
WHERE X.K=t10.I
  AND Y.I=t10.I;
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-------------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref        | rows | Extra                                                 |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-------------------------------------------------------+
|  1 | SIMPLE      | t10   | ALL    | NULL          | NULL    | NULL    | NULL       |   10 | Using where                                           |
|  1 | SIMPLE      | X     | eq_ref | PRIMARY       | PRIMARY | 4       | test.t10.I |    1 | Using index                                           |
|  1 | SIMPLE      | Y     | ALL    | NULL          | NULL    | NULL    | NULL       | 9961 | Using where; Using join buffer (BNL, regular buffers) |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-------------------------------------------------------+
3 rows in set (0.00 sec)

We then swap the textual order of table X,Y, and would expect the
EQ_REF'ed 'X' still tp be joined prior to Y. This does not
happen and we get a really poor QEP:

Q2:
=====
EXPLAIN SELECT COUNT(*) FROM t10,t10000 Y,t10000 X
WHERE X.K=t10.I
  AND Y.I=t10.I;
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-------------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref        | rows | Extra                                                 |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-------------------------------------------------------+
|  1 | SIMPLE      | t10   | ALL    | NULL          | NULL    | NULL    | NULL       |   10 | Using where                                           |
|  1 | SIMPLE      | Y     | ALL    | NULL          | NULL    | NULL    | NULL       | 9961 | Using where; Using join buffer (BNL, regular buffers) |
|  1 | SIMPLE      | X     | eq_ref | PRIMARY       | PRIMARY | 4       | test.t10.I |    1 | Using index                                           |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-------------------------------------------------------+
3 rows in set (0.00 sec)

A similar testcase are the queries:

EXPLAIN SELECT COUNT(*) FROM t10,t10000 X,t10000 Y
WHERE X.K=t10.I;

EXPLAIN SELECT COUNT(*) FROM t10,t10000 Y,t10000 X
WHERE X.K=t10.I;

Execution time for these queries are 0.08s / 0.91s

Verified with current mysql-trunk (valgrind build in virtual machine, to make difference in execution time more obvious :)

openxs@ubuntu:~/dbs/trunk$ bin/mysql --no-defaults -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.6.2-m5-valgrind-max-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE t10(
    ->   K INT NOT NULL AUTO_INCREMENT,
    ->   I INT,
    ->   PRIMARY KEY(K)
    -> );
Query OK, 0 rows affected (0.85 sec)

mysql> INSERT INTO t10(I) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(0);
Query OK, 10 rows affected (0.63 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> CREATE TABLE t100 LIKE t10;
Query OK, 0 rows affected (0.37 sec)

mysql> INSERT INTO t100(I)
    -> SELECT X.I FROM t10 AS X,t10 AS Y;
Query OK, 100 rows affected (1.18 sec)
Records: 100  Duplicates: 0  Warnings: 0

mysql> CREATE TABLE t10000 LIKE t10;
Query OK, 0 rows affected (0.27 sec)

mysql> INSERT INTO t10000(I)
    -> SELECT X.I FROM t100 AS X, t100 AS Y;
Query OK, 10000 rows affected (1 min 1.17 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql> EXPLAIN SELECT COUNT(*) FROM t10,t10000 X,t10000 Y
    -> WHERE X.K=t10.I
    ->   AND Y.I=t10.I;
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-----------------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref        | rows | Extra                                                     |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | t10   | ALL    | NULL          | NULL    | NULL    | NULL       |   10 | Using where                                               |
|  1 | SIMPLE      | X     | eq_ref | PRIMARY       | PRIMARY | 4       | test.t10.I |    1 | Using index                                               |
|  1 | SIMPLE      | Y     | ALL    | NULL          | NULL    | NULL    | NULL       | 8942 | Using where; Using join buffer (BNL, incremental buffers) |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-----------------------------------------------------------+
3 rows in set (1.84 sec)

mysql> EXPLAIN SELECT COUNT(*) FROM t10,t10000 Y,t10000 X
    -> WHERE X.K=t10.I
    ->   AND Y.I=t10.I;
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-----------------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref        | rows | Extra                                                     |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | t10   | ALL    | NULL          | NULL    | NULL    | NULL       |   10 | Using where                                               |
|  1 | SIMPLE      | Y     | ALL    | NULL          | NULL    | NULL    | NULL       | 8942 | Using where; Using join buffer (BNL, incremental buffers) |
|  1 | SIMPLE      | X     | eq_ref | PRIMARY       | PRIMARY | 4       | test.t10.I |    1 | Using index                                               |
+----+-------------+-------+--------+---------------+---------+---------+------------+------+-----------------------------------------------------------+
3 rows in set (0.06 sec)

mysql> SELECT COUNT(*) FROM t10,t10000 X,t10000 Y WHERE X.K=t10.I   AND Y.I=t10.I;
+----------+
| COUNT(*) |
+----------+
|     9000 |
+----------+
1 row in set (14.77 sec)

mysql> SELECT COUNT(*) FROM t10,t10000 Y,t10000 X WHERE X.K=t10.I   AND Y.I=t10.I;
+----------+
| COUNT(*) |
+----------+
|     9000 |
+----------+
1 row in set (1 min 13.36 sec)

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/128070

3534 Ole John Aske	2011-01-06
      Fix for bug#59326: Greedy optimizer produce stupid query execution plans.
      
      The 'greedy' query plan optimizer fails to consider the size of
      intermediate query result when calculating the accumulated
      'cost' of a query. This may result in slowly executing queries
      when there are much faster QEP's available.
      
      When there are multiple JOIN orders available, it would normally be
      optimal to join the tables producing the smallest intermediate results
      first. This means that any EQ_REF should be joined before REF's, which
      in turn should be joined prior to full table scans (ALL).
      
      This fix add the 'CPU-cost' of processing 'current_record_count' records at
      each level to 'current_read_time' *before* it is used as 'accumulated cost' 
      argument to recursive best_extension_by_limited_search() calls. This ensures 
      that the cost of a huge join-fanout early in the QEP is correctly reflected 
      in the cost of the final QEP.
      
      Several new MTR tests has been added to greedy_optimizer.test to ensure that
      QEP plans joins the tables in the order EQ_REF, REF, ALL whenever possible.
      
      Also updated result files for several test where the QEP or 'query_cost'
      changes due to this fix. The changed query plans has been verified to be an
      improvement over the original QEPs.
     @ mysql-test/r/greedy_optimizer.result
        Several QEP changed (improved) due to changes in greedy optimizer.
        Generally EQ_REF and REF's are now executed prior to table scans(ALL)
        whenever possible.
        
        Added result for new testcases
     @ mysql-test/r/select.result
        Changed order of result rows due to changed QEP
     @ mysql-test/r/subselect.result
        Changed (improved) QEP.

This bug has a *huge* performance impact on Cluster, and likely any other storage engine which need RPC calls to access the storage engines.

Taking the following testcase as an example:

EXPLAIN SELECT COUNT(*) FROM t10,t10000 X,t10000 Y
WHERE X.K=t10.I;
EXPLAIN SELECT COUNT(*) FROM t10,t10000 Y,t10000 X
WHERE X.K=t10.I;

These queries executes in 0.04s / 7.49s depending on whether 
the EQ_REF(X) is done before or after the table scan on 'Y'.

I ran these queries on optimizer team tree, where it is possible to switch join buffering off altogether. I get an interesting result for one of the queries, where using join buffering increases execution time significantly (0.92 seconds with join buffering, 0.23 seconds without):

set optimizer_join_cache_level=1;

SELECT COUNT(*) FROM t10,t10000 X,t10000 Y WHERE X.K=t10.I AND Y.I=t10.I;

1 row in set (0.03 sec)

SELECT COUNT(*) FROM t10,t10000 Y,t10000 X WHERE X.K=t10.I AND Y.I=t10.I;

1 row in set (0.03 sec)

SELECT COUNT(*) FROM t10,t10000 X,t10000 Y WHERE X.K=t10.I;

1 row in set (0.05 sec)

SELECT COUNT(*) FROM t10,t10000 Y,t10000 X WHERE X.K=t10.I;

1 row in set (0.92 sec)

set optimizer_join_cache_level=0;

SELECT COUNT(*) FROM t10,t10000 X,t10000 Y WHERE X.K=t10.I AND Y.I=t10.I;

1 row in set (0.10 sec)

SELECT COUNT(*) FROM t10,t10000 Y,t10000 X WHERE X.K=t10.I AND Y.I=t10.I;

1 row in set (0.12 sec)

SELECT COUNT(*) FROM t10,t10000 X,t10000 Y WHERE X.K=t10.I;

1 row in set (0.13 sec)

SELECT COUNT(*) FROM t10,t10000 Y,t10000 X WHERE X.K=t10.I;

1 row in set (0.23 sec)

The proposed fix for this bug also fix bug#41740 MySQL 5.1 optimizer produces inefficient plan for complex queries.

Also fix, or greatly improves, bug#58225: 'Extreme performance degradation with joins'

Amazing, there seems to be another serious bug in how the best execution plan is selected. Using the testcase from bug#58225, and inspecting the reported
Last_query_cost (show status where Variable_name like '%cost%';) for different settings of 'optimizer_search_depth' we observe:

optimizer_search_depth    Last_query_cost w/ fix     0                              226.39
1                             7037.99
2                           371280.99
3                            32389.59
5                              505.99
8                               99.19
10                             103.99
12                             108.79
15                          <took to long>

For small numbers of 'optimizer_search_depth' the cost diverges when we
do a deeper search!

Looking at the code where the 'best' plan is aggregated:
.......

      { /*
          'join' is either the best partial QEP with 'search_depth' relations,
          or the best complete QEP so far, whichever is smaller.
        */
        ... Cut ..
>>>>    if ((search_depth == 1) || (current_read_time < join->best_read))
        {
          memcpy((uchar*) join->best_positions, (uchar*) join->positions,
                 sizeof(POSITION) * (idx + 1));
          join->best_read= current_read_time - 0.001;
        }

best_read is not only updated when 'current_read_time < join->best_read' but also every time we reach the max 'search_depth'! - EVEN OF THE PLAN AT THIS MOMENT MAY HAVE A HIGHER COST THAN 'best'!

Fixed it with this patch:
@@ -5401,7 +5457,7 @@
             join->positions[join->const_tables].table->table)
           /* We have to make a temp table */
           current_read_time+= current_record_count;
-        if ((search_depth == 1) || (current_read_time < join->best_read))
+        if (current_read_time < join->best_read)
         {
           memcpy((uchar*) join->best_positions, (uchar*) join->positions,
                  sizeof(POSITION) * (idx + 1));

Then rerun the test above:

optimizer_search_depth    Last_query_cost w/ fix      wo/fix    0                              84.79                  226.39
1                              78.79                 7037.99
2                              88.59               371280.99
3                              82.39                32389.59
5                              83.59                  505.99
8                              85.39                   99.19
10                             86.59                  103.99
12                             87.79                  108.79
15                             89.59                <aborted>

Hmm. interesting, a huge improvement in behaviour !

I assume that the reason for the 'cost' not being strictly converging is the cost pruning logic which may exclude some execution plans.

I also noticed a huge reduction in optimization time for the query with this fix. Taking 'search_depth=12' as a case the time was reduced from 31.11s to 1.88s.
The reason for this improvement is that we are able to cost prune more query plans when we have a steady & low 'best_cost'

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/129557

3548 Ole John Aske	2011-01-25
      Extend greedy_optimizer.test creating a new baseline before commiting fix for bug#59326.
      
      Test has been enhanced to both explain the optimized query plan *and*
      execute it and report total number of 'Handler_reads%' during
      execution.
      
      Handler_reads is believed to be a fairly good metric on how optimal the
      query plan was.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/129562

3549 Ole John Aske	2011-01-25
      Fix for bug#59326: 'Greedy optimizer produce stupid query execution plans.'
      
      Fixed several defects in the greedy optimization:
      
      1) The greedy optimizer calculate the 'compare-cost' (CPU-cost)
         for iterating over the partial plan result at each level in
         the query plan as 'record_count / (double) TIME_FOR_COMPARE'
      
         This cost was only used localy for 'best' calculation at each
         level, and *not* accumulated into the total cost for the query plan.
      
         This fix add the 'CPU-cost' of processing 'current_record_count' records at
         each level to 'current_read_time' *before* it is used as 'accumulated cost' 
         argument to recursive best_extension_by_limited_search() calls. This ensures 
         that the cost of a huge join-fanout early in the QEP is correctly reflected 
         in the cost of the final QEP.
      
         To get identical cost for a 'best' optimized query and a straight_join with the
         same join order, the same change was also applied to optimize_straight_join()
         and get_partial_join_cost()
      
      2) Furthermore to get equal cost for 'best' optimized query and a straight_join
         we had to subtract the same '0.001' in optimize_straight_join() as we already
         do in best_extension_by_limited_search()
      
      3) When best_extension_by_limited_search() aggregated the 'best' plan a plan
         was 'best' by the check :
      
           'if ((search_depth == 1) || (current_read_time < join->best_read))'
      
         The term '(search_depth == 1' incorrectly caused a new best plan to be 
         collected whenever we reached the specified 'search_depth' - Even if this
         partial query plan was more expensive than what we had already found.
         ... See further comment and pure numbers for this in bugreport.
     @ mysql-test/include/check_qep.inc
        Compare execution of this query against a known 'best' plan for this query
     @ mysql-test/include/expect_qep.inc
        Collect expected statistic for query - later used to compare exec. of queries expeted to behave identical
     @ mysql-test/r/greedy_optimizer.result
        Updated resultfile after greedy optimizer changes. Don't care about increased 'cost' as this
        is a relative number.
        
        However, several 'Total_handler_reads' has decreased, and none increased - Which should 
        be a fairly good indication of this fix improving the existing query plan.
        
        Also added a lots of new tests which will fail wo/ this fix.
     @ mysql-test/r/join.result
        Accepted udated 'cost'
     @ mysql-test/r/join_cache_jcl1.result
        Accepted new query plan which looks better as ALL is joined in later. (Decreased fanout)
     @ mysql-test/r/join_cache_jcl2.result
        Accepted new query plan which looks better as ALL is joined in later. (Decreased fanout)
     @ mysql-test/r/join_cache_jcl3.result
        Accepted new query plan which looks better as ALL is joined in later. (Decreased fanout)
     @ mysql-test/r/join_cache_jcl4.result
        Accepted new query plan which looks better as ALL is joined in later. (Decreased fanout)
     @ mysql-test/r/select_icp_mrr.result
        Same result, another order in resultset (No ORDER BY here)
     @ mysql-test/r/select_none.result
        Same result, another order in resultset (No ORDER BY here)
     @ mysql-test/r/status.result
        Accepted new cost
     @ mysql-test/r/subquery_sj_none.result
        ALL/REF -joining tables with most rows later improved the query plan (Decreased fanout).
        Same for doing REF after EQ_REF in the last query (decreased fanout)
     @ mysql-test/t/greedy_optimizer.test
        Added a lots of new tests which will fail wo/ this fix.

Note to reviewer / QA-test:
--------------------------

There will be more fixes related to this bug which are adendums to the one already commited. These upcoming fixes are mostly related to the problems 
reported as bug#41740 & bug#58225 but will be commited as part of this fix
as they partly overlaps with the fix already commited.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

http://lists.mysql.com/commits/130054

3550 Ole John Aske 2011-01-31
Addendum patch for bug#59326 'Greedy optimizer produce stupid query execution plans'

In order for the greedy optimizers 'prune' logic to quickly find a 'good'
execution plan, and prune the other less promissing plans, best_ref[]
is sorted by E(#rows) before we start calculating query plans.

However, due to how swap_variables() was used inside
best_extension_by_limited_search(), best_ref[] quickly
became 'scrambled', and was no longer sorted when
multiple partial plans had been evaluated.

This changeset maintains the order of the unevaluated part of best_ref[].
Besides reducing time to find the 'best' query plan, this also
reduces the risk for incorrectly pruning away the optimal query plan.

This fix also include 'TIME_FOR_COMPARE' as part of the 'cost' considdered
when less promissing query plans
- This 'compare cost' is included everywhere else in the best-calculations.
- Try to run the extended 'greedy_optimizer' test wo/ and it demonstrate
how it fails to prune 'bad' plans -> and likely tiemouts.
@ mysql-test/r/select_icp_mrr.result
Accepted new order of (unordered) result.
@ mysql-test/r/select_none.result
Accepted new order of (unordered) result.
@ mysql-test/t/greedy_optimizer.test
Extended greedy test with 'large' joins which previously took 'forever' to explain.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130055

3551 Ole John Aske	2011-01-31
      Refactoring cleanup related to bug#59326 and my previous set of fixes for this bug:
      
      - Added 'TIME_FOR_COMPARE' to current_read_time where 
        it is initially calculated - Removed later adding of 
        TIME_FOR_COMPARE whenever it is used.
      
      - Used local variable '*position' instead of 'join->positions + idx'
        a few places.
      
      - Replaced a few references to 'restore_prev_nj_state()' 
        with 'backout_nj_sj_state()' in comments (Has been renamed)
      
      This fix contains no (intentional) changes of logic.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130112

3552 Ole John Aske	2011-02-01
      Another addendum fix related to bug#59326, and particularly 
      its relatives/duplicates bug#41740 and bug#58225.
      
      Fixes the problem that complex queries containing a lot of EQ_REF'ed 
      joined tables takes 'forever' to produce a query plan.
      
      
      The fix depends on the fact that when a table is
      joined by an unique key there is a 1::1 relation
      between the rows being joined. Assuming we have multiple
      such 1::1 (star-)joined relations in a sequence, 
      without other join types inbetween. Then all of 
      these 'eq_ref-joins' will be calculated to return the excact 
      same #rows and having identical 'cost' (or 'read_time').
      
      This leads to that we can append such a contigous sequence
      of eq_ref-joins to a partial plan in any order without 
      affecting the total cost of the query plan. Exploring the
      different permutations of these eq_refs in the 'greedy' 
      optimizations will simply be a waste of precious CPU cycles.
      
      This chain of eq_refs can further be handled as a single
      entity wrt. the full 'greedy' exploration of the possible
      join plans. This will reduce the 'N' in the O(N!) complexity
      of the full greedy search.
      
      The function eq_ref_extension_by_limited_search() has been 
      implemented doing as described above. Once an EQ_REF joined 
      table is encountered, it is invoked to append any remaining 
      EQ_REFs to the join plan. This chain of EQ_REFs will then 
      act like a single entity in the further exploration of possible
      join plans.
      
      It should be noted that best_extension_by_limited_search()
      will still explore the other plans starting with non-EQ_REF 
      operations as possibly better alternatives to the EQ_REF-extended 
      query plan.
      
      Usage of eq_ref_extension_by_limited_search() is 
      controlled by 'optimizer_prune_level=1'. 
      
      This fix also introduce the function plan_is_complete() which
      contains common functionality between best_extension_by_limited_search()
      and eq_ref_extension_by_limited_search()
      
      greedy_optimizer.test has been extensively extended with new testcases 
      containing lots of EQ_REF's.
      
      No existing test/query plans are expected to change as result of this fix.
      It should only produce the 'best' plan (a lot!) faster. - Like the 
      testcase for bug#58225 which now completes the QEP in ~0.05s in my
      DEBUG compiled sandbox (With 'optimizer_search_depth= 62'). This
      previously took  63min+42.437s :-o

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130114

3553 Ole John Aske	2011-02-01
      Last addendum fix related to bug#59326, and particularly
      its relatives/duplicates bug#41740 and bug#58225. 
      
      If a filesort is required by the final result, we can include
      the cost for the filesort in the pruning of 'bad' plans, and
      possible prune the plan at an earlier stage than today.
      
      NOTE1: The 'sort_cost' is calculated with logic similar
             to the one in plan_is_complete() which adds the sort_cost
             to the best plan.
      
      NOTE2: The 'sort_cost' is calculated as '= current_record_count'.
             This is not entirely correct when used in the pruning logic
             as we do not yet #rows in the *final* result which is what 
             we have to sort. However, it should be a safe assumption that
             'final #rows' >= 'intermediat #rows'. So this prune-calculation
             should always be on the safe side.
      
      NOTE3: Calculating 'sort_cost= current_record_count' is a bad
             estimate as most sort algorithms are O(n*log(n)).
      
      This fix reduce the query compile time for the infamous bug#58225 problem
      further from 50ms to ~20ms

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130125

3584 Ole John Aske	2011-02-01
      Collection of all fixes for bug#59326 and its relatives/duplicates bug#41740 and bug#58225.
      
      Commited to the branch 'mysql-trunk-opt-bug59326' on request from QA.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130178

3587 Ole John Aske	2011-02-02
      Collection of all fixes for bug#59326 and its relatives/duplicates bug#41740 and bug#58225.
      
      Commited to the branch 'mysql-trunk-opt-bug59326' on request from QA.

Noted in 5.1.51-ndb-7.0.22, 5.6.2 changelogs.

The "greedy" query plan optimizer failed to consider the size of
intermediate query results when calculating the cost of a query. This
could result in slowly executing queries when there are much faster
execution plans available. 

CHANGESET - http://lists.mysql.com/commits/131915