Bug #26528 Optimizer produces inefficient execution plan on 5.0.x compared to 4.1.21
Submitted: 21 Feb 2007 15:02 Modified: 22 Feb 2007 16:21
Reporter: Valeriy Kravchuk Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.34, 5.0.27 OS:Windows (Windows)
Assigned to: Igor Babaev CPU Architecture:Any
Tags: regression

[21 Feb 2007 15:02] Valeriy Kravchuk
Description:
Some complex queries with derived tables works 3+ times slower on 5.0.27 compared to 4.1.21 (with the same default my.ini settings, same hardware and on same data). For example, this one:

select count(*) as n,im,rt,tc 
from 
  (select R.type_con as rt ,SE.imex as im, H.Lib_mvt, 
          R.type_con,R.contrat as tc, 
          TY.fr, H.Type_mvt, H.TypeP, H.TypeD, H.Pv,
          H.etat, H.date_mvt, H.Id_sspP,
          CONCAT(H.Type_mvt, H.TypeP, H.TypeD,H.Pv), 
          SE.imex , H.Id_conteneur, H.Id_sspD 
     from mft_historique_gilles as H 
          INNER JOIN 
          (mft_ref as R, mft_passeport as P, mft_sequence SE, mft_typemvt as TY)
          ON ((P.Mft_refP=R.mft_ref) 
              AND (H.Id_conteneur = P.Id_conteneur ) 
              AND Ty.type=SE.imex 
              AND CONCAT(H.Type_mvt, H.TypeP, H.TypeD, H.Pv) = SE.sequencec
             )
          left outer join
          (mft_refhisto as RE) 
          ON ((Re.id_mouvement=H.Id_mouvement)) 
     where (((H.numeroD="4499" ) OR ( Re.portD="FRLEH" )) AND SE.imex !="")
     group by H.id_conteneur) as bob 
group by im, rt 
order by rt,tc

runs 13+ seconds on 4.1.21 and 43+ seconds (with 100% CPU load) on 5.0.27 on the same data.

ANALYZE TABLE for all the tables involved does not change query plan.

How to repeat:
C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot test -P 3307
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 4.1.21-community-nt

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> use marfret
Database changed
mysql> select count(*) as n,im,rt,tc from (select R.type_con as rt ,SE.imex as im
...
    -> group by H.id_conteneur) as bob group by im,rt order by rt,tc
    -> ;
...
33 rows in set (13.72 sec)

mysql> explain select count(*) as n,im,rt,tc from (select R.type_con as rt, SE.imex as im
...
    -> group by H.id_conteneur) as bob group by im,rt order by rt,tc\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 464
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DERIVED
        table: H
         type: ALL
possible_keys: idcont,numeroD,index1,index2
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 35482
        Extra: Using temporary; Using filesort
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: P
         type: ref
possible_keys: idetat,mft_refP,ls_conteneur,Id,index7
          key: idetat
      key_len: 11
          ref: marfret.H.Id_conteneur
         rows: 1
        Extra:
*************************** 4. row ***************************
           id: 2
  select_type: DERIVED
        table: R
         type: eq_ref
possible_keys: mft_ref,ref_contrat,index10
          key: mft_ref
      key_len: 50
          ref: marfret.P.Mft_refP
         rows: 1
        Extra:
*************************** 5. row ***************************
           id: 2
  select_type: DERIVED
        table: TY
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 37
        Extra:
*************************** 6. row ***************************
           id: 2
  select_type: DERIVED
        table: RE
         type: ref
possible_keys: index1,idmvtportdtyperef,idmvt,stock
          key: index1
      key_len: 4
          ref: marfret.H.Id_mouvement
         rows: 1
        Extra: Using where; Using index
*************************** 7. row ***************************
           id: 2
  select_type: DERIVED
        table: SE
         type: ref
possible_keys: seqc,seqcp,imex,seqc_imex
          key: seqc_imex
      key_len: 266
          ref: func,marfret.TY.type
         rows: 2
        Extra: Using where; Using index
7 rows in set (8.13 sec)

Note the order of table access! Now, with the same data:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -P 3308
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 15
Server version: 5.0.27-community-nt MySQL Community Edition (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> use marfret;
Database changed
mysql> explain select count(*) as n,im,rt,tc from (select R.type_con as rt, SE.imex as im
...
    -> group by H.id_conteneur) as bob group by im,rt order by rt,tc\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 464
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DERIVED
        table: TY
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 37
        Extra: Using temporary; Using filesort
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: H
         type: ALL
possible_keys: idcont,numeroD,index1,index2
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 35199
        Extra:
*************************** 4. row ***************************
           id: 2
  select_type: DERIVED
        table: P
         type: ref
possible_keys: idetat,mft_refP,ls_conteneur,Id,index7
          key: idetat
      key_len: 13
          ref: marfret.H.Id_conteneur
         rows: 1
        Extra:
*************************** 5. row ***************************
           id: 2
  select_type: DERIVED
        table: R
         type: ref
possible_keys: mft_ref,ref_contrat,index10
          key: mft_ref
      key_len: 52
          ref: marfret.P.Mft_refP
         rows: 1
        Extra:
*************************** 6. row ***************************
           id: 2
  select_type: DERIVED
        table: RE
         type: ref
possible_keys: index1,idmvtportdtyperef,idmvt,stock
          key: index1
      key_len: 4
          ref: marfret.H.Id_mouvement
         rows: 1
        Extra: Using where; Using index
*************************** 7. row ***************************
           id: 2
  select_type: DERIVED
        table: SE
         type: ref
possible_keys: seqc,seqcp,imex,seqc_imex
          key: seqc_imex
      key_len: 270
          ref: func,marfret.TY.type
         rows: 2
        Extra: Using where; Using index
7 rows in set (45.91 sec)

Note different order of tables! And:

mysql> select count(*) as n,im,rt,tc from (select R.type_con as rt ,SE.imex as im
...
    -> group by H.id_conteneur) as bob group by im,rt order by rt,tc;
+-----+----+------+------+
...
+-----+----+------+------+
33 rows in set (43.63 sec)

Suggested fix:
Restore previous behaviour?
[22 Feb 2007 5:51] Igor Babaev
For the query in the subselect (derived table) 4.1 returns the following execution plan:
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type   | possible_keys                          | key       | key_len | ref                    | rows  | Extra                           |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
|  1 | SIMPLE      | H     | ALL    | idcont,numeroD,index1,index2           | NULL      |    NULL | NULL                   | 35111 | Using temporary; Using filesort |
|  1 | SIMPLE      | P     | ref    | idetat,mft_refP,ls_conteneur,Id,index7 | idetat    |      11 | marfret.H.Id_conteneur |     1 |                                 |
|  1 | SIMPLE      | R     | eq_ref | mft_ref,ref_contrat,index10            | mft_ref   |      50 | marfret.P.Mft_refP     |     1 |                                 |
|  1 | SIMPLE      | TY    | ALL    | NULL                                   | NULL      |    NULL | NULL                   |    37 |                                 |
|  1 | SIMPLE      | RE    | ref    | index1,idmvtportdtyperef,idmvt,stock   | index1    |       4 | marfret.H.Id_mouvement |     1 | Using where; Using index        |
|  1 | SIMPLE      | SE    | ref    | seqc,seqcp,imex,seqc_imex              | seqc_imex |     266 | func,marfret.TY.type   |     2 | Using where; Using index        |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+

The execution by this plan on my machine takes:
 (18.90 sec)

If we change the order of tables in the original query we get a different execution plan:

mysql> explain
    -> select R.type_con as rt ,SE.imex as im, H.Lib_mvt,
    ->           R.type_con,R.contrat as tc,
    ->           TY.fr, H.Type_mvt, H.TypeP, H.TypeD, H.Pv,
    ->           H.etat, H.date_mvt, H.Id_sspP,
    ->           CONCAT(H.Type_mvt, H.TypeP, H.TypeD,H.Pv),
    ->           SE.imex , H.Id_conteneur, H.Id_sspD
    ->      from
    ->           (mft_typemvt as TY, mft_ref as R, mft_passeport as P, mft_sequence SE)
    ->           INNER JOIN
    ->           mft_historique_gilles as H
    ->           ON ((P.Mft_refP=R.mft_ref)
    ->               AND (H.Id_conteneur = P.Id_conteneur )
    ->               AND TY.type=SE.imex
    ->               AND CONCAT(H.Type_mvt, H.TypeP, H.TypeD, H.Pv) = SE.sequencec
    ->              )
    ->           left outer join
    ->           (mft_refhisto as RE)
    ->           ON ((RE.id_mouvement=H.Id_mouvement))
    ->      where (((H.numeroD="4499" ) OR ( RE.portD="FRLEH" )) AND SE.imex !="")
    ->      group by H.id_conteneur;
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type   | possible_keys                          | key       | key_len | ref                    | rows  | Extra                           |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
|  1 | SIMPLE      | TY    | ALL    | NULL                                   | NULL      |    NULL | NULL                   |    37 | Using temporary; Using filesort |
|  1 | SIMPLE      | H     | ALL    | idcont,numeroD,index1,index2           | NULL      |    NULL | NULL                   | 35111 |                                 |
|  1 | SIMPLE      | P     | ref    | idetat,mft_refP,ls_conteneur,Id,index7 | idetat    |      11 | marfret.H.Id_conteneur |     1 |                                 |
|  1 | SIMPLE      | R     | eq_ref | mft_ref,ref_contrat,index10            | mft_ref   |      50 | marfret.P.Mft_refP     |     1 |                                 |
|  1 | SIMPLE      | RE    | ref    | index1,idmvtportdtyperef,idmvt,stock   | index1    |       4 | marfret.H.Id_mouvement |     1 | Using where; Using index        |
|  1 | SIMPLE      | SE    | ref    | seqc,seqcp,imex,seqc_imex              | seqc_imex |     266 | func,marfret.TY.type   |     2 | Using where; Using index        |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+

This plan is executed for: 
(1 min 32.90 sec)

This happens because the optimizer does not evaluate all possible execution plans and as a result sometimes it misses the fastest ones.

See continuation in the next comment.
[22 Feb 2007 5:52] Igor Babaev
5.1 is smarter in this respect. We can force the optimizer to evaluate all
possible plans by setting the option optimizer_prune_level to 0. By default
it's value is set to 1 and with it 5.1 returns the following execution plan for
our original query:

+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type   | possible_keys                          | key       | key_len | ref                    | rows  | Extra                           |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
|  1 | SIMPLE      | TY    | ALL    | NULL                                   | NULL      | NULL    | NULL                   |    37 | Using temporary; Using filesort |
|  1 | SIMPLE      | H     | ALL    | idcont,numeroD,index1,index2           | NULL      | NULL    | NULL                   | 35213 |                                 |
|  1 | SIMPLE      | P     | ref    | idetat,mft_refP,ls_conteneur,Id,index7 | idetat    | 13      | marfret.H.Id_conteneur |     1 |                                 |
|  1 | SIMPLE      | R     | eq_ref | mft_ref,ref_contrat,index10            | mft_ref   | 52      | marfret.P.Mft_refP     |     1 |                                 |
|  1 | SIMPLE      | RE    | ref    | index1,idmvtportdtyperef,idmvt,stock   | index1    | 4       | marfret.H.Id_mouvement |     1 | Using where; Using index        |
|  1 | SIMPLE      | SE    | ref    | seqc,seqcp,imex,seqc_imex              | seqc_imex | 270     | func,marfret.TY.type   |     2 | Using where; Using index        |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+

The plan is executed for:
(2 min 3.17 sec)

If I change the value of optimizer_prune_level:

mysql> select @@optimizer_prune_level;
+-------------------------+
| @@optimizer_prune_level |
+-------------------------+
|                       1 |
+-------------------------+
1 row in set (0.00 sec)

mysql> set @@optimizer_prune_level=0;
Query OK, 0 rows affected (0.00 sec)

the optimizer returns a much better plan:

+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type   | possible_keys                          | key       | key_len | ref                    | rows  | Extra                           |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
|  1 | SIMPLE      | H     | ALL    | idcont,numeroD,index1,index2           | NULL      | NULL    | NULL                   | 35213 | Using temporary; Using filesort |
|  1 | SIMPLE      | P     | ref    | idetat,mft_refP,ls_conteneur,Id,index7 | idetat    | 13      | marfret.H.Id_conteneur |     1 |                                 |
|  1 | SIMPLE      | RE    | ref    | index1,idmvtportdtyperef,idmvt,stock   | index1    | 4       | marfret.H.Id_mouvement |     1 | Using where; Using index        |
|  1 | SIMPLE      | TY    | ALL    | NULL                                   | NULL      | NULL    | NULL                   |    37 |                                 |
|  1 | SIMPLE      | SE    | ref    | seqc,seqcp,imex,seqc_imex              | seqc_imex | 270     | func,marfret.TY.type   |     2 | Using where; Using index        |
|  1 | SIMPLE      | R     | eq_ref | mft_ref,ref_contrat,index10            | mft_ref   | 52      | marfret.P.Mft_refP     |     1 |                                 |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+

whose execution time is:
 (5.58 sec)

And the choice of this plan does not depend on the order of tables in the query:

mysql> explain
    -> select R.type_con as rt ,SE.imex as im, H.Lib_mvt,
    ->           R.type_con,R.contrat as tc,
    ->           TY.fr, H.Type_mvt, H.TypeP, H.TypeD, H.Pv,
    ->           H.etat, H.date_mvt, H.Id_sspP,
    ->           CONCAT(H.Type_mvt, H.TypeP, H.TypeD,H.Pv),
    ->           SE.imex , H.Id_conteneur, H.Id_sspD
    ->      from
    ->           (mft_ref as R, mft_passeport as P, mft_sequence SE, mft_typemvt as TY)
    ->           INNER JOIN
    ->           mft_historique_gilles as H
    ->           ON ((P.Mft_refP=R.mft_ref)
    ->               AND (H.Id_conteneur = P.Id_conteneur )
    ->               AND TY.type=SE.imex
    ->               AND CONCAT(H.Type_mvt, H.TypeP, H.TypeD, H.Pv) = SE.sequencec
    ->              )
    ->           left outer join
    ->           (mft_refhisto as RE)
    ->           ON ((RE.id_mouvement=H.Id_mouvement))
    ->      where (((H.numeroD="4499" ) OR ( RE.portD="FRLEH" )) AND SE.imex !="")
    ->      group by H.id_conteneur;
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type   | possible_keys                          | key       | key_len | ref                    | rows  | Extra                           |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
|  1 | SIMPLE      | H     | ALL    | idcont,numeroD,index1,index2           | NULL      | NULL    | NULL                   | 35213 | Using temporary; Using filesort |
|  1 | SIMPLE      | P     | ref    | idetat,mft_refP,ls_conteneur,Id,index7 | idetat    | 13      | marfret.H.Id_conteneur |     1 |                                 |
|  1 | SIMPLE      | RE    | ref    | index1,idmvtportdtyperef,idmvt,stock   | index1    | 4       | marfret.H.Id_mouvement |     1 | Using where; Using index        |
|  1 | SIMPLE      | TY    | ALL    | NULL                                   | NULL      | NULL    | NULL                   |    37 |                                 |
|  1 | SIMPLE      | SE    | ref    | seqc,seqcp,imex,seqc_imex              | seqc_imex | 270     | func,marfret.TY.type   |     2 | Using where; Using index        |
|  1 | SIMPLE      | R     | eq_ref | mft_ref,ref_contrat,index10            | mft_ref   | 52      | marfret.P.Mft_refP     |     1 |                                 |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+

Conclusion: there is no performance degradation for the customer's query.
This is not a bug.
[22 Feb 2007 16:37] Igor Babaev
Correction for my previous [open] comment:
the option optimizer_prune_level is valid for 5.0+ and
I checked the queries I referred to with 5.0 (never checked them with 5.1, but nothing changed in this functionality).