| Bug #26528 | Optimizer produces inefficient execution plan on 5.0.x compared to 4.1.21 | ||
|---|---|---|---|
| Submitted: | 21 Feb 2007 15:02 | Modified: | 22 Feb 2007 16:21 |
| Reporter: | Valeriy Kravchuk | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 5.0.34, 5.0.27 | OS: | Windows (Windows) |
| Assigned to: | Igor Babaev | CPU Architecture: | Any |
| Tags: | regression | ||
[22 Feb 2007 5:51]
Igor Babaev
For the query in the subselect (derived table) 4.1 returns the following execution plan:
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| 1 | SIMPLE | H | ALL | idcont,numeroD,index1,index2 | NULL | NULL | NULL | 35111 | Using temporary; Using filesort |
| 1 | SIMPLE | P | ref | idetat,mft_refP,ls_conteneur,Id,index7 | idetat | 11 | marfret.H.Id_conteneur | 1 | |
| 1 | SIMPLE | R | eq_ref | mft_ref,ref_contrat,index10 | mft_ref | 50 | marfret.P.Mft_refP | 1 | |
| 1 | SIMPLE | TY | ALL | NULL | NULL | NULL | NULL | 37 | |
| 1 | SIMPLE | RE | ref | index1,idmvtportdtyperef,idmvt,stock | index1 | 4 | marfret.H.Id_mouvement | 1 | Using where; Using index |
| 1 | SIMPLE | SE | ref | seqc,seqcp,imex,seqc_imex | seqc_imex | 266 | func,marfret.TY.type | 2 | Using where; Using index |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
The execution by this plan on my machine takes:
(18.90 sec)
If we change the order of tables in the original query we get a different execution plan:
mysql> explain
-> select R.type_con as rt ,SE.imex as im, H.Lib_mvt,
-> R.type_con,R.contrat as tc,
-> TY.fr, H.Type_mvt, H.TypeP, H.TypeD, H.Pv,
-> H.etat, H.date_mvt, H.Id_sspP,
-> CONCAT(H.Type_mvt, H.TypeP, H.TypeD,H.Pv),
-> SE.imex , H.Id_conteneur, H.Id_sspD
-> from
-> (mft_typemvt as TY, mft_ref as R, mft_passeport as P, mft_sequence SE)
-> INNER JOIN
-> mft_historique_gilles as H
-> ON ((P.Mft_refP=R.mft_ref)
-> AND (H.Id_conteneur = P.Id_conteneur )
-> AND TY.type=SE.imex
-> AND CONCAT(H.Type_mvt, H.TypeP, H.TypeD, H.Pv) = SE.sequencec
-> )
-> left outer join
-> (mft_refhisto as RE)
-> ON ((RE.id_mouvement=H.Id_mouvement))
-> where (((H.numeroD="4499" ) OR ( RE.portD="FRLEH" )) AND SE.imex !="")
-> group by H.id_conteneur;
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| 1 | SIMPLE | TY | ALL | NULL | NULL | NULL | NULL | 37 | Using temporary; Using filesort |
| 1 | SIMPLE | H | ALL | idcont,numeroD,index1,index2 | NULL | NULL | NULL | 35111 | |
| 1 | SIMPLE | P | ref | idetat,mft_refP,ls_conteneur,Id,index7 | idetat | 11 | marfret.H.Id_conteneur | 1 | |
| 1 | SIMPLE | R | eq_ref | mft_ref,ref_contrat,index10 | mft_ref | 50 | marfret.P.Mft_refP | 1 | |
| 1 | SIMPLE | RE | ref | index1,idmvtportdtyperef,idmvt,stock | index1 | 4 | marfret.H.Id_mouvement | 1 | Using where; Using index |
| 1 | SIMPLE | SE | ref | seqc,seqcp,imex,seqc_imex | seqc_imex | 266 | func,marfret.TY.type | 2 | Using where; Using index |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
This plan is executed for:
(1 min 32.90 sec)
This happens because the optimizer does not evaluate all possible execution plans and as a result sometimes it misses the fastest ones.
See continuation in the next comment.
[22 Feb 2007 5:52]
Igor Babaev
5.1 is smarter in this respect. We can force the optimizer to evaluate all
possible plans by setting the option optimizer_prune_level to 0. By default
it's value is set to 1 and with it 5.1 returns the following execution plan for
our original query:
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| 1 | SIMPLE | TY | ALL | NULL | NULL | NULL | NULL | 37 | Using temporary; Using filesort |
| 1 | SIMPLE | H | ALL | idcont,numeroD,index1,index2 | NULL | NULL | NULL | 35213 | |
| 1 | SIMPLE | P | ref | idetat,mft_refP,ls_conteneur,Id,index7 | idetat | 13 | marfret.H.Id_conteneur | 1 | |
| 1 | SIMPLE | R | eq_ref | mft_ref,ref_contrat,index10 | mft_ref | 52 | marfret.P.Mft_refP | 1 | |
| 1 | SIMPLE | RE | ref | index1,idmvtportdtyperef,idmvt,stock | index1 | 4 | marfret.H.Id_mouvement | 1 | Using where; Using index |
| 1 | SIMPLE | SE | ref | seqc,seqcp,imex,seqc_imex | seqc_imex | 270 | func,marfret.TY.type | 2 | Using where; Using index |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
The plan is executed for:
(2 min 3.17 sec)
If I change the value of optimizer_prune_level:
mysql> select @@optimizer_prune_level;
+-------------------------+
| @@optimizer_prune_level |
+-------------------------+
| 1 |
+-------------------------+
1 row in set (0.00 sec)
mysql> set @@optimizer_prune_level=0;
Query OK, 0 rows affected (0.00 sec)
the optimizer returns a much better plan:
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| 1 | SIMPLE | H | ALL | idcont,numeroD,index1,index2 | NULL | NULL | NULL | 35213 | Using temporary; Using filesort |
| 1 | SIMPLE | P | ref | idetat,mft_refP,ls_conteneur,Id,index7 | idetat | 13 | marfret.H.Id_conteneur | 1 | |
| 1 | SIMPLE | RE | ref | index1,idmvtportdtyperef,idmvt,stock | index1 | 4 | marfret.H.Id_mouvement | 1 | Using where; Using index |
| 1 | SIMPLE | TY | ALL | NULL | NULL | NULL | NULL | 37 | |
| 1 | SIMPLE | SE | ref | seqc,seqcp,imex,seqc_imex | seqc_imex | 270 | func,marfret.TY.type | 2 | Using where; Using index |
| 1 | SIMPLE | R | eq_ref | mft_ref,ref_contrat,index10 | mft_ref | 52 | marfret.P.Mft_refP | 1 | |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
whose execution time is:
(5.58 sec)
And the choice of this plan does not depend on the order of tables in the query:
mysql> explain
-> select R.type_con as rt ,SE.imex as im, H.Lib_mvt,
-> R.type_con,R.contrat as tc,
-> TY.fr, H.Type_mvt, H.TypeP, H.TypeD, H.Pv,
-> H.etat, H.date_mvt, H.Id_sspP,
-> CONCAT(H.Type_mvt, H.TypeP, H.TypeD,H.Pv),
-> SE.imex , H.Id_conteneur, H.Id_sspD
-> from
-> (mft_ref as R, mft_passeport as P, mft_sequence SE, mft_typemvt as TY)
-> INNER JOIN
-> mft_historique_gilles as H
-> ON ((P.Mft_refP=R.mft_ref)
-> AND (H.Id_conteneur = P.Id_conteneur )
-> AND TY.type=SE.imex
-> AND CONCAT(H.Type_mvt, H.TypeP, H.TypeD, H.Pv) = SE.sequencec
-> )
-> left outer join
-> (mft_refhisto as RE)
-> ON ((RE.id_mouvement=H.Id_mouvement))
-> where (((H.numeroD="4499" ) OR ( RE.portD="FRLEH" )) AND SE.imex !="")
-> group by H.id_conteneur;
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
| 1 | SIMPLE | H | ALL | idcont,numeroD,index1,index2 | NULL | NULL | NULL | 35213 | Using temporary; Using filesort |
| 1 | SIMPLE | P | ref | idetat,mft_refP,ls_conteneur,Id,index7 | idetat | 13 | marfret.H.Id_conteneur | 1 | |
| 1 | SIMPLE | RE | ref | index1,idmvtportdtyperef,idmvt,stock | index1 | 4 | marfret.H.Id_mouvement | 1 | Using where; Using index |
| 1 | SIMPLE | TY | ALL | NULL | NULL | NULL | NULL | 37 | |
| 1 | SIMPLE | SE | ref | seqc,seqcp,imex,seqc_imex | seqc_imex | 270 | func,marfret.TY.type | 2 | Using where; Using index |
| 1 | SIMPLE | R | eq_ref | mft_ref,ref_contrat,index10 | mft_ref | 52 | marfret.P.Mft_refP | 1 | |
+----+-------------+-------+--------+----------------------------------------+-----------+---------+------------------------+-------+---------------------------------+
Conclusion: there is no performance degradation for the customer's query.
This is not a bug.
[22 Feb 2007 16:37]
Igor Babaev
Correction for my previous [open] comment: the option optimizer_prune_level is valid for 5.0+ and I checked the queries I referred to with 5.0 (never checked them with 5.1, but nothing changed in this functionality).

Description: Some complex queries with derived tables works 3+ times slower on 5.0.27 compared to 4.1.21 (with the same default my.ini settings, same hardware and on same data). For example, this one: select count(*) as n,im,rt,tc from (select R.type_con as rt ,SE.imex as im, H.Lib_mvt, R.type_con,R.contrat as tc, TY.fr, H.Type_mvt, H.TypeP, H.TypeD, H.Pv, H.etat, H.date_mvt, H.Id_sspP, CONCAT(H.Type_mvt, H.TypeP, H.TypeD,H.Pv), SE.imex , H.Id_conteneur, H.Id_sspD from mft_historique_gilles as H INNER JOIN (mft_ref as R, mft_passeport as P, mft_sequence SE, mft_typemvt as TY) ON ((P.Mft_refP=R.mft_ref) AND (H.Id_conteneur = P.Id_conteneur ) AND Ty.type=SE.imex AND CONCAT(H.Type_mvt, H.TypeP, H.TypeD, H.Pv) = SE.sequencec ) left outer join (mft_refhisto as RE) ON ((Re.id_mouvement=H.Id_mouvement)) where (((H.numeroD="4499" ) OR ( Re.portD="FRLEH" )) AND SE.imex !="") group by H.id_conteneur) as bob group by im, rt order by rt,tc runs 13+ seconds on 4.1.21 and 43+ seconds (with 100% CPU load) on 5.0.27 on the same data. ANALYZE TABLE for all the tables involved does not change query plan. How to repeat: C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot test -P 3307 Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 11 Server version: 4.1.21-community-nt Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> use marfret Database changed mysql> select count(*) as n,im,rt,tc from (select R.type_con as rt ,SE.imex as im ... -> group by H.id_conteneur) as bob group by im,rt order by rt,tc -> ; ... 33 rows in set (13.72 sec) mysql> explain select count(*) as n,im,rt,tc from (select R.type_con as rt, SE.imex as im ... -> group by H.id_conteneur) as bob group by im,rt order by rt,tc\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 464 Extra: Using temporary; Using filesort *************************** 2. row *************************** id: 2 select_type: DERIVED table: H type: ALL possible_keys: idcont,numeroD,index1,index2 key: NULL key_len: NULL ref: NULL rows: 35482 Extra: Using temporary; Using filesort *************************** 3. row *************************** id: 2 select_type: DERIVED table: P type: ref possible_keys: idetat,mft_refP,ls_conteneur,Id,index7 key: idetat key_len: 11 ref: marfret.H.Id_conteneur rows: 1 Extra: *************************** 4. row *************************** id: 2 select_type: DERIVED table: R type: eq_ref possible_keys: mft_ref,ref_contrat,index10 key: mft_ref key_len: 50 ref: marfret.P.Mft_refP rows: 1 Extra: *************************** 5. row *************************** id: 2 select_type: DERIVED table: TY type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 37 Extra: *************************** 6. row *************************** id: 2 select_type: DERIVED table: RE type: ref possible_keys: index1,idmvtportdtyperef,idmvt,stock key: index1 key_len: 4 ref: marfret.H.Id_mouvement rows: 1 Extra: Using where; Using index *************************** 7. row *************************** id: 2 select_type: DERIVED table: SE type: ref possible_keys: seqc,seqcp,imex,seqc_imex key: seqc_imex key_len: 266 ref: func,marfret.TY.type rows: 2 Extra: Using where; Using index 7 rows in set (8.13 sec) Note the order of table access! Now, with the same data: C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -P 3308 Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 15 Server version: 5.0.27-community-nt MySQL Community Edition (GPL) Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> use marfret; Database changed mysql> explain select count(*) as n,im,rt,tc from (select R.type_con as rt, SE.imex as im ... -> group by H.id_conteneur) as bob group by im,rt order by rt,tc\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 464 Extra: Using temporary; Using filesort *************************** 2. row *************************** id: 2 select_type: DERIVED table: TY type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 37 Extra: Using temporary; Using filesort *************************** 3. row *************************** id: 2 select_type: DERIVED table: H type: ALL possible_keys: idcont,numeroD,index1,index2 key: NULL key_len: NULL ref: NULL rows: 35199 Extra: *************************** 4. row *************************** id: 2 select_type: DERIVED table: P type: ref possible_keys: idetat,mft_refP,ls_conteneur,Id,index7 key: idetat key_len: 13 ref: marfret.H.Id_conteneur rows: 1 Extra: *************************** 5. row *************************** id: 2 select_type: DERIVED table: R type: ref possible_keys: mft_ref,ref_contrat,index10 key: mft_ref key_len: 52 ref: marfret.P.Mft_refP rows: 1 Extra: *************************** 6. row *************************** id: 2 select_type: DERIVED table: RE type: ref possible_keys: index1,idmvtportdtyperef,idmvt,stock key: index1 key_len: 4 ref: marfret.H.Id_mouvement rows: 1 Extra: Using where; Using index *************************** 7. row *************************** id: 2 select_type: DERIVED table: SE type: ref possible_keys: seqc,seqcp,imex,seqc_imex key: seqc_imex key_len: 270 ref: func,marfret.TY.type rows: 2 Extra: Using where; Using index 7 rows in set (45.91 sec) Note different order of tables! And: mysql> select count(*) as n,im,rt,tc from (select R.type_con as rt ,SE.imex as im ... -> group by H.id_conteneur) as bob group by im,rt order by rt,tc; +-----+----+------+------+ ... +-----+----+------+------+ 33 rows in set (43.63 sec) Suggested fix: Restore previous behaviour?