MySQL Bugs: #19330: Optimizer chooses wrong table order then query does not come back

Bug #19330	Optimizer chooses wrong table order then query does not come back
Submitted:	25 Apr 2006 9:56	Modified:	6 Feb 2014 13:58
Reporter:	Oli Sennhauser	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S5 (Performance)
Version:	5.0.20	OS:	Linux (linux)
Assigned to:	Assigned Account	CPU Architecture:	Any

Description:
Optimizer chooses wrong table order then query does not come back due to wrong execution plan.

Table ft must come AFTER table dstop!

How to repeat:
See file attached data file.
mysql -u root dbrent < file.dmp
CHECK TABLE drdat, mitarbeiter, firma, stadt, fsdaten, fahrt;
SELECT ... (see attachement).

query and execution plan

Attachment: rechnungslauf.sql (text/x-sql), 29.77 KiB.

Thank you for a problem report. Please, send the my.cnf file content. 

To me the first plan, selected by optimizer, looks better than second one, by the way. At least, if take into account number of rows to lookup.

Hi Valeriy

I do fully agree with you again. The execution plan looks better. But with STRAIGTH_JOIN it runs 4 sec. on my machine and without it is not back after ... min.

my.cnf

[mysqld]
port            = 3310
socket          = /u00/app/mysql/tmp/mysql-5.0.20.sock

log_slow_queries       = slow.log
long_query_time        = 1
log_long_format

lower_case_table_names = 1
log_bin                = bin_log
innodb_status_file     = innodb_status.log
myisam_sort_buffer_size = 32M

default.key_buffer_size         = 32M
hot.key_buffer_size         = 32M
cold.key_buffer_size         = 32M

I was able to repeat this using the attached file with 5.0 from bk

This problem can be demonstrated on the following two simple queries that use the same database state as the reported queries.

The first query is executed very quickly:

mysql> SELECT
    -> STRAIGHT_JOIN
    -> COUNT(*)
    ->  FROM drdat d
    ->   LEFT JOIN drdat dstop ON d.firma_id=dstop.firma_id AND d.stadt_id=dstop.stadt_id
    ->                                                      AND d.druckdatum<dstop.druckdatum
    ->                                                      AND dstop.abrechnungstyp='MASSEN'
    ->  INNER JOIN fsdaten fst USE INDEX (firma_id) ON fst.firma_id=d.firma_id AND fst.abr_stadt_id=d.stadt_id
    ->  INNER JOIN fahrt ft ON ft.firma_id=d.firma_id AND ft.firma_id=fst.firma_id
    ->                                                AND ft.stadt_id=fst.stadt_id
    ->                                                AND ft.druckdatum='0000-00-00'
    ->                                                AND ft.drdat_id IS NULL
    -> WHERE
    ->   d.druckdatum!='1910-10-10'
    ->   AND d.druckdatum!='1911-11-11'
    ->   AND d.druckdatum!='1912-12-12'
    ->   AND d.abrechnungstyp='MASSEN'
    ->   AND d.firma_id IN (1,2,3,4,5,9,10,11,12,99,100,102,103,104,111,
    ->                      200,201,202,203,204,205,206,209,
    ->                      300,302,328,329,330,331,332,335,336,337,338,339,
    ->                      340,341,342,343,344,345,346,347,348,349,
    ->                      500,501,1000,2000,3000,4000,
    ->                      5000,5100,5200,5300,5500,6000,7000,8000)
    ->  AND dstop.id IS NULL;
+----------+
| COUNT(*) |
+----------+
|    66070 |
+----------+
1 row in set (0.92 sec)

This is the execution plan by which the first query is executed:

+----+-------------+-------+------+---------------------------------------------+------------+---------+-------------------------------------------+------+-------------------------+
| id | select_type | table | type | possible_keys                               | key        | key_len | ref                                       | rows | Extra                   |
+----+-------------+-------+------+---------------------------------------------+------------+---------+-------------------------------------------+------+-------------------------+
|  1 | SIMPLE      | d     | ALL  | uniqueindex,firma_id                        | NULL       | NULL    | NULL                                      | 8616 | Using where             |
|  1 | SIMPLE      | dstop | ref  | uniqueindex,firma_id                        | firma_id   | 8       | test.d.firma_id,test.d.stadt_id           |  115 | Using where; Not exists |
|  1 | SIMPLE      | fst   | ref  | firma_id                                    | firma_id   | 8       | test.d.firma_id,test.d.stadt_id           |    5 |                         |
|  1 | SIMPLE      | ft    | ref  | druckdatum,drdat_id_rechnung_neu,neu_relauf | neu_relauf | 13      | test.fst.firma_id,const,test.fst.stadt_id |    8 | Using where             |
+----+-------------+-------+------+---------------------------------------------+------------+---------+-------------------------------------------+------+-------------------------+
4 rows in set (0.02 sec)

The second query is equivalent to the first one differing from the latter only by omission of the STRAIGHT_JOIN option under SELECT. Yet an execution of the second query takes much more time:

mysql> SELECT
    -> COUNT(*)
    ->  FROM drdat d
    ->   LEFT JOIN drdat dstop ON d.firma_id=dstop.firma_id AND d.stadt_id=dstop.stadt_id
    ->                                                      AND d.druckdatum<dstop.druckdatum
    ->                                                      AND dstop.abrechnungstyp='MASSEN'
    ->  INNER JOIN fsdaten fst USE INDEX (firma_id) ON fst.firma_id=d.firma_id AND fst.abr_stadt_id=d.stadt_id
    ->  INNER JOIN fahrt ft ON ft.firma_id=d.firma_id AND ft.firma_id=fst.firma_id
    ->                                                AND ft.stadt_id=fst.stadt_id
    ->                                                AND ft.druckdatum='0000-00-00'
    ->                                                AND ft.drdat_id IS NULL
    -> WHERE
    ->   d.druckdatum!='1910-10-10'
    ->   AND d.druckdatum!='1911-11-11'
    ->   AND d.druckdatum!='1912-12-12'
    ->   AND d.abrechnungstyp='MASSEN'
    ->   AND d.firma_id IN (1,2,3,4,5,9,10,11,12,99,100,102,103,104,111,
    ->                      200,201,202,203,204,205,206,209,
    ->                      300,302,328,329,330,331,332,335,336,337,338,339,
    ->                      340,341,342,343,344,345,346,347,348,349,
    ->                      500,501,1000,2000,3000,4000,
    ->                      5000,5100,5200,5300,5500,6000,7000,8000)
    ->  AND dstop.id IS NULL;
+----------+
| COUNT(*) |
+----------+
|    66070 |
+----------+
1 row in set (8 min 27.59 sec)

The second query is executed by the following plan:

+----+-------------+-------+------+---------------------------------------------+------------+---------+-------------------------------------------+------+-------------------------+
| id | select_type | table | type | possible_keys                               | key        | key_len | ref                                       | rows | Extra                   |
+----+-------------+-------+------+---------------------------------------------+------------+---------+-------------------------------------------+------+-------------------------+
|  1 | SIMPLE      | fst   | ALL  | firma_id                                    | NULL       | NULL    | NULL                                      |  325 | Using where             |
|  1 | SIMPLE      | ft    | ref  | druckdatum,drdat_id_rechnung_neu,neu_relauf | neu_relauf | 13      | test.fst.firma_id,const,test.fst.stadt_id |    8 | Using where             |
|  1 | SIMPLE      | d     | ref  | uniqueindex,firma_id                        | firma_id   | 8       | test.ft.firma_id,test.fst.abr_stadt_id    |  115 | Using where             |
|  1 | SIMPLE      | dstop | ref  | uniqueindex,firma_id                        | firma_id   | 8       | test.d.firma_id,test.fst.abr_stadt_id     |  115 | Using where; Not exists |
+----+-------------+-------+------+---------------------------------------------+------------+---------+-------------------------------------------+------+-------------------------+

-- the comment to be continued

-- continuation of the previous comment

The estimated cost of the second plan is really less than the estimated cost
of the first plan. That's why the optimizer selects the second plan when we 
don't use the STRAIGHT_JOIN option.

The estimated number of rows for the first plan is equal to
  8616*115*5*8=39642800,
while for the second plan the estimated number of rows is equal to
  325*8*115*115=34385000.

Note with 'Not exists' optimization the fanout for the left join operation
must be 1 rather then 115. This is abug and it will bw fixed later.
After this fix the expected number of rows will be
  for the first query: 8616*1*5*8=344640
  for the second query: 325*8*115*1=299000.

In reality we have the following cardinalities for partial joins.
 first, quick query:
 we have:
     table to join        cardinality of partial join    real fanout
       d                              1856                0.2154
       dstop                            57                0.0307
       fst                             292                5.12
       ft                            66070                226
 instead of predicted:
     table to join        cardinality of partial join    real fanout
       d                              8616                1
       dstop                          8616                1
       fst                           43080                5
       ft                           344640                8

 second, slow query:
 we have:
     table to join        cardinality of partial join    real fanout
       fst                                325                1
       ft                               66101                203
       d                              3065772                46
       dstop                            66070                0.02
 instead of predicted:
     table to join        cardinality of partial join    real fanout
       fst                               325                 1
       ft                               2600                 8
       d                              229000                 115
       dstop                          229000                 1

We make a huge miss with our estimates by two reasons:
1. MySQL use a uniform simplistic model for distribution of values
which does not work for the tables in the example (there the values are skewed).
2. We do not make a good estimate for selectivity of the condition in the antijoin:  
 dstop ON d.firma_id=dstop.firma_id 
 AND d.stadt_id=dstop.stadt_id
 AND d.druckdatum<dstop.druckdatum
 AND dstop.abrechnungstyp='MASSEN'.

The high selectivity of this condition makes the execution by the first plan dramatically better than by second one. Yet we hardly can calculate this selectivity.

The above analysis allows me to state that we won't be able to fix this problem until we introduce much more sophisticated database statistics ito MySQL.