Bug #45969 the query otimizer choose the wrong index
Submitted: 6 Jul 2009 12:23 Modified: 8 Oct 2012 15:31
Reporter: Manfred Wiedemeier Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.35, 5.1.38-bzr OS:Microsoft Windows (XP, Vista)
Assigned to: CPU Architecture:Any
Tags: regression
Triage: Triaged: D2 (Serious) / R4 (High) / E4 (High)

[6 Jul 2009 12:23] Manfred Wiedemeier
Description:
In version 5.0.77, the otimizer choose the index `recid_r02`, for the following select. In version 5.1.35 it's now `key6`.

If i change: ORDER BY r03.r_datum DESC
         to: ORDER BY r03.recid_r02, r03.r_datum DESC
the index is correct.

It seems that the optimizer did not mention that i did not use the field `r03.r_datum` in the where clause.

SELECT r02.*,
       (SELECT r03.r_datum 
          FROM r03 
          WHERE ((r03.recid_r02 = r02.recid) AND (r03.texttype IN (4,7,8,9))) 
          ORDER BY r03.r_datum DESC
          LIMIT 1
       ) AS last_order

  FROM r02
  WHERE (r02.auftrag > 0) AND (r02.deleted = FALSE)
  ORDER BY r02.auftrag DESC
  LIMIT 50

Tables:

CREATE TABLE `r02` (
  `recid`    int(11) NOT NULL DEFAULT '0',
  `deleted`  tinyint(1) NOT NULL DEFAULT '0',
  `auftrag`  int(11) NOT NULL DEFAULT '0',
  `nachtrag` int(11) NOT NULL DEFAULT '0',
  `kurzbez`  varchar(60) COLLATE latin1_general_ci NOT NULL,
  `flags`    bigint(20) NOT NULL DEFAULT '0',

  PRIMARY KEY (`recid`),
  UNIQUE  KEY `key` (`auftrag`,`nachtrag`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;

CREATE TABLE `r03` (
  `recid`     int(11) NOT NULL DEFAULT '0',
  `recid_r02` int(11) NOT NULL DEFAULT '0',
  `deleted`   tinyint(1) NOT NULL DEFAULT '0',
  `auftrag`   int(11) NOT NULL DEFAULT '0',
  `nachtrag`  int(11) NOT NULL DEFAULT '0',
  `texttype`  int(11) NOT NULL DEFAULT '0',
  `nummer`    int(11) NOT NULL DEFAULT '0',
  `rechn_nr`  varchar(20) COLLATE latin1_general_ci DEFAULT NULL,
  `doku_nr`   varchar(20) COLLATE latin1_general_ci DEFAULT NULL,
  `betrag`    double NOT NULL DEFAULT '0',
  `r_datum`   date NOT NULL DEFAULT '0000-00-00',

  PRIMARY KEY (`recid`),
  UNIQUE KEY `key` (`auftrag`,`nachtrag`,`texttype`,`nummer`,`deleted`),
  UNIQUE KEY `key2` (`rechn_nr`),
  UNIQUE KEY `key3` (`doku_nr`),
  KEY `key6` (`r_datum`),
  KEY `recid_r02` (`recid_r02`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;

How to repeat:
create the tables r02 and r03. 
I've got 30000 records in r02 and 90000 record in r03.
[6 Jul 2009 12:29] Valeriy Kravchuk
OK, different index is used (this is related to partial fix for bug #28404 we have in 5.1.x). But why do you think this is a bug? Do you have seriously slower execution?

Please, send also the results of:

SELECT recid_r02, count(*) FROM r03
GROUP BY recid_r02
ORDER BY 2 DESC LIMIT 10;

I want to know how many rows in the table may have particular recid_r02 value.
[20 Jul 2009 11:03] Manfred Wiedemeier
I've got 85.000 records in R02 and 80.000 in R03.
The execution time has changed from < 1 sec. to > 280 sec.

SELECT recid_r02, count(*) FROM r03
GROUP BY recid_r02
ORDER BY 2 DESC LIMIT 10;

recid_r02    count(*)
    27256        3361
    23569        3222
    19452        3155
    15370        3065
    12801        2407
    12796        1996
    12793        1859
        0         513
    12802         301
    29625         282
[22 Jul 2009 9:28] Sveta Smirnova
Thank you for the feedback.

I can not repeat described behavior: in case of MySQL-5.1 I get recid_r02  too. Please run OPTIMIZE TABLE on 5.1 server and inform us if it changed results.
[27 Jul 2009 8:36] Manfred Wiedemeier
I optimize the tables, same result. 
The optimizer choose index `key6` and not `recid_r02`.

If you like i can send you my database.
[27 Jul 2009 9:25] Sveta Smirnova
Thank you for the feedback.

> If you like i can send you my database.

Yes, this would be good.
[27 Jul 2009 11:26] Manfred Wiedemeier
I have uploaded "bug-data-45969.zip" to your ftp-server.
[30 Jul 2009 13:43] Valeriy Kravchuk
Verified just as described using your tables uploaded with latest 5.1.38-bzr on Linux:

openxs@suse:/home2/openxs/dbs/5.1> bin/mysql -uroot reg00003
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.38-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show tables;
+--------------------+
| Tables_in_reg00003 |
+--------------------+
| r02                |
| r03                |
+--------------------+
2 rows in set (0.01 sec)

mysql> explain SELECT r02.*,
    ->        (SELECT r03.r_datum
    ->           FROM r03
    ->           WHERE ((r03.recid_r02 = r02.recid) AND (r03.texttype IN (4,7,8,9)))
    ->           ORDER BY r03.r_datum DESC
    ->           LIMIT 1
    ->        ) AS last_order
    ->
    ->   FROM r02
    ->   WHERE (r02.auftrag > 0) AND (r02.deleted = FALSE)
    ->   ORDER BY r02.auftrag DESC
    ->   LIMIT 50
    ->
    -> ;
+----+--------------------+-------+-------+---------------+------+---------+------+-------+-------------+
| id | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+--------------------+-------+-------+---------------+------+---------+------+-------+-------------+
|  1 | PRIMARY            | r02   | range | key           | key  | 4       | NULL | 27056 | Using where |
|  2 | DEPENDENT SUBQUERY | r03   | index | recid_r02     | key6 | 3       | NULL |     1 | Using where |
+----+--------------------+-------+-------+---------------+------+---------+------+-------+-------------+
2 rows in set (0.02 sec)

As you correctly noted index recid_r02 can be used:

openxs@suse:/home2/openxs/dbs/5.1> bin/mysql -uroot reg00003
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.38-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show tables;
+--------------------+
| Tables_in_reg00003 |
+--------------------+
| r02                |
| r03                |
+--------------------+
2 rows in set (0.01 sec)

mysql> explain SELECT r02.*,
    ->        (SELECT r03.r_datum
    ->           FROM r03
    ->           WHERE ((r03.recid_r02 = r02.recid) AND (r03.texttype IN (4,7,8,9)))
    ->           ORDER BY r03.r_datum DESC
    ->           LIMIT 1
    ->        ) AS last_order
    ->
    ->   FROM r02
    ->   WHERE (r02.auftrag > 0) AND (r02.deleted = FALSE)
    ->   ORDER BY r02.auftrag DESC
    ->   LIMIT 50
    ->
    -> ;
+----+--------------------+-------+-------+---------------+------+---------+------+-------+-------------+
| id | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+--------------------+-------+-------+---------------+------+---------+------+-------+-------------+
|  1 | PRIMARY            | r02   | range | key           | key  | 4       | NULL | 27056 | Using where |
|  2 | DEPENDENT SUBQUERY | r03   | index | recid_r02     | key6 | 3       | NULL |     1 | Using where |
+----+--------------------+-------+-------+---------------+------+---------+------+-------+-------------+
2 rows in set (0.02 sec)

mysql> explain SELECT r02.*,        (SELECT r03.r_datum            FROM r03            WHERE ((r03.recid_r02 = r02.recid) AND (r03.texttype IN (4,7,8,9)))            ORDER BY r03.recid_r02, r03.r_datum DESC           LIMIT 1        ) AS last_order    FROM r02   WHERE (r02.auftrag > 0) AND (r02.deleted = FALSE)   ORDER BY r02.auftrag DESC   LIMIT 50;
+----+--------------------+-------+-------+---------------+-----------+---------+--------------------+-------+-----------------------------+
| id | select_type        | table | type  | possible_keys | key       | key_len | ref                | rows  | Extra                       |
+----+--------------------+-------+-------+---------------+-----------+---------+--------------------+-------+-----------------------------+
|  1 | PRIMARY            | r02   | range | key           | key       | 4       | NULL               | 27056 | Using where                 |
|  2 | DEPENDENT SUBQUERY | r03   | ref   | recid_r02     | recid_r02 | 4       | reg00003.r02.RECID |     3 | Using where; Using filesort |
+----+--------------------+-------+-------+---------------+-----------+---------+--------------------+-------+-----------------------------+
2 rows in set (0.00 sec)

and, actually, query is much faster when it is used:

mysql> SELECT r02.*,        (SELECT r03.r_datum            FROM r03  FORCE INDEX(recid_r02)          WHERE ((r03.recid_r02 = r02.recid) AND (r03.texttype IN (4,7,8,9)))            ORDER BY r03.r_datum DESC           LIMIT 1        ) AS last_order    FROM r02   WHERE (r02.auftrag > 0) AND (r02.deleted = FALSE)   ORDER BY r02.auftrag DESC   LIMIT 50;
+-------+---------+---------+----------+--------------------------------------------------+----------+------------+
| RECID | DELETED | AUFTRAG | NACHTRAG | KURZBEZ                                          | FLAGS    | last_order |
+-------+---------+---------+----------+--------------------------------------------------+----------+------------+
| 12803 |       0 |   50001 |        0 | Mustervorlage fr Dokumentation                  |     4096 | NULL       |

...

| 30399 |       0 |   42209 |        0 | Studentenversion                                 |      768 | NULL       |
+-------+---------+---------+----------+--------------------------------------------------+----------+------------+
50 rows in set (0.48 sec)

mysql> SELECT r02.*,        (SELECT r03.r_datum            FROM r03            WHERE ((r03.recid_r02 = r02.recid) AND (r03.texttype IN (4,7,8,9)))            ORDER BY r03.r_datum DESC           LIMIT 1        ) AS last_order    FROM r02   WHERE (r02.auftrag > 0) AND (r02.deleted = FALSE)   ORDER BY r02.auftrag DESC   LIMIT 50;
+-------+---------+---------+----------+--------------------------------------------------+----------+------------+
| RECID | DELETED | AUFTRAG | NACHTRAG | KURZBEZ                                          | FLAGS    | last_order |
+-------+---------+---------+----------+--------------------------------------------------+----------+------------+
| 12803 |       0 |   50001 |        0 | Mustervorlage fr Dokumentation                  |     4096 | NULL       |

...

| 30399 |       0 |   42209 |        0 | Studentenversion                                 |      768 | NULL       |
+-------+---------+---------+----------+--------------------------------------------------+----------+------------+
50 rows in set (1 min 5.75 sec)

So, one minute instead of less than a second...
[21 Apr 2011 7:53] Daniƫl van Eeden
As noted in support request 3-3357778341:
- This also affects Linux x86-64, not only Microsoft Windows (XP,Vista)
- This also affects version 5.5.8 (Enterprise), not only 5.1.35, 5.1.38-bzr

Could this still be fixed in 5.5+ or should it target 5.6+?
[8 Oct 2012 15:31] Paul Dubois
Noted in 5.6.8, 5.7.0 changelogs.

For some queries involving ORDER BY, the optimizer chose the wrong
index for accessing the table.