Bug #15076 index handeled incorrectly if part of a composite index is used
Submitted: 20 Nov 2005 1:39 Modified: 25 Dec 2005 7:27
Reporter: Daniel Kinzler Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.15-GA OS:pc-solaris2.10 (i386)
Assigned to: CPU Architecture:Any

[20 Nov 2005 1:39] Daniel Kinzler
Description:
For a query involving a column that is left-most in a composite index (myisam), the correct index is used, but it is scanned in full. After creating a separate index for that column, performance is much better. 

How to repeat:
I don't have time for a full reproduction setup right now, but here's an example from when and how it happenes:

Example: SELECT COUNT(*) as c FROM terms WHERE lang = 'de';

> mysql --version
mysql  Ver 14.12 Distrib 5.0.15, for pc-solaris2.10 (i386) using readline 5.0

mysql> show indexes from terms;
+-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name       | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| terms |          0 | lang           |            1 | lang        | A         |         192 |     NULL | NULL   |      | BTREE      |         |
| terms |          0 | lang           |            2 | term        | A         |     1501262 |     NULL | NULL   |      | BTREE      |         |

mysql> explain SELECT COUNT(*) as c FROM terms WHERE lang = 'de';
+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | terms | index | lang          | lang | 275     | NULL | 1129033 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+

mysql> SELECT COUNT(*) as c FROM terms WHERE lang = 'de';
+---------+
| c       |
+---------+
| 1000223 |
+---------+
1 row in set (1 min 19.20 sec)   <------ SLOW!

mysql> create index lang_only on terms (lang);

mysql> explain SELECT COUNT(*) as c FROM terms WHERE lang = 'de';
+----+-------------+-------+------+----------------+-----------+---------+-------+--------+--------------------------+
| id | select_type | table | type | possible_keys  | key       | key_len | ref   | rows   | Extra                    |
+----+-------------+-------+------+----------------+-----------+---------+-------+--------+--------------------------+
|  1 | SIMPLE      | terms | ref  | lang,lang_only | lang_only | 18      | const | 834673 | Using where; Using index |
+----+-------------+-------+------+----------------+-----------+---------+-------+--------+--------------------------+

mysql> SELECT COUNT(*) as c FROM terms WHERE lang = 'de';
+---------+
| c       |
+---------+
| 1003924 |
+---------+
1 row in set (0.86 sec)    <------ FAST!

mysql> SHOW TABLE STATUS;
+-----------+--------+---------+------------+---------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name      | Engine | Version | Row_format | Rows    | Avg_row_length | Data_length | Max_data_length  | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation         | Checksum | Create_options | Comment |
+-----------+--------+---------+------------+---------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| terms     | MyISAM |      10 | Dynamic    | 1506195 |             32 |    48366384 |  281474976710655 |    172346368 |         0 |        1506196 | 2005-11-20 01:27:49 | 2005-11-20 01:34:55 | 2005-11-20 01:29:27 | latin1_swedish_ci |     NULL |                |         |
+-----------+--------+---------+------------+---------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+

mysql> SHOW CREATE TABLE terms;
  CREATE TABLE `terms` (
    `id` int(12) NOT NULL auto_increment,
    `lang` varchar(16) NOT NULL,
    `term` varchar(255) NOT NULL,
    `type` int(4) NOT NULL,
    PRIMARY KEY  (`id`),
    UNIQUE KEY `lang` (`lang`,`term`),
    KEY `term-lang-type` (`term`,`lang`,`type`),
    KEY `type-term` (`type`,`term`),
    KEY `lang_only` (`lang`),
    FULLTEXT KEY `term_fulltext` (`term`)
) ENGINE=MyISAM DEFAULT CHARSET=latin
[20 Nov 2005 12:08] Valeriy Kravchuk
Thank you for a problem report. What is really strange for me is not only "index" value in the type field, but also the key length used for "lang" index - looks like server decided to use both columns. And in my tests on a small number of rows I've got a different result:

mysql> explain SELECT COUNT(*) as c FROM terms WHERE lang = 'ru';
+----+-------------+-------+------+----------------+------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys  | key  | key_len | ref   | rows | Extra                    |
+----+-------------+-------+------+----------------+------+---------+-------+------+--------------------------+
|  1 | SIMPLE      | terms | ref  | lang,lang_only | lang | 18      | const | 1 | Using where; Using index |
+----+-------------+-------+------+----------------+------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)

So, can you, please, try to create a test case with a smaller number of rows showing the same selection of optimizer? Have you analyzed your table?
[21 Nov 2005 14:18] Mark Leith
Hi Valeriy,

Just a quick note as it was I that asked Daniel to file this as a bug, after walking him through his issue on Freenode's #mysql channel. I asked Daniel to run an ANALYZE, and this did not effect the optimizer's choice to use both columns within the index for the scan.

Adding a second, non-composite, index "resolved" the issue, however, only the left most column within the composite index should have been used. 

Over to Daniel to try this on a subset of the data as you have asked.

Mark
[25 Nov 2005 2:37] Daniel Kinzler
I have been unable to reproduce the problem myself... I had tried creating another table with the exact same create statement and copying data over - the new table didn't have the problem. I did not copy all data, though.

In the mean time, I have lost the original dataset due to several crashes (see support issue #7378 (not the bug with that id!), that was the same box). I have since started to re-fill the table, and the problem did not occur again. I'm not sure if I should be happy or worried about this.

Anyway, I don't have nearly as much data in there yes as before... maybe that's the trigger. Or maybe it was due to corrupted index data caused by previous crashes. Analyze table did not help, though. Oh, another detail: I originally had the same problem with another table in the same database.

Anyway, I'll let you know if it happens again. Thanks for your prompt response!
[25 Nov 2005 7:27] Valeriy Kravchuk
Thank you for the additional checks. Please, reopen this report if you'll see similar optimizer behaviour again on any table.
[26 Dec 2005 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".