MySQL Bugs: #67885: Two column index isn't used, when adding a char column to table

Bug #67885	Two column index isn't used, when adding a char column to table
Submitted:	12 Dec 2012 17:43	Modified:	9 Jan 2013 17:40
Reporter:	Olaf Dietsche	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	5.5.24, 5.5.30, 5.7.1, 5.1.68, 5.0.97	OS:	Linux (Ubuntu 12.04)
Assigned to:		CPU Architecture:	Any

Description:
A two column index is not used anymore, when an additional char column is added to the table.

Whithout the additional column the index is used.

explain select * from track t, countries c where t.ip >= c.ipfrom and t.ip <= c.ipto

gives

+----+-------------+-------+-------+---------------+---------------+---------+------+------+---------------------------------------------+
| id | select_type | table | type  | possible_keys | key           | key_len | ref  | rows | Extra                                       |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+---------------------------------------------+
|  1 | SIMPLE      | t     | index | ip_index      | ip_index      | 5       | NULL |    1 | Using index                                 |
|  1 | SIMPLE      | c     | index | from_to_index | from_to_index | 8       | NULL |    1 | Using where; Using index; Using join buffer |
+----+-------------+-------+-------+---------------+---------------+---------+------+------+---------------------------------------------+

Adding a char column to countries gives however

+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref  | rows | Extra                          |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | t     | index | ip_index      | ip_index | 5       | NULL |    1 | Using index                    |
|  1 | SIMPLE      | c     | ALL   | from_to_index | NULL     | NULL    | NULL |    1 | Using where; Using join buffer |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------------+

I would expect, that from_to_index would be used in both cases.

How to repeat:
Without char column:

create table track (
  ip int,
  index ip_index (ip));

create table countries (
  ipfrom int not null,
  ipto int not null,
  index from_to_index (ipfrom, ipto));

explain select * from track t, countries c where t.ip >= c.ipfrom and t.ip <= c.ipto;

See also: http://www.sqlfiddle.com/#!2/ae4c5/1/0

*With* char column:

create table track (
  ip int,
  index ip_index (ip));

create table countries (
  ipfrom int not null,
  ipto int not null,
  country char(2) not null,
  index from_to_index (ipfrom, ipto));

explain select * from track t, countries c where t.ip >= c.ipfrom and t.ip <= c.ipto;

See also: http://www.sqlfiddle.com/#!2/97fcd/2/0

The SQL Fiddle examples run with version 5.5.28 instead of 5.5.24, but the result is the same.

Thank you for the report.

Verified as described.

test case for MTR

Attachment: bug67885.test (application/octet-stream, text), 895 bytes.

Not a bug.

In the former case, the 'from_to_index' index is covering so an index scan
can be used to read countries.* (you can see that Extra has "Using index"
which means covering index). In the latter case, the index is not covering
and can therefore not be used to read countries.*. For this reason, table
scan is used instead of index scan.

When I add country to from_to_index, the index is used in the second example as well. But ...

- I don't understand this restriction.
- Where is it documented, that an index is used only when it is covering?

Hi Olaf,

It's not a question of whether or not the index can be used by the query, but whether or not it is *efficient* to use that index. Looking at the EXPLAIN in the original post you see that the index is listed in the possible_keys column. But using an index that does not filter rows (in the first EXPLAIN it does not filter rows, it is just used to retrieve the necessary column data) and then look up the values in the table afterwards is just more work than scanning the table in the first place.

For you query, all rows in country have to be read. Now consider these ways of accessing the rows:

Covering index:
Read all index rows in sequence and return each row as they are read. The number of index rows read is the same as the number of rows in the table.

Non-covering index:
Read all index rows in sequence, and for each of these rows do a lookup in the table. The number of rows read = the number of index rows (=rows in the table) + a lookup in the table for each of these = 2x rows in the table. Even worse, half of these reads require random read from the disk.

Table scan: 
Read all rows from the table in sequence and return each row as they are read. The number of rows read = the number of rows in the table.

So, MySQL considers the options and decides to do index scan if a covering index is available. If not an option, table scan is much cheaper than index scan of a non-covering index.

If you add predicates that can be used to perform range access or direct lookup access ('ref') in the index, or if you want an ordered result that the index can provide, MySQL may of course choose to use a non-clustering index.

Thank you for this detailed explanation.