Bug #58190 BETWEEN no longer uses indexes for date or datetime fields
Submitted: 15 Nov 2010 3:13 Modified: 11 Dec 2010 17:23
Reporter: chris valaas Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:5.5.7, 5.5.8 OS:Any (Solaris, Mac OS X, Linux)
Assigned to: Alexander Barkov CPU Architecture:Any
Tags: regression

[15 Nov 2010 3:13] chris valaas
Description:
queries using BETWEEN involving (at least) DATE or DATETIME fields that are indexed will not use the index. 

Rewriting the query to use >= and <= does use the index.

How to repeat:
mysql> CREATE TABLE `date_index_test` (
  `id` int(11) DEFAULT NULL,
  `date_column` date DEFAULT NULL,
  KEY `date_column` (`date_column`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.28 sec)

mysql> insert into date_index_test values (1,'2010-09-01'),(2,'2010-10-01');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> explain select * from date_index_test where date_column between '2010-09-01' and '2010-10-01';
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table           | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | ALL  | date_column   | NULL | NULL    | NULL |    2 | Using where |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select * from date_index_test where date_column >= '2010-09-01' and date_column <='2010-10-01';
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table           | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | range | date_column   | date_column | 4       | NULL |    2 | Using where |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.00 sec)
[15 Nov 2010 5:08] Valeriy Kravchuk
Indeed, index is not used, even if table is MyISAM (so it is not about statistics probably):

macbook-pro:5.5 openxs$ bin/mysql -A -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.7-rc-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE `date_index_test` (
    ->   `id` int(11) DEFAULT NULL,
    ->   `date_column` date DEFAULT NULL,
    ->   KEY `date_column` (`date_column`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.19 sec)

mysql>  insert into date_index_test values (1,'2010-09-01'),(2,'2010-10-01');
Query OK, 2 rows affected (0.03 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql>  explain select * from date_index_test where date_column between '2010-09-01' and
    -> '2010-10-01';
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table           | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | ALL  | date_column   | NULL | NULL    | NULL |    2 | Using where |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.01 sec)

mysql>  explain select * from date_index_test where date_column >= '2010-09-01' and
    -> date_column <='2010-10-01';
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table           | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | range | date_column   | date_column | 4       | NULL |    2 | Using where |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.36 sec)

mysql> alter table date_index_test engine=MyISAM;
Query OK, 2 rows affected (0.49 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> explain select * from date_index_test where date_column between '2010-09-01' and '2010-10-01';
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table           | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | ALL  | date_column   | NULL | NULL    | NULL |    2 | Using where |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

But for a range that includes all rows in the table it may be even not bad. Had you tried to add more rows?
[15 Nov 2010 5:34] chris valaas
We have tables with several million rows and selects using BETWEEN on just a tiny portion still do not use the index. This seems have have started in 5.5.*.
[15 Nov 2010 7:09] Valeriy Kravchuk
Indeed, it works as expected in 5.1.52:

mysql> select version();
+------------------+
| version()        |
+------------------+
| 5.1.52-community |
+------------------+
1 row in set (0.01 sec)

mysql> CREATE TABLE `date_index_test` (
    ->   `id` int(11) DEFAULT NULL,
    ->   `date_column` date DEFAULT NULL,
    ->   KEY `date_column` (`date_column`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    16
Current database: test

Query OK, 0 rows affected (0.66 sec)

mysql>  insert into date_index_test values (1,'2010-09-01'),(2,'2010-10-01');
Query OK, 2 rows affected (0.03 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> explain select * from date_index_test where date_column between '2010-09-
01' and
    -> '2010-10-01';
+----+-------------+-----------------+-------+---------------+-------------+----
-----+------+------+-------------+
| id | select_type | table           | type  | possible_keys | key         | key
_len | ref  | rows | Extra       |
+----+-------------+-----------------+-------+---------------+-------------+----
-----+------+------+-------------+
|  1 | SIMPLE      | date_index_test | range | date_column   | date_column | 4
     | NULL |    1 | Using where |
+----+-------------+-----------------+-------+---------------+-------------+----
-----+------+------+-------------+
1 row in set (0.08 sec)
[18 Nov 2010 9:25] Valeriy Kravchuk
Same problem with current mysql-5.5-security tree on Ubuntu 10.04:

openxs@ubuntu:~/dbs/5.5-sec$ bin/mysql --no-defaults -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.8-rc Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE `date_index_test` (
    ->   `id` int(11) DEFAULT NULL,
    ->   `date_column` date DEFAULT NULL,
    ->   KEY `date_column` (`date_column`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.10 sec)

mysql>  insert into date_index_test values (1,'2010-09-01'),(2,'2010-10-01');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> explain select * from date_index_test where date_column between '2010-09-01' and
    -> '2010-10-01';
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table           | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | ALL  | date_column   | NULL | NULL    | NULL |    2 | Using where |
+----+-------------+-----------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select * from date_index_test where date_column >= '2010-09-01' and
    -> date_column <='2010-10-01';
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table           | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | range | date_column   | date_column | 4       | NULL |    2 | Using where |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
1 row in set (0.00 sec)
[18 Nov 2010 11:20] Øystein Grøvlen
This seems to be a character set issue.  If I set default character set to latin1, index is used.  That is,

mysql> set NAMES latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from date_index_test where date_column between '2010-09-01' and '2010-10-01';
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table           | type  | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
|  1 | SIMPLE      | date_index_test | range | date_column   | date_column | 4       | NULL |    1 | Using where |
+----+-------------+-----------------+-------+---------------+-------------+---------+------+------+-------------+
[18 Nov 2010 12:31] Øystein Grøvlen
It is decided that index cannot be used because the following test in get_mm_leaf() is true:

((Field_str*)field)->charset() != conf_func->compare_collation()

LHS is latin1, RHS is UTF8
[19 Nov 2010 15:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/124480

3137 Alexander Barkov	2010-11-19
      Bug#58190 BETWEEN no longer uses indexes for date or datetime fields
      
      Regression introduced by WL#2649.
      
      Problem: queries with date/datetime columns did not use indexes:
      
      set names non_latin1_charset;
      select * from date_index_test
      where date_column between '2010-09-01' and '2010-10-01';
      
      before WL#2649 indexes worked fine because charset of date/datetime
      columns was BINARY which always won.
      
      Fix: testing that collation of the operation matches collation of the field
      is only needed in case of "real" string data types.
      For DATE, DATETIME it's not needed.
      
      Note, the problem remains with UCS2, UTF16, UTF32 (index is not used).
      But this is not a regressions, as in 5.1 comparison between
      DATE/DATETIME and UCS2/UTF16/UTF32 did not work at all (always returned FALSE).
      
        @ mysql-test/include/ctype_numconv.inc
        @ mysql-test/r/ctype_binary.result
        @ mysql-test/r/ctype_cp1251.result
        @ mysql-test/r/ctype_latin1.result
        @ mysql-test/r/ctype_ucs.result
        @ mysql-test/r/ctype_utf8.result
        Adding tests
      
        @ sql/field.h
        Adding new method to distinguish between "real string" types
        like CHAR, VARCHAR, TEXT, and "almost string" types, like DATE, DATETIME.
      
        @ sql/opt_range.cc
        Checking collation only for "real string" types.
[19 Nov 2010 16:40] Alexander Barkov
The second part of this problem is reported as a separate bug:

Bug#58329 Bug#58190 BETWEEN does not use indexes for date or datetime fields for UCS2+

(which is not a regression)
[19 Nov 2010 17:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/124502

3138 Alexander Barkov	2010-11-19
      Bug#58190 BETWEEN no longer uses indexes for date or datetime fields
      Regression introduced by WL#2649.
      
      Problem: queries with date/datetime columns did not use indexes:
      set names non_latin1_charset;
      select * from date_index_test
      where date_column between '2010-09-01' and '2010-10-01';
      
      before WL#2649 indexes worked fine because charset of 
      date/datetime
      columns was BINARY which always won.
      
      Fix: testing that collation of the operation matches collation 
      of the field is only needed in case of "real" string data types.
      For DATE, DATETIME it's not needed.
      
      
        @ mysql-test/include/ctype_numconv.inc
        @ mysql-test/r/ctype_binary.result
        @ mysql-test/r/ctype_cp1251.result
        @ mysql-test/r/ctype_latin1.result
        @ mysql-test/r/ctype_ucs.result
        @ mysql-test/r/ctype_utf8.result
        Adding tests
      
        @ sql/field.h
        Adding new method Field_str::match_collation_to_optimize_range()
        for use in opt_range.cc to distinguish between
        "real string" types like CHAR, VARCHAR, TEXT
        (Field_string, Field_varstring, Field_blob)
      
        and "almost string" types DATE, TIME, DATETIME
        (Field_newdate, Field_datetime, Field_time, Field_timestamp)
      
        @ sql/opt_range.cc
        Using new method instead of checking result_type() against STRING result.
      
      Note:
      
        Another part of this problem (which is not regression) 
        is submitted separately (see bug##58329).
[19 Nov 2010 19:53] Alexander Barkov
Pushed into mysql-5.5-bugteam [5.5.8]
Pushed into mysql-trunk-bugteam [5.6.1-m5]
[5 Dec 2010 12:42] Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[11 Dec 2010 17:23] Paul DuBois
Noted in 5.5.8 changelog.

BETWEEN did not use indexes for DATE or DATETIME columns.
[16 Dec 2010 22:33] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)