Bug #16377 [BETWEEN] Weird issue when selecting records between two hard-coded dates
Submitted: 11 Jan 2006 15:20 Modified: 2 May 2007 19:08
Reporter: Wim W.A. ten Brink Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.16 and 5.0.18/5.0.19/4.1.17 BK OS:Microsoft Windows (Windows XP/Linux)
Assigned to: Evgeny Potemkin CPU Architecture:Any
Tags: bfsm_2006_12_21

[11 Jan 2006 15:20] Wim W.A. ten Brink
Description:
I discovered this problem purely by accident and have absolutely no explanation for what causes it. I am trying to get a recordset for data between two dates. So I do:
> select * from calllog where ( DATE(TimeOfCall) between "2006-1-11" AND "2006-1-11" )
But I don't get any data. So after adding one additional 0 to the first date-field I get this query:
> select * from calllog where ( DATE(TimeOfCall) between "2006-01-11" AND "2006-1-11" )
And by some kind of miracle, I now get the data I was looking for.

But things are getting weirder...

I then decided to check the same query with older data and used this:
> select * from calllog where ( DATE(TimeOfCall) between "2005-1-11" AND "2005-1-11" )
The only difference with the first query is the year. (2005 instead of 2006.) If the error is consistent then this too would give no data. I was absolutely sure about it. And I was wrong! I got data!

Conclusion, I discovered that this problem only exists for any date in the year 2006 where the day or month is only one digit. Even more confusing is that it only applies to the first value in the BETWEEN statement. The second date value can use one or more digits.

How to repeat:
Failing query: select * from calllog where ( DATE(TimeOfCall) between "2006-1-11" AND "2006-1-11" ) 
Succesful query: select * from calllog where ( DATE(TimeOfCall) between "2006-01-11" AND "2006-1-11" )

Partial Table definition:
CREATE TABLE  calllog (
  id int(11) unsigned NOT NULL auto_increment,
  TimeOfCall datetime NOT NULL default '0000-00-00 00:00:00',
  CallerID varchar(255) NOT NULL default '',
  CalledID varchar(255) NOT NULL default '',
  ... // More fields
  PRIMARY KEY  (id),
  INDEX id_TimeOfCall USING BTREE (TimeOfCall),
  ... // More indices
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Suggested fix:
I solved this problem in my PHP code by simply adding one additional '0' in front of the month and day fields. That this could result in something like "2006-012-031" doesn't seem to bother MySQL.
[11 Jan 2006 15:24] Wim W.A. ten Brink
I consider this a very serious issue because it's such unusual behaviour and because it took me hours of testing before I discovered this weird behaviour. If I hadn't been suspicious about the BETWEEN statement then I don't know how long it would have taken me to find the cause bvefore giving up on it.
It is especially nasty since it just returns no records and thus users might think there's no data. It's a rare situation, of course, that data is searched between two dates but that makes this problem even more serious, not less serious.
[11 Jan 2006 15:52] Godofredo Miguel Solorzano
Could you please provide a complete test script, with table create,
with some inserts and query.

Thanks in advance.
[11 Jan 2006 19:22] Wim W.A. ten Brink
Very simple test script.

Attachment: Test 20060111 2020.sql (text/plain), 1.97 KiB.

[11 Jan 2006 19:29] Wim W.A. ten Brink
A file has been added containing a very simple database. It's a test database that I quickly set up and exported. (I am not at my work so I don't have the original script here.)
At home I also use 5.0.16 and this version has the same flaw as the one I use at work. If you look at it, you will notice that it only has one table with two fields: one unique ID and a timestamp. I quickly filled it with repeating one insert query a couple of times. Only two dates are in it, but they appear multiple times: '2005-10-10 00:00:00' and '2006-01-01 00:00:00'

Now, I try this query:
SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-01" and "2007-1-1")
And it provides me the records with ID 8 to 14. Just what I expected. But then this query:
SELECT * FROM calllog where (Date(TimeOfCall) between "2006-1-01" and "2007-1-1")
And this query provides me no data. And neither does:
SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-1" and "2007-1-1")
provide me the data I would like to get.

But if I go back a year more and use this:
SELECT * FROM calllog where (Date(TimeOfCall) between "2005-1-01" and "2007-1-1")
And for whatever reason I do get all the data again.

So the flaw is caused by a combination of the year (2006) with a single-digit month and/or day. It's actually quite easy to repeat, as I discovered.
[11 Jan 2006 22:07] Godofredo Miguel Solorzano
Thank you for the bug report and feedback.

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-01" and
    -> "2007-1-1");
+----+---------------------+
| id | TimeOfCall          |
+----+---------------------+
|  8 | 2006-01-01 00:00:00 |
|  9 | 2006-01-01 00:00:00 |
| 10 | 2006-01-01 00:00:00 |
| 11 | 2006-01-01 00:00:00 |
| 12 | 2006-01-01 00:00:00 |
| 13 | 2006-01-01 00:00:00 |
| 14 | 2006-01-01 00:00:00 |
+----+---------------------+
7 rows in set (0.01 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-1-01" and
    -> "2007-1-1")
    -> ;
Empty set (0.00 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2005-1-01" and "2007-1-1");
+----+---------------------+
| id | TimeOfCall          |
+----+---------------------+
|  1 | 2005-10-10 00:00:00 |
|  2 | 2005-10-10 00:00:00 |
|  3 | 2005-10-10 00:00:00 |
|  4 | 2005-10-10 00:00:00 |
|  5 | 2005-10-10 00:00:00 |
|  6 | 2005-10-10 00:00:00 |
|  7 | 2005-10-10 00:00:00 |
|  8 | 2006-01-01 00:00:00 |
|  9 | 2006-01-01 00:00:00 |
| 10 | 2006-01-01 00:00:00 |
| 11 | 2006-01-01 00:00:00 |
| 12 | 2006-01-01 00:00:00 |
| 13 | 2006-01-01 00:00:00 |
| 14 | 2006-01-01 00:00:00 |
+----+---------------------+
14 rows in set (0.00 sec)

mysql> select version();
+------------------+
| version()        |
+------------------+
| 4.1.17-debug-log |
+------------------+
1 row in set (0.00 sec)
------------------------------------------------------------------------------
mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.0.19-debug |
+--------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-01" and
    -> "2007-1-1")
    -> ;
+----+---------------------+
| id | TimeOfCall          |
+----+---------------------+
|  8 | 2006-01-01 00:00:00 |
|  9 | 2006-01-01 00:00:00 |
| 10 | 2006-01-01 00:00:00 |
| 11 | 2006-01-01 00:00:00 |
| 12 | 2006-01-01 00:00:00 |
| 13 | 2006-01-01 00:00:00 |
| 14 | 2006-01-01 00:00:00 |
+----+---------------------+
7 rows in set (0.00 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-1-01" and
    -> "2007-1-1");
Empty set (0.01 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2005-1-01" and "2007-1-1");
+----+---------------------+
| id | TimeOfCall          |
+----+---------------------+
|  1 | 2005-10-10 00:00:00 |
|  2 | 2005-10-10 00:00:00 |
|  3 | 2005-10-10 00:00:00 |
|  4 | 2005-10-10 00:00:00 |
|  5 | 2005-10-10 00:00:00 |
|  6 | 2005-10-10 00:00:00 |
|  7 | 2005-10-10 00:00:00 |
|  8 | 2006-01-01 00:00:00 |
|  9 | 2006-01-01 00:00:00 |
| 10 | 2006-01-01 00:00:00 |
| 11 | 2006-01-01 00:00:00 |
| 12 | 2006-01-01 00:00:00 |
| 13 | 2006-01-01 00:00:00 |
| 14 | 2006-01-01 00:00:00 |
+----+---------------------+
14 rows in set (0.01 sec)
[12 Jan 2006 12:44] Wim W.A. ten Brink
I have to correct something though. As I said earlier, I also get data back if I use a date "2006-01-012" where the day is a three-digit number. But this date gets converted to januari 2 and not januari 12 as I expected. Thus my PHP code needs to make sure month and day are exactly two digits. (Which I thus modified to support this.)

My current table is filled with lots of records with timestamps from december 22, 2005 to januari 12, 2006. I thus tested the following SQL commands:

SELECT Min(Date(TimeOfCall)), Max(Date(TimeOfCall)) FROM calllog where (Date(TimeOfCall) between "2006-1-10" and "2006-1-11")
+-----------------------+-----------------------+
| Min(Date(TimeOfCall)) | Max(Date(TimeOfCall)) |
+-----------------------+-----------------------+
| NULL                  | NULL                  |
+-----------------------+-----------------------+

SELECT Min(Date(TimeOfCall)), Max(Date(TimeOfCall)) FROM calllog where (Date(TimeOfCall) between "2006-1-10" and "2006-01-11")
+-----------------------+-----------------------+
| Min(Date(TimeOfCall)) | Max(Date(TimeOfCall)) |
+-----------------------+-----------------------+
| NULL                  | NULL                  |
+-----------------------+-----------------------+

SELECT Min(Date(TimeOfCall)), Max(Date(TimeOfCall)) FROM calllog where (Date(TimeOfCall) between "2006-01-10" and "2006-1-11")
+-----------------------+-----------------------+
| Min(Date(TimeOfCall)) | Max(Date(TimeOfCall)) |
+-----------------------+-----------------------+
| 2006-01-10            | 2006-01-12            |
+-----------------------+-----------------------+

SELECT Min(Date(TimeOfCall)), Max(Date(TimeOfCall)) FROM calllog where (Date(TimeOfCall) between "2006-01-10" and "2006-01-11")
+-----------------------+-----------------------+
| Min(Date(TimeOfCall)) | Max(Date(TimeOfCall)) |
+-----------------------+-----------------------+
| 2006-01-10            | 2006-01-11            |
+-----------------------+-----------------------+

Do note that only the last query returns the correct result!

Testing it with older data has the same result. I get the same problem in 2005, 2004, etc. as long as there is valid data for that period.
Some more testing:

SELECT distinct Date(TimeOfCall) FROM calllog where (Date(TimeOfCall) = "2006-1-10")
+------------------+
| Date(TimeOfCall) | 
+------------------+
(No resultset)

SELECT distinct Date(TimeOfCall) FROM calllog where (Date(TimeOfCall) = "2006-01-10")
+------------------+
| Date(TimeOfCall) | 
+------------------+
| 2006-01-10       | 
+------------------+

I now wonder if this is really a bug since MySQL expects dates i the "YYYY-MM-DD" format, but I would expect an error message from MySQL in these cases. As it appears now, MySQL is just ignoring this error and returning a bit more random data instead.
[12 Jan 2006 17:22] Godofredo Miguel Solorzano
mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-1-01" and
    -> "2007-1-1");
Empty set (0.00 sec)

mysql>
mysql> SELECT * FROM calllog where TimeOfCall between "2006-1-01" and
    -> "2007-1-1";
+----+---------------------+
| id | TimeOfCall          |
+----+---------------------+
|  8 | 2006-01-01 00:00:00 |
|  9 | 2006-01-01 00:00:00 |
| 10 | 2006-01-01 00:00:00 |
| 11 | 2006-01-01 00:00:00 |
| 12 | 2006-01-01 00:00:00 |
| 13 | 2006-01-01 00:00:00 |
| 14 | 2006-01-01 00:00:00 |
+----+---------------------+
7 rows in set (0.00 sec)
[1 May 2006 11:27] Evgeny Potemkin
The date() type conversion is performed by Item_date_typecast(). After it dates are compared as strings which results in missing rows.
[7 May 2006 12:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6065
[12 May 2006 14:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6305
[13 May 2006 14:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6338
[31 May 2006 20:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7117
[13 Jun 2006 19:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7584
[15 Jun 2006 17:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7712
[16 Jun 2006 23:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7790
[19 Jun 2006 0:12] Evgeny Potemkin
Fixed in 4.1.21, 5.0.23, 5.1.12
[20 Jun 2006 10:52] Evgeny Potemkin
All date/time functions has the STRING result type thus their results are
compared as strings. The string date representation allows a user to skip 
some of leading zeros. This can lead to wrong comparison result if a date/time 
function result is compared to such a string constant.
[4 Jul 2006 8:57] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://www.mysql.com/doc/en/Installing_source_tree.html
[4 Jul 2006 8:58] Jon Stephens
Documented bugfix in 4.1.21/5.0.23/5.1.12 changelogs. Closed.
[13 Aug 2006 18:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10328

ChangeSet@1.2252, 2006-08-13 22:22:27+04:00, evgen@moonbone.local +3 -0
  Fixed bug#15950: NOW() optimized away in VIEWs
  
  This bug is a side-effect of fix for bug#16377. NOW() is optimized in
  BETWEEN to integer constants to speed up query execution. When view is being
  created it saves already modified query and thus became wrong.
  
  The agg_cmp_type() function now substitutes constant result DATE/TIME functions 
  with their result only if the current query isn't CREATE VIEW or SHOW CREATE
  VIEW.
[7 Dec 2006 22:33] Chris Calender
Re-opening.  This was fixed in 4.1.21, but it has reappeared in 4.1.22.

Using test case already provided here:

mysql> select version();
+-------------------------+
| version()               |
+-------------------------+
| 4.1.22-community-nt-log |
+-------------------------+
1 row in set (0.00 sec)

mysql> DROP TABLE IF EXISTS `calllog`;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> CREATE TABLE `calllog` (
    ->   `id` int(11) unsigned NOT NULL auto_increment,
    ->   `TimeOfCall` datetime NOT NULL default '0000-00-00 00:00:00',
    ->   PRIMARY KEY  (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.09 sec)

mysql> INSERT INTO `calllog` (`id`,`TimeOfCall`) VALUES
    ->  (1,'2005-10-10 00:00:00'),
    ->  (2,'2005-10-10 00:00:00'),
    ->  (3,'2005-10-10 00:00:00'),
    ->  (4,'2005-10-10 00:00:00'),
    ->  (5,'2005-10-10 00:00:00'),
    ->  (6,'2005-10-10 00:00:00'),
    ->  (7,'2005-10-10 00:00:00'),
    ->  (8,'2006-01-01 00:00:00'),
    ->  (9,'2006-01-01 00:00:00'),
    ->  (10,'2006-01-01 00:00:00'),
    ->  (11,'2006-01-01 00:00:00'),
    ->  (12,'2006-01-01 00:00:00'),
    ->  (13,'2006-01-01 00:00:00'),
    ->  (14,'2006-01-01 00:00:00');
Query OK, 14 rows affected (0.00 sec)
Records: 14  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-01" and
    -> "2007-1-1");
+----+---------------------+
| id | TimeOfCall          |
+----+---------------------+
| 8  | 2006-01-01 00:00:00 |
| 9  | 2006-01-01 00:00:00 |
| 10 | 2006-01-01 00:00:00 |
| 11 | 2006-01-01 00:00:00 |
| 12 | 2006-01-01 00:00:00 |
| 13 | 2006-01-01 00:00:00 |
| 14 | 2006-01-01 00:00:00 |
+----+---------------------+
7 rows in set (0.05 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-1-01" and
    -> "2007-1-1");
Empty set (0.00 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-1" and
    -> "2007-1-1");
Empty set (0.00 sec)

--

I'm guessing this may have just gotten left out of 4.1.22.

I haven't checked 5.0.x yet, but will do that next.
[7 Dec 2006 22:53] Chris Calender
BTW, the last 2 queries from above should have returned the same results as the first query (as it does in 4.1.21).

Also, I tested this with 5.0.30, and I also see the buggy behavior:

mysql> select version();
+--------------------------+
| version()                |
+--------------------------+
| 5.0.30-enterprise-gpl-nt |
+--------------------------+
1 row in set (0.00 sec)

mysql> DROP TABLE IF EXISTS `calllog`;
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql> CREATE TABLE `calllog` (
    ->   `id` int(11) unsigned NOT NULL auto_increment,
    ->   `TimeOfCall` datetime NOT NULL default '0000-00-00 00:00:00',
    ->   PRIMARY KEY  (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.08 sec)

mysql> INSERT INTO `calllog` (`id`,`TimeOfCall`) VALUES
    ->  (1,'2005-10-10 00:00:00'),
    ->  (2,'2005-10-10 00:00:00'),
    ->  (3,'2005-10-10 00:00:00'),
    ->  (4,'2005-10-10 00:00:00'),
    ->  (5,'2005-10-10 00:00:00'),
    ->  (6,'2005-10-10 00:00:00'),
    ->  (7,'2005-10-10 00:00:00'),
    ->  (8,'2006-01-01 00:00:00'),
    ->  (9,'2006-01-01 00:00:00'),
    ->  (10,'2006-01-01 00:00:00'),
    ->  (11,'2006-01-01 00:00:00'),
    ->  (12,'2006-01-01 00:00:00'),
    ->  (13,'2006-01-01 00:00:00'),
    ->  (14,'2006-01-01 00:00:00');
Query OK, 14 rows affected (0.02 sec)
Records: 14  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-01" and
    -> "2007-1-1");
+----+---------------------+
| id | TimeOfCall          |
+----+---------------------+
| 8  | 2006-01-01 00:00:00 |
| 9  | 2006-01-01 00:00:00 |
| 10 | 2006-01-01 00:00:00 |
| 11 | 2006-01-01 00:00:00 |
| 12 | 2006-01-01 00:00:00 |
| 13 | 2006-01-01 00:00:00 |
| 14 | 2006-01-01 00:00:00 |
+----+---------------------+
7 rows in set (0.01 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-1-01" and
    -> "2007-1-1");
Empty set (0.02 sec)

mysql> SELECT * FROM calllog where (Date(TimeOfCall) between "2006-01-1" and
    -> "2007-1-1");
Empty set (0.00 sec)
[12 Dec 2006 11:55] Evgeny Potemkin
The patch for this bug was removed. This bug isn't an easy one and the correct fix will significantly affect the server internals. Thus, in order to preserve stability the correct fix will be available only in 5.2+.

In order to be compared correctly dates like '2006-1-1', '2006-01-1', etc should
be casted to dates like 'BETWEEN CAST('2006-01-1' AS DATE) AND CAST('2007-1-1')'.
[26 Apr 2007 20:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/25561

ChangeSet@1.2440, 2007-04-27 00:40:35+04:00, evgen@moonbone.local +6 -0
  Bug#16377: Wrong DATE/DATETIME comparison in BETWEEN function.
  
  The BETWEEN function was comparing DATE/DATETIME values either as ints or as
  strings. Both methods have their disadvantages and may lead to a wrong
  result.
  
  Now BETWEEN function checks whether all of its arguments has the STRING result
  types and at least one of them is a DATE/DATETIME item. If so it sets up
  two Arg_comparator obects to compare with the compare_datetime() comparator
  and uses them to compare such items.
  
  Added two Arg_comparator object members and one flag to the
  Item_func_between class for the correct DATE/DATETIME comparison.
  The Item_func_between::fix_length_and_dec() function now detects whether
  it's used for DATE/DATETIME comparison and sets up newly added Arg_comparator
  objects to do this.
  The Item_func_between::val_int() now uses Arg_comparator objects to perform
  correct DATE/DATETIME comparison.
  The owner variable of the Arg_comparator class now can be set to NULL if the
  caller wants to handle NULL values by itself.
  Now the Item_date_add_interval::get_date() function ajusts cached_field type according to the detected type.
[30 Apr 2007 12:58] Bugs System
Pushed into 5.0.42
[30 Apr 2007 12:59] Bugs System
Pushed into 5.1.18-beta
[1 May 2007 17:51] Paul Dubois
This bug was originally fixed in 4.1, 5.0, and 5.1.
The reappearance of the bug, according to the later
comments, was fixed only in 5.0 and 5.1.  Was this
bug not re-fixed in 4.1?
[2 May 2007 13:10] Evgeny Potemkin
The previous fix for this bug was removed from all versions as it made some things too complex and the bug was moved to 'to be fixed later' status. The new patch doesn't affect internals as much as the previous one and is pushed to 5.0+ only.
[2 May 2007 19:08] Paul Dubois
Noted in 5.0.42, 5.1.18 changelogs.

The omission of leading zeros in dates could lead to erroneous 
results when these were compared with the output of certain date and
time functions.

The patch in 4.1.21 was reverted in 4.1.22. I have added a note
to that effect in the 4.1.22 changelog.
[4 Jun 2007 21:21] Bugs System
Pushed into 5.1.20-beta
[4 Jun 2007 21:23] Bugs System
Pushed into 5.0.44
[14 Jun 2007 19:01] Bugs System
Pushed into 5.0.44
[14 Jun 2007 19:01] Bugs System
Pushed into 5.1.20-beta