Bug #68072 MySQL 5.1 issue with timediff, group by on multiple columns, join, primary key
Submitted: 12 Jan 2013 18:02 Modified: 13 Jan 2013 15:02
Reporter: Gergely Kutenics Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1, 5.5 OS:Any (Win64, CentOS Linux)
Assigned to: CPU Architecture:Any
Tags: datetime, GROUP BY, GROUP_BY, join, min, SELECT, timediff

[12 Jan 2013 18:02] Gergely Kutenics
Description:
The attached query works as expected on MySQL 5.0 but returns strange results on 5.1 and 5.5.

I tested this on 7 servers including 5.0.96 and 5.1.67 (both are GA available today on mysql.com), and multiple 5.5.x builds. Enabling ONLY_FULL_GROUP_BY does not affect the behavior. The problem comes up with both InnoDB and MyISAM.

I compared the documentation of 5.0 and 5.1 on select, join, primary key, GROUP_BY, min() datetime, and timediff(), and found no information that explains this.

How to repeat:
drop table if exists test_entry_card_events;
create table test_entry_card_events (card_id int not null, event_time datetime not null, event_type enum('entry','exit') not null,
primary key (card_id,event_time,event_type));

insert into test_entry_card_events (card_id,event_time,event_type) values
(4403,'2012.12.04 12:10','entry'),
(4403,'2012.12.05 12:37','entry'),
(4403,'2012.12.05 19:55','exit'),
(4403,'2012.12.06 12:34','entry'),
(4403,'2012.12.06 18:49','exit'),
(4403,'2012.12.20 13:02','entry'),
(4403,'2012.12.20 18:03','exit'),
(4403,'2012.12.21 10:41','entry'),
(4403,'2012.12.21 17:03','exit');

/* unexpected result on MySQL 5.1, 5.5; correct on MySQL 5.0 */
select a.card_id,a.event_time,min(b.event_time),timediff(min(b.event_time),a.event_time) as d
from test_entry_card_events a join test_entry_card_events b on a.card_id=b.card_id and a.event_type='entry' and b.event_type='exit'
and b.event_time>a.event_time group by a.card_id,a.event_time order by a.card_id,a.event_time;

Suggested fix:
I think that MySQL 5.1 (and above) should do what MySQL 5.0 does.

(There are several workarounds: either is to place the join in a nested select query, another is to group by only one column, a third one is to define a primary key that does not include event_type. The problem does not come up when decimal or double data type and subtraction are used in place of datetime and timediff.)
[12 Jan 2013 19:00] Valeriy Kravchuk
This is what I get with 5.5.29 on Windows:

C:\Program Files\MySQL\MySQL Server 5.5\bin>mysql -uroot -proot -P3312 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 57
Server version: 5.5.29 MySQL Community Server (GPL)

Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table if exists test_entry_card_events;
Query OK, 0 rows affected, 1 warning (0.53 sec)

mysql> create table test_entry_card_events (card_id int not null, event_time dat
etime not null, event_type enum('entry','exit') not null,
    -> primary key (card_id,event_time,event_type));
Query OK, 0 rows affected (1.92 sec)

mysql> insert into test_entry_card_events (card_id,event_time,event_type) values

    -> (4403,'2012.12.04 12:10','entry'),
    -> (4403,'2012.12.05 12:37','entry'),
    -> (4403,'2012.12.05 19:55','exit'),
    -> (4403,'2012.12.06 12:34','entry'),
    -> (4403,'2012.12.06 18:49','exit'),
    -> (4403,'2012.12.20 13:02','entry'),
    -> (4403,'2012.12.20 18:03','exit'),
    -> (4403,'2012.12.21 10:41','entry'),
    -> (4403,'2012.12.21 17:03','exit');
Query OK, 9 rows affected (0.30 sec)
Records: 9  Duplicates: 0  Warnings: 0

mysql> select a.card_id,a.event_time,min(b.event_time),timediff(min(b.event_time
),a.event_time) as d
    -> from test_entry_card_events a join test_entry_card_events b on a.card_id=
b.card_id and a.event_type='entry' and b.event_type='exit'
    -> and b.event_time>a.event_time group by a.card_id,a.event_time order by a.
card_id,a.event_time;
+---------+---------------------+---------------------+------------+
| card_id | event_time          | min(b.event_time)   | d          |
+---------+---------------------+---------------------+------------+
|    4403 | 2012-12-04 12:10:00 | 2012-12-05 19:55:00 | 07:18:00   |
|    4403 | 2012-12-05 12:37:00 | 2012-12-05 19:55:00 | -16:39:00  |
|    4403 | 2012-12-06 12:34:00 | 2012-12-06 18:49:00 | -330:13:00 |
|    4403 | 2012-12-20 13:02:00 | 2012-12-20 18:03:00 | -16:38:00  |
|    4403 | 2012-12-21 10:41:00 | 2012-12-21 17:03:00 | 00:00:00   |
+---------+---------------------+---------------------+------------+
5 rows in set (0.25 sec)

The results above look really wrong. Now EXPLAIN:

mysql> explain select a.card_id,a.event_time,min(b.event_time),timediff(min(b.ev
ent_time),a.event_time) as d
    -> from test_entry_card_events a join test_entry_card_events b on a.card_id=
b.card_id and a.event_type='entry' and b.event_type='exit'
    -> and b.event_time>a.event_time group by a.card_id,a.event_time order by a.
card_id,a.event_time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: index
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 13
          ref: NULL
         rows: 2
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.a.card_id
         rows: 4
        Extra: Using where; Using index
2 rows in set (0.05 sec)

Now, let's just drop primary key:

mysql> alter table test_entry_card_events drop primary key;
Query OK, 9 rows affected (1.06 sec)
Records: 9  Duplicates: 0  Warnings: 0

mysql> select a.card_id,a.event_time,min(b.event_time),timediff(min(b.event_time
),a.event_time) as d
    -> from test_entry_card_events a join test_entry_card_events b on a.card_id=
b.card_id and a.event_type='entry' and b.event_type='exit'
    -> and b.event_time>a.event_time group by a.card_id,a.event_time order by a.
card_id,a.event_time;
+---------+---------------------+---------------------+----------+
| card_id | event_time          | min(b.event_time)   | d        |
+---------+---------------------+---------------------+----------+
|    4403 | 2012-12-04 12:10:00 | 2012-12-05 19:55:00 | 31:45:00 |
|    4403 | 2012-12-05 12:37:00 | 2012-12-05 19:55:00 | 07:18:00 |
|    4403 | 2012-12-06 12:34:00 | 2012-12-06 18:49:00 | 06:15:00 |
|    4403 | 2012-12-20 13:02:00 | 2012-12-20 18:03:00 | 05:01:00 |
|    4403 | 2012-12-21 10:41:00 | 2012-12-21 17:03:00 | 06:22:00 |
+---------+---------------------+---------------------+----------+
5 rows in set (0.14 sec)

The results above looks correct. But, surely, with index or without, the results should be the same :)
[13 Jan 2013 15:02] Shane Bester
This bug affects 5.1.46 and higher.
5.6 and 5.7 are not affected.
[13 Jan 2013 15:02] Shane Bester
outputs

Attachment: bug68072_versions_test.txt (text/plain), 14.04 KiB.