Bug #50190 | MAX() and MIN() failing for time values with Innodb | ||
---|---|---|---|
Submitted: | 8 Jan 2010 16:39 | Modified: | 8 Feb 2010 11:33 |
Reporter: | Patrick Crews | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 6.0-codebase,pe,next-mr | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | Aggregate Function, innodb, MAX, regression |
[8 Jan 2010 16:39]
Patrick Crews
[8 Jan 2010 16:43]
Patrick Crews
EXPLAIN output: * Innodb: EXPLAIN SELECT MAX( `col_time_key` ) , `col_varchar_key` field3 FROM CC GROUP BY field3 ; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE CC index NULL col_varchar_key 9 NULL 20 * MyISAM: EXPLAIN SELECT MAX( `col_time_key` ) , `col_varchar_key` field3 FROM CC GROUP BY field3 ; id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE CC ALL NULL NULL NULL NULL 20 Using temporary; Using filesort
[26 Jan 2010 12:26]
Jørgen Løland
Simplified test case: --source include/have_innodb.inc CREATE TABLE t1 ( col_time time, i int, INDEX (i) ) engine=innodb; INSERT INTO t1 VALUES ('01:27:35',1); # (1) INSERT INTO t1 VALUES ('06:11:01',1); INSERT INTO t1 VALUES ('19:53:05',2); # (2) INSERT INTO t1 VALUES ('21:44:25',2); INSERT INTO t1 VALUES ('10:55:12',3); # (3) INSERT INTO t1 VALUES ('05:45:11',3); INSERT INTO t1 VALUES ('00:25:00',4); INSERT INTO t1 VALUES ('19:53:03',5); INSERT INTO t1 VALUES ('06:01:40',5); INSERT INTO t1 VALUES ('01:27:36',5); # (4) SELECT MAX(col_time), i FROM t1 GROUP BY i; EXPLAIN SELECT MAX(col_time), i FROM t1 GROUP BY i; DROP TABLE t1; # (1) If this record is removed, the entire result set becomes # correct # (2) Reordering the two records with i=2 produces wrong result # (3) Reordering the two records with i=3 produces correct result # (4) Note that time is 1 second after (1). If time is 1 second # earlier, 06:01:40 (which is also wrong) is shown instead.
[26 Jan 2010 12:28]
Jørgen Løland
MyISAM query output: -------------------- SELECT MAX(col_time), i FROM t1 GROUP BY i; MAX(col_time) i 06:11:01 1 21:44:25 2 10:55:12 3 00:25:00 4 19:53:03 5 InnoDB query output (diff vs MyISAM): ------------------------------------- MAX(col_time) i 06:11:01 1 21:44:25 2 -10:55:12 3 +05:45:11 3 00:25:00 4 -19:53:03 5 +01:27:36 5
[4 Feb 2010 14:52]
Jørgen Løland
Also fails with datetime: CREATE TABLE t1 ( col_time datetime, i int, INDEX (i) ) engine=innodb; INSERT INTO t1 VALUES ('2009-11-11 01:27:35',1); INSERT INTO t1 VALUES ('2009-11-11 06:11:01',1); INSERT INTO t1 VALUES ('2009-11-11 19:53:05',2); INSERT INTO t1 VALUES ('2009-11-11 21:44:25',2); INSERT INTO t1 VALUES ('2009-11-11 10:55:12',3); INSERT INTO t1 VALUES ('2009-11-11 05:45:11',3); INSERT INTO t1 VALUES ('2009-11-11 00:25:00',4); INSERT INTO t1 VALUES ('2009-11-11 19:53:03',5); INSERT INTO t1 VALUES ('2009-11-11 06:01:40',5); INSERT INTO t1 VALUES ('2009-11-11 01:27:36',5); SELECT MAX(col_time), i FROM t1 GROUP BY i; MAX(col_time) i 2009-11-11 06:11:01 1 2009-11-11 21:44:25 2 2009-11-11 05:45:11 3 2009-11-11 00:25:00 4 2009-11-11 01:27:36 5
[4 Feb 2010 15:36]
Jørgen Løland
The problem is this: * Item_sum_hybrid (which Item_sum_max inherits) has two items: the Item representing the column to aggregate over, and an Item_cache_datetime to store the max value (or min value, in case of MIN() aggregate function, which also fails BTW) * Item_sum_hybrid points to a comparison function, which is Arg_comparator::compare_datetime * The Arg_comparator, created and set up by Item_sum_hybrid in setup(), takes two important arguments: the Item and the Item_cache_datetime of Item_sum_hybrid. These are stored as Item *a and Item *b, respectively. * end_send_group() calls Item_sum_max::add(), which evaluates the column value of the current row, and which updates the Item_cache_datetime if the current value is higher than the cached value. * Item_sum_max::add() calls the comparison function compare_datetime(), which calls get_time_value(). get_time_value() creates an Item_cache_int, used to *overwrites the b pointer of Arg_comparator*. However, the Item_cache_datetime of the Item_sum_hybrid remains unchanged. Now we have two caches for the same Item. * In Item_sum_max::add, if cmp->compare()>0 ("this row has ha higher value than the cached one"), the cache item of Item_sum_hybrid is updated with then new highest value. However, the cache pointed to by the b pointer in Arg_comparator is not updated since this is a different object. Thus, the cached value in b will never change; it will contain the value from the first row evaluated throughout query execution (in this case, 01:27:35).
[5 Feb 2010 9:57]
Jørgen Løland
This problem was introduced when BUG#43668 was merged from 5.1 into 6.0. In 5.1, the Item_sum_max object caches the highest value in an Item_cache_string (Item_cache_datetime in 6.0). Further, in 5.1, the comparison function chosen is compare_binary_string(). This comparison function does not create it's own cache item that overwrites b (in contrast to compare_datetime which is used in 6.0).
[5 Feb 2010 16:31]
Jørgen Løland
Depends on BUG#49771.
[8 Feb 2010 7:55]
Jørgen Løland
Suggested solution: There's no point in creating a cache on top of a cache, so get_time_value() and get_datetime_value() should not create the Item_cache_int on top of the Item_cache_datetime in this case. The trick of setting the Item_cache_datetime::used_table_map to 1 (makes const_item() return false) can be used.
[8 Feb 2010 11:33]
Evgeny Potemkin
A duplicate of the bug#49771.