Bug #71097 | Wrong results for a simple query with GROUP BY | ||
---|---|---|---|
Submitted: | 5 Dec 2013 18:26 | Modified: | 6 Jun 2014 2:12 |
Reporter: | Valeriy Kravchuk | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 5.1.70, 5.5.33, 5.5.35, 5.6.14, 5.6.15 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | GROUP BY |
[5 Dec 2013 18:26]
Valeriy Kravchuk
[5 Dec 2013 18:38]
Valeriy Kravchuk
From what I see, with 5.1.70 and MyISAM the result is the same - wrong: openxs@ao756:~/dbs/5.1$ bin/mysql --no-defaults -uroot test Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.1.70 Source distribution Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE TABLE ti ( `id` int(10) unsigned NOT NULL, `ns_id` int(10) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (`ns_id`,`id`)) engine=InnoDB; Query OK, 0 rows affected, 2 warnings (0.08 sec) mysql> show warnings\G *************************** 1. row *************************** Level: Warning Code: 1286 Message: Unknown table engine 'InnoDB' *************************** 2. row *************************** Level: Warning Code: 1266 Message: Using storage engine MyISAM for table 'ti' 2 rows in set (0.00 sec) mysql> insert into ti values(1,1), (2,2), (1042000,501087), (5435626,504005); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> select * from ti where (ns_id = 501087 and id = 1042000) or (ns_id = 504005 and id = 5435626); +---------+--------+ | id | ns_id | +---------+--------+ | 1042000 | 501087 | | 5435626 | 504005 | +---------+--------+ 2 rows in set (0.00 sec) mysql> select ns_id, max(id) from ti where (ns_id = 501087 and id = 1042000) or (ns_id = 504005 and id = 5435626) group by ns_id; +--------+---------+ | ns_id | max(id) | +--------+---------+ | 501087 | 1042000 | +--------+---------+ 1 row in set (0.00 sec) mysql> explain select ns_id, max(id) from ti where (ns_id = 501087 and id = 1042000) or (ns_id = 504005 and id = 5435626) group by ns_id; +----+-------------+-------+-------+---------------+---------+---------+------+------+---------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+------+------+---------------------------------------+ | 1 | SIMPLE | ti | range | PRIMARY | PRIMARY | 8 | NULL | 2 | Using where; Using index for group-by | +----+-------------+-------+-------+---------------+---------+---------+------+------+---------------------------------------+ 1 row in set (0.00 sec) So, does not looks like a recent regression at least.
[6 Dec 2013 10:57]
MySQL Verification Team
Hello Valeriy, Thank you for the bug report and test case. Verified as described. Thanks, Umesh
[6 Dec 2013 11:24]
Valeriy Kravchuk
Corrected typo in Synopsis.
[6 Jun 2014 2:12]
Paul DuBois
Noted in 5.5.39, 5.6.20, 5.7.5 changelogs. If there was a predicate on a column referenced by MIN() or MAX() and that predicate was not present in all the disjunctions on key parts earlier in the compound index, Loose Index Scan returned an incorrect result.
[1 Aug 2014 15:55]
Laurynas Biveinis
5.5 $ bzr log -r 4635 ------------------------------------------------------------ revno: 4635 committer: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com> branch nick: mysql-5.5 timestamp: Wed 2014-05-07 14:59:23 +0530 message: Bug#17909656 - WRONG RESULTS FOR A SIMPLE QUERY WITH GROUP BY Problem: If there is a predicate on a column referenced by MIN/MAX and that predicate is not present in all the disjunctions on keyparts earlier in the compound index, Loose Index Scan will not return correct result. Analysis: When loose index scan is chosen, range optimizer currently groups all the predicates that contain group parts separately and minmax parts separately. It therefore applies all the conditions on the group parts first to the fetched row. Then in the call to next_max, it processes the conditions which have min/max keypart. For ex in the following query: Select f1, max(f2) from t1 where (f1 = 10 and f2 = 13) or (f1 = 3) group by f1; Condition (f2 = 13) would be applied even for rows that satisfy (f1 = 3) thereby giving wrong results. Solution: Do not choose loose_index_scan for such cases. So a new rule WA2 is introduced to take care of the same. WA2: "If there are predicates on C, these predicates must be in conjuction to all predicates on all earlier keyparts in I." Todo the same, fix reuses the function get_constant_key_infix(). Since this funciton will fail for all multi-range conditions, it is re-written to recognize that if the sub-conditions are equivalent across the disjuncts: it will now succeed. And to achieve this a new helper function is introduced called all_same(). The fix also moves the test of NGA3 up to the former only caller, get_constant_key_infix().
[1 Aug 2014 15:55]
Laurynas Biveinis
5.5 laurynas$ bzr log -r 4636 ------------------------------------------------------------ revno: 4636 committer: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com> branch nick: mysql-5.5 timestamp: Wed 2014-05-07 16:55:03 +0530 message: Fixing compilation error. Post push fix for Bug#17909656