Bug #16710 select distinct doesn't return all it should
Submitted: 22 Jan 2006 16:35 Modified: 6 Apr 2006 13:24
Reporter: John Sinteur Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.13-5.0.17 OS:FreeBSD (FreeBSD 5.4-6.0)
Assigned to: Timour Katchaounov CPU Architecture:Any

[22 Jan 2006 16:35] John Sinteur
Description:
Compare the results to these two queries:

mysql> SELECT distinct(year(timestamp)),klant FROM logging WHERE YEAR(timestamp) = 2005;
+-------------------+-----------------+
| (year(timestamp)) | klant           |
+-------------------+-----------------+
|              2005 |                 |
|              2005 | dokterdokter.nl |
|              2005 | thevbros        |
+-------------------+-----------------+
3 rows in set (0.30 sec)

mysql> SELECT distinct(klant) FROM logging WHERE YEAR(timestamp) = 2005;
+----------+
| klant    |
+----------+
|          |
| thevbros |
+----------+
2 rows in set (0.01 sec)                                                                                                                                                     

The second query should also list "dokterdokter.nl", yet it doesn't. Works fine in version 4.1.8, was discovered when testing after upgrade.

First query is the workaround for the second query.

that's in this table:

CREATE TABLE `logging` (
  `id` int(11) NOT NULL auto_increment,
  `klant` varchar(100) default NULL,
  `code` int(11) default NULL,
  `paragraph` varchar(30) default NULL,
  `verzekeraar` varchar(100) default NULL,
  `session_id` varchar(100) default NULL,
  `timestamp` datetime default NULL,
  `ip_address_request` varchar(20) default NULL,
  PRIMARY KEY  (`id`),
  KEY `klant` (`klant`,`timestamp`),
  KEY `klant_2` (`klant`,`verzekeraar`,`timestamp`),
  KEY `session_id` (`session_id`,`timestamp`) 
) ENGINE=MyISAM DEFAULT CHARSET=latin1;           

                           

How to repeat:
[22 Jan 2006 16:50] John Sinteur
just tested in 5.0.18 - also fails
[23 Jan 2006 1:52] Hartmut Holzgraefe
Works fine on 4.1, fails for me on rather recent 5.0bk source.

This also fails to produce the "dokterdokter.nl" row:

  SELECT klant FROM logging WHERE YEAR(timestamp) = 2005 GROUP BY klant;

when checking the output of 

  SELECT klant FROM logging WHERE YEAR(timestamp) = 2005;

i can see lots of "dokterdokter.nl" result rows.
[16 Feb 2006 8:13] Timour Katchaounov
The problem with this query is that the optimizer chooses loose index scan,
while it shouldn't, because the where clause should be applied before the
group by (or distinct) clause.
[17 Feb 2006 14:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/2795
[30 Mar 2006 20:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/4343
[31 Mar 2006 9:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/4357
[31 Mar 2006 19:51] Timour Katchaounov
Fixed in 5.0.21
[6 Apr 2006 13:24] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented bugfix in 5.0.21 changelog. Closed.