Bug #12934 Select without `where` does not show row that same select with `where` returns
Submitted: 1 Sep 2005 17:42 Modified: 8 Sep 2005 23:16
Reporter: Oleg Marchuk Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:4.1.11-standard, 4.1.14-nt OS:Linux (Linux, Windows)
Assigned to: Sergey Petrunya CPU Architecture:Any

[1 Sep 2005 17:42] Oleg Marchuk
Description:
Last rows of full query (ordered by col_0_0_):
+----------+
| col_0_0_ |
+----------+
|     1802 |
|     1803 |
|     1813 |
|     1814 |
+----------+

Result of query with `where`:
+----------+
| col_0_0_ |
+----------+
|     1853 |
+----------+

How to repeat:
Use DB dump and queries from attached test case.
[1 Sep 2005 17:51] Oleg Marchuk
Actual results:

mysql> \. export.sql
+----------+
| col_0_0_ |
+----------+
|      107 |
|      389 |
|     1392 |
|     1393 |
|     1555 |
|     1740 |
|     1790 |
|     1798 |
|     1799 |
|     1813 |
+----------+
10 rows in set (0.03 sec)

+----------+
| col_0_0_ |
+----------+
|     1853 |
+----------+
1 row in set (0.00 sec)
[2 Sep 2005 12:45] Valeriy Kravchuk
Verified on 4.1.14 with provided test data:

mysql>  select distinct commercial0_.id as col_0_0_
    ->  from commercials commercial0_
    ->  inner join
    ->  appendixes appendixes1_ on commercial0_.id=appendixes1_.commercial
    ->  inner join
    ->  schedule schedule2_ on appendixes1_.id=schedule2_.appendix
    ->  inner join
    ->  periods period3_ on schedule2_.period=period3_.id
    ->  group by period3_.id , schedule2_.theDay
    ->  order by commercial0_.id, period3_.id, schedule2_.theDay ;
+----------+
| col_0_0_ |
+----------+
|      107 |
|      389 |
|     1392 |
|     1393 |
|     1555 |
|     1740 |
|     1790 |
|     1798 |
|     1799 |
|     1813 |
+----------+
10 rows in set (0.30 sec)

mysql>  select distinct commercial0_.id as col_0_0_
    ->  from commercials commercial0_
    ->  inner join
    ->  appendixes appendixes1_ on commercial0_.id=appendixes1_.commercial
    ->  inner join
    ->  schedule schedule2_ on appendixes1_.id=schedule2_.appendix
    ->  inner join
    ->  periods period3_ on schedule2_.period=period3_.id
    ->  where  appendixes1_.id = 1928
    ->  group by period3_.id , schedule2_.theDay
    ->  order by commercial0_.id, period3_.id, schedule2_.theDay ;
+----------+
| col_0_0_ |
+----------+
|     1853 |
+----------+
1 row in set (0.01 sec)

This may help also:

mysql> explain  select distinct commercial0_.id as col_0_0_
    ->  from commercials commercial0_
    ->  inner join
    ->  appendixes appendixes1_ on commercial0_.id=appendixes1_.commercial
    ->  inner join
    ->  schedule schedule2_ on appendixes1_.id=schedule2_.appendix
    ->  inner join
    ->  periods period3_ on schedule2_.period=period3_.id
    ->  group by period3_.id , schedule2_.theDay
    ->  order by commercial0_.id, period3_.id, schedule2_.theDay ;
+----+-------------+--------------+--------+------------------------------------
------------+--------------------+---------+------------------------------+-----
-+----------------------------------------------+
| id | select_type | table        | type   | possible_keys
            | key                | key_len | ref                          | rows
 | Extra                                        |
+----+-------------+--------------+--------+------------------------------------
------------+--------------------+---------+------------------------------+-----
-+----------------------------------------------+
|  1 | SIMPLE      | period3_     | index  | PRIMARY
            | PRIMARY            |       4 | NULL                         |   73
 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | schedule2_   | ref    | appendix,FKD6669297ACA91414,FKD6669
297F8E53C84 | FKD6669297F8E53C84 |       5 | test.period3_.id             |   45
 | Using where                                  |
|  1 | SIMPLE      | appendixes1_ | eq_ref | PRIMARY,FK80853C37913B7696
            | PRIMARY            |       4 | test.schedule2_.appendix     |    1
 |                                              |
|  1 | SIMPLE      | commercial0_ | eq_ref | PRIMARY
            | PRIMARY            |       4 | test.appendixes1_.commercial |    1
 | Using index                                  |
+----+-------------+--------------+--------+------------------------------------
------------+--------------------+---------+------------------------------+-----
-+----------------------------------------------+
4 rows in set (0.02 sec)

mysql> explain  select distinct commercial0_.id as col_0_0_
    ->  from commercials commercial0_
    ->  inner join
    ->  appendixes appendixes1_ on commercial0_.id=appendixes1_.commercial
    ->  inner join
    ->  schedule schedule2_ on appendixes1_.id=schedule2_.appendix
    ->  inner join
    ->  periods period3_ on schedule2_.period=period3_.id
    ->  where  appendixes1_.id = 1928
    ->  group by period3_.id , schedule2_.theDay
    ->  order by commercial0_.id, period3_.id, schedule2_.theDay ;
+----+-------------+--------------+--------+------------------------------------
------------+----------+---------+------------------------+------+--------------
----------------------+
| id | select_type | table        | type   | possible_keys
            | key      | key_len | ref                    | rows | Extra
                      |
+----+-------------+--------------+--------+------------------------------------
------------+----------+---------+------------------------+------+--------------
----------------------+
|  1 | SIMPLE      | appendixes1_ | const  | PRIMARY,FK80853C37913B7696
            | PRIMARY  |       4 | const                  |    1 | Using tempora
ry; Using filesort    |
|  1 | SIMPLE      | commercial0_ | const  | PRIMARY
            | PRIMARY  |       4 | const                  |    1 | Using index
                      |
|  1 | SIMPLE      | schedule2_   | ref    | appendix,FKD6669297ACA91414,FKD6669
297F8E53C84 | appendix |       5 | const                  |   24 | Using where;
Using index; Distinct |
|  1 | SIMPLE      | period3_     | eq_ref | PRIMARY
            | PRIMARY  |       4 | test.schedule2_.period |    1 | Using index;
Distinct              |
+----+-------------+--------------+--------+------------------------------------
------------+----------+---------+------------------------+------+--------------
----------------------+
4 rows in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 4.1.14-nt |
+-----------+
1 row in set (0.01 sec)
[8 Sep 2005 23:16] Sergey Petrunya
Looking at the queries 
select distinct commercial0_.id as col_0_0_ 
 from commercials commercial0_ 
 inner join 
 appendixes appendixes1_ on commercial0_.id=appendixes1_.commercial 
 inner join 
 schedule schedule2_ on appendixes1_.id=schedule2_.appendix 
 inner join 
 periods period3_ on schedule2_.period=period3_.id 
 group by period3_.id , schedule2_.theDay 
 order by commercial0_.id, period3_.id, schedule2_.theDay ; -- (1)

and 

 select distinct commercial0_.id as col_0_0_ 
 from commercials commercial0_ 
 inner join 
 appendixes appendixes1_ on commercial0_.id=appendixes1_.commercial 
 inner join 
 schedule schedule2_ on appendixes1_.id=schedule2_.appendix 
 inner join 
 periods period3_ on schedule2_.period=period3_.id 
 where  appendixes1_.id = 1928 
 group by period3_.id , schedule2_.theDay 
 order by commercial0_.id, period3_.id, schedule2_.theDay ; --(2)

it seems that it is ok that (2) returns values that (1) doesnt.

The value of commercial0_.id is not part of GROUP BY clause, so for each
{period3_.id , schedule2_.theDay} value group we get *any* value of
commercial0_.id within that group (this is documented, see 
http://dev.mysql.com/doc/mysql/en/group-by-hidden-fields.html)

Running this query

select count( distinct commercial0_.id) from commercials commercial0_ 
 inner join 
 appendixes appendixes1_ on commercial0_.id=appendixes1_.commercial 
 inner join 
 schedule schedule2_ on appendixes1_.id=schedule2_.appendix 
 inner join 
 periods period3_ on schedule2_.period=period3_.id 
 group by period3_.id , schedule2_.theDay ;

one sees that almost every {period3_.id, schedule2_.theDay} value group has 
several different values of commercial0_.id. 

When one adds 'where  appendixes1_.id = 1928' clause all value groups have 
commercial0_.id=1853 for all rows, and so the query (2) returns 1853 after 
DISTINCT has  been applied.

Without the 'where  appendixes1_.id = 1928' every single value group that
has a row with commercial0_.id=1853 also has rows with other values of 
commercial0_.id. It happens so that rows with other values of commercial0_.id
get "choosen" (see the above mentioned URL), and so the query (1) doesnt
return 1853.

Considering the above, the query output seems to be correct. Oleg, feel free to re-open the bug if there are any other reasons why it is not.