Bug #7672 order by sometimes causes rows to be excluded from results
Submitted: 5 Jan 2005 0:23 Modified: 19 Oct 2005 14:27
Reporter: John McCaskey Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.0.23 OS:Linux (Linux, OS X)
Assigned to: Evgeny Potemkin CPU Architecture:Any

[5 Jan 2005 0:23] John McCaskey
Description:
If a select query is contained in (SELECT  ..) and then an order by follows not all values will be returned, but only DISTINCT values.  The query doesn't ask for only DISTINCT, and the order by should not imply DISTINCT.  Furthermore, in other instances the order by does still return all rows, not just the DISTINCT.

See the reproduction box for details.

How to repeat:
CREATE TABLE `event_log_5minute_20050104` (
  `server_id` tinyint(3) unsigned NOT NULL default '0',
  `monitor_id` mediumint(8) unsigned NOT NULL default '0',
  `monitor_server_id` tinyint(3) unsigned NOT NULL default '0',
  `timestamp` timestamp(14) NOT NULL,
  `avg` float default NULL,
  PRIMARY KEY  (`monitor_id`,`monitor_server_id`,`timestamp`,`server_id`)
) TYPE=InnoDB;

insert into event_log_5minute_20050104 VALUES (50, 479139, 1, 20050104211000, NULL), (50, 479139, 1, 20050104211500, NULL), (50, 479139, 1, 20050104212000, NULL), (50, 479139, 1, 20050104212500, 232.1), (50, 479139, 1, 20050104213000, 232.1);

(SELECT * FROM event_log_5minute_20050104 WHERE monitor_id=479139 AND monitor_server_id=1 AND timestamp >= 20050104080000 AND timestamp < 20050105080000);

Shows all rows:
+-----------+------------+-------------------+----------------+-------+
| server_id | monitor_id | monitor_server_id | timestamp      | avg   |
+-----------+------------+-------------------+----------------+-------+
|        50 |     479139 |                 1 | 20050104211000 |  NULL |
|        50 |     479139 |                 1 | 20050104211500 |  NULL |
|        50 |     479139 |                 1 | 20050104212000 |  NULL |
|        50 |     479139 |                 1 | 20050104212500 | 232.1 |
|        50 |     479139 |                 1 | 20050104213000 | 232.1 |
+-----------+------------+-------------------+----------------+-------+
5 rows in set (0.00 sec)

SELECT avg FROM event_log_5minute_20050104 WHERE monitor_id=479139 AND monitor_server_id=1 AND timestamp >= 20050104080000 AND timestamp < 20050105080000 order by avg;

Again, shows all rows:
+-------+
| avg   |
+-------+
|  NULL |
|  NULL |
|  NULL |
| 232.1 |
| 232.1 |
+-------+
5 rows in set (0.00 sec)

(SELECT avg FROM event_log_5minute_20050104 WHERE monitor_id=479139 AND monitor_server_id=1 AND timestamp >= 20050104080000 AND timestamp < 20050105080000) order by avg;

Shows only 2 rows:
+-------+
| avg   |
+-------+
|  NULL |
| 232.1 |
+-------+
2 rows in set (0.00 sec)

(SELECT avg FROM event_log_5minute_20050104 WHERE monitor_id=479139 AND monitor_server_id=1 AND timestamp >= 20050104080000 AND timestamp < 20050105080000) UNION ALL (SELECT NULL) order by avg;

Shows all rows, plus extra NULL:
+-------+
| avg   |
+-------+
|  NULL |
|  NULL |
|  NULL |
|  NULL |
| 232.1 |
| 232.1 |
+-------+
6 rows in set (0.00 sec)

Suggested fix:
Should return all values, not only distinct.
[7 Jan 2005 12:49] Martin Friebe
I thing this behaviour (or if changed the new behavior) should be documented on the UNION manpage. (read the cmments there)

because currently the example query is handled as a UNION (of 1 select), which means that is why it is threaded as distinct (default for union).
I do not know, if this should be handled as union, but there are sme usages for it

#from the comments:
(SELECT avg FROM event_log_5minute_20050104 WHERE monitor_id=479139 AND
monitor_server_id=1 AND timestamp >= 20050104080000 AND timestamp <
20050105080000 ORDER BY avg DESC LIMIT 5 ) order by avg
# to return the last 5 in ascending order
[7 Jan 2005 17:37] John McCaskey
You comment makes sense.  However I think that when defaulting to handle it as a union of one select it should be handled as a UNION ALL of one select as just adding (...) implying distinct is extremely counter intuitive.

Futhermore (I have not verified this myself) a friend tested this for me on 4.1 and found that the behavior was to return all rows, not just all distinct.  So, if the 4.1 behavior is infact to not imply DISTINCT then its already fixed there, and should be documented as such.
[25 Jan 2005 6:52] Igor Babaev
The problem can be reproduced by a simpler test:

mysql> create table t1 (a int);
Query OK, 0 rows affected (0.28 sec)

mysql> insert into t1 values (2), (1), (1), (2), (1);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select a from t1 order by a;
+------+
| a    |
+------+
|    1 |
|    1 |
|    1 |
|    2 |
|    2 |
+------+
5 rows in set (0.00 sec)

mysql> (select a from t1) order by a;
+------+
| a    |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.00 sec)
[28 Jan 2005 3:45] Igor Babaev
ChangeSet
  1.2039 05/01/25 14:30:38 igor@rurik.mysql.com +3 -0
  order_by.result, order_by.test:
    Added a test case for bug #7672.
  sql_yacc.yy:
    Fixed bug #7672.
    Made queries of the form (SELECT ...) ORDER BY ... to
    be equivalent to SELECT ... ORDER BY ...

These queries returns the same result sets for 4.1, but I added the test to 4.1 as well.
[6 Mar 2005 23:47] Paul DuBois
Noted in 4.0.24 changelog.
[18 Aug 2005 15:56] Jeremy Cole
It turns out that this isn't a completely clean fix.  A query that worked on 4.0.23 no longer works in 4.0.24 because of the change from treating it as UNION to just SELECT.  I think the proper fix (if this syntax is allowed by the SQL standard) is to treat it as a UNION ALL on one table as was originally suggested.

In 4.0.23, the following parsed and worked well:

DROP TABLE IF EXISTS t;
CREATE TABLE t (a INT, b INT);
(SELECT a, b AS c FROM t GROUP BY a) ORDER BY c + 1;

In 4.0.24, it produces:

ERROR 1054: Unknown column 'c' in 'order clause'

Note that the significant parts of the query are: ORDER BY on an expression including an alias from the original SELECT.  Obviously this is a simplified example from the original problem query (which broke on a "minor" upgrade from 4.0.23 to 4.0.24).

The only solution is to use a UNION against no result (messy) or a temp table, to get the old result.

What does the SQL standard say about this case?
[10 Sep 2005 17:03] Sergei Golubchik
More examples:

works: (SELECT a, b AS c FROM t) union all (select 0,0 limit 0) ORDER BY c+1;
doesn't: (SELECT a, b AS c FROM t) ORDER BY c+1;
doesn't: SELECT a, b AS c FROM t ORDER BY c+1;

works: SELECT a, b AS c FROM t ORDER BY b+1;
works: (SELECT a, b AS c FROM t) ORDER BY b+1;
doesn't: (SELECT a, b AS c FROM t) union all (select 0,0 limit 0) ORDER BY b+1;
[27 Sep 2005 22:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30407
[8 Oct 2005 19:47] Evgeny Potemkin
UNIONs in examples above work as they should. See http://dev.mysql.com/doc/mysql/en/union.html.
"This kind of ORDER BY cannot use column references that include a table name (that is, names in tbl_name.col_name format). Instead, provide a column alias in the first SELECT  statement and refer to the alias in the ORDER BY"
[9 Oct 2005 19:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30841
[9 Oct 2005 19:22] Evgeny Potemkin
When fixing Item_func_plus in ORDER BY clause field c is searched in all
opened tables, but because c is an alias it wasn't found there.

Fixed in 4.0.27, cset 1.2154.1.1
[12 Oct 2005 20:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30983
[12 Oct 2005 20:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30985
[12 Oct 2005 21:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30987
[12 Oct 2005 22:19] Evgeny Potemkin
Fixed in 4.1.16
[13 Oct 2005 13:54] Evgeny Potemkin
Fixed in 5.0.15
[19 Oct 2005 14:27] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented fix in 4.0.27, 4.1.16, and 5.0.15 changelogs.

Removed from 4.0.24 since it turned out that initial fix wasn't effective in correcting the problem.

Closed.