Bug #42236 ORDER BY and/or GROUP BY behaves badly together with views.
Submitted: 21 Jan 2009 11:02 Modified: 23 Jan 2009 9:18
Reporter: Lars Borg Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Views Severity:S3 (Non-critical)
Version:5.1.30-community OS:Microsoft Windows
Assigned to: CPU Architecture:Any

[21 Jan 2009 11:02] Lars Borg
Description:
This behavior do not exist in version 5.0.24a-community-nt

Query:
select
	jf.ID AS ID,
	jf.fileName AS fileName,
	jv.version AS version,
	jv.active AS active,
	ju.userName AS userName
from 
	jarfiledb.jarfile jf
join
	jarfiledb.jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
join
	jarfiledb.jargroup jg on jg.ID = jfhjg.jarGroup_ID
join
	jarfiledb.jaruser ju on jg.ID = ju.jarGroup_ID
left join
	jarfiledb.jarversion jv on jf.ID = jv.jarFile_ID 
order by
	jv.active desc,
	jv.version desc

Returns:
ID	fileName	version	active	userName
1	CoreSource	1.1.4	1	labo10
1	CoreSource	1.1.4	1	pehy01
2	CoreTarget	1.1.3	1	pehy01
3	CoreHandler	1.1.3	1	pehy01
2	CoreTarget	1.1.3	1	labo10
3	CoreHandler	1.1.3	1	labo10
4	EnvDB	1.0.0	1	pehy01
4	EnvDB	1.0.0	1	labo10
1	CoreSource	1.1.3	0	labo10
1	CoreSource	1.1.3	0	pehy01
3	CoreHandler	1.1.2	0	pehy01
2	CoreTarget	1.1.2	0	labo10
3	CoreHandler	1.1.2	0	labo10
1	CoreSource	1.1.2	0	pehy01
1	CoreSource	1.1.2	0	labo10
2	CoreTarget	1.1.2	0	pehy01

Insert this query into a view and it gives the same result.
However... If this view then is used in another SQL-query instead of using this query as a subquery. Things are going wrong.

Complete query with view:
select
	ID,
	fileName,
	version,
	active,
	userName
from
	theView
group by
	ID,
	userName
order by
	fileName

Gives:
ID	fileName	version active userName
3	CoreHandler	1.1.2	0	labo10
3	CoreHandler	1.1.2	0	pehy01
1	CoreSource	1.1.2	0	pehy01
1	CoreSource	1.1.2	0	labo10
2	CoreTarget	1.1.2	0	labo10
2	CoreTarget	1.1.2	0	pehy01
4	EnvDB	1.0.0	1	pehy01
4	EnvDB	1.0.0	1	labo10

Here you can see that ID:1, fileName:CoreSource gives version 1.1.2 and active 0, when it should give 1.1.4 and active 1,
since the sub-qurey/view does ORDER BY on active and version.

Complete query with sub-query:
select
	ID,
	fileName,
	version,
	active,
	userName
from (
	select
		jf.ID AS ID,
		jf.fileName AS fileName,
		jv.version AS version,
		jv.active AS active,
		ju.userName AS userName
	from 
		jarfiledb.jarfile jf
	join
		jarfiledb.jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
	join
		jarfiledb.jargroup jg on jg.ID = jfhjg.jarGroup_ID
	join
		jarfiledb.jaruser ju on jg.ID = ju.jarGroup_ID
	left join
		jarfiledb.jarversion jv on jf.ID = jv.jarFile_ID 
	order by
		jv.active desc,
		jv.version desc) as t1
group by
	ID,
	userName
order by
	fileName

Gives:
ID	fileName	version active userName
3	CoreHandler	1.1.3	1	pehy01
3	CoreHandler	1.1.3	1	labo10
1	CoreSource	1.1.4	1	pehy01
1	CoreSource	1.1.4	1	labo10
2	CoreTarget	1.1.3	1	pehy01
2	CoreTarget	1.1.3	1	labo10
4	EnvDB	1.0.0	1	pehy01
4	EnvDB	1.0.0	1	labo10

This is the correct result.

How to repeat:
Create the database using the provided spec (in the private comment).
Try running jarFileOverview and you'll see the problem.
Try running this query (below) and you'll see the correct response.

select
	ID,
	fileName,
	version,
	active,
	userName
from (
	select
		jf.ID AS ID,
		jf.fileName AS fileName,
		jv.version AS version,
		jv.active AS active,
		ju.userName AS userName
	from 
		jarfiledb.jarfile jf
	join
		jarfiledb.jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
	join
		jarfiledb.jargroup jg on jg.ID = jfhjg.jarGroup_ID
	join
		jarfiledb.jaruser ju on jg.ID = ju.jarGroup_ID
	left join
		jarfiledb.jarversion jv on jf.ID = jv.jarFile_ID 
	order by
		jv.active desc,
		jv.version desc) as t1
group by
	ID,
	userName
order by
	fileName
[21 Jan 2009 12:34] MySQL Verification Team
Thank you for the bug report. Could you please print here the output of 5.1.24 and 5.1.30 of both queries with view and without view. Thanks in advance.
[21 Jan 2009 19:38] Lars Borg
Just a clarification...
Did you really mean 5.1.24? I hope you meant 5.0.24.

I'll be back at work in 12 hrs. I'll give you the outputs then for 5.0.24 and 5.1.30.
[21 Jan 2009 21:53] MySQL Verification Team
Thank you for the feedback. Sorry I meant the older version you said has worked and then it is 5.0.24. Thanks in advance.
[22 Jan 2009 8:09] Lars Borg
Here are the results as you requested.

MySQL version: 5.1.30
mysql> select * from jarfileoverview;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.2   |      0 | pehy01   |
|  3 | CoreHandler | 1.1.2   |      0 | labo10   |
|  1 | CoreSource  | 1.1.2   |      0 | pehy01   |
|  1 | CoreSource  | 1.1.2   |      0 | labo10   |
|  2 | CoreTarget  | 1.1.2   |      0 | labo10   |
|  2 | CoreTarget  | 1.1.2   |      0 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+

mysql> select
    ->  ID,
    ->  fileName,
    ->  version,
    ->  active,
    ->  userName
    -> from (
    ->  select
    ->          jf.ID AS ID,
    ->          jf.fileName AS fileName,
    ->          jv.version AS version,
    ->          jv.active AS active,
    ->          ju.userName AS userName
    ->  from
    ->          jarfile jf
    ->  join
    ->          jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
    ->  join
    ->          jargroup jg on jg.ID = jfhjg.jarGroup_ID
    ->  join
    ->          jaruser ju on jg.ID = ju.jarGroup_ID
    ->  left join
    ->          jarversion jv on jf.ID = jv.jarFile_ID
    ->  order by
    ->          jv.active desc,
    ->          jv.version desc) as t1
    -> group by
    ->  ID,
    ->  userName
    -> order by
    ->  fileName;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.3   |      1 | pehy01   |
|  3 | CoreHandler | 1.1.3   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | pehy01   |
|  1 | CoreSource  | 1.1.4   |      1 | labo10   |
|  2 | CoreTarget  | 1.1.3   |      1 | pehy01   |
|  2 | CoreTarget  | 1.1.3   |      1 | labo10   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+

MySQL version: 5.0.24
mysql> select * from jarfileoverview;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.3   |      1 | pehy01   |
|  3 | CoreHandler | 1.1.3   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | pehy01   |
|  2 | CoreTarget  | 1.1.3   |      1 | labo10   |
|  2 | CoreTarget  | 1.1.3   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+

mysql> select
    ->  ID,
    ->  fileName,
    ->  version,
    ->  active,
    ->  userName
    -> from (
    ->  select
    ->          jf.ID AS ID,
    ->          jf.fileName AS fileName,
    ->          jv.version AS version,
    ->          jv.active AS active,
    ->          ju.userName AS userName
    ->  from
    ->          jarfile jf
    ->  join
    ->          jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
    ->  join
    ->          jargroup jg on jg.ID = jfhjg.jarGroup_ID
    ->  join
    ->          jaruser ju on jg.ID = ju.jarGroup_ID
    ->  left join
    ->          jarversion jv on jf.ID = jv.jarFile_ID
    ->  order by
    ->          jv.active desc,
    ->          jv.version desc) as t1
    -> group by
    ->  ID,
    ->  userName
    -> order by
    ->  fileName;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.3   |      1 | pehy01   |
|  3 | CoreHandler | 1.1.3   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | pehy01   |
|  2 | CoreTarget  | 1.1.3   |      1 | labo10   |
|  2 | CoreTarget  | 1.1.3   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+
[22 Jan 2009 22:35] Sveta Smirnova
Thank you for the feedback.

This is expected behavior as you have "group by ID, userName" while "order by fileName".

I'll quote explanation from bug #38061 here.

http://dev.mysql.com/doc/refman/5.0/en/select.html

"If you use GROUP BY, output rows are sorted according to the GROUP BY columns as if you
had an ORDER BY for the same columns."

"If ORDER BY occurs within a subquery and also is applied in the outer query, the
outermost ORDER BY takes precedence. "

The above statements mean that you are sorting by subkey, which has duplicates. In case of
two rows having the same value in the sorting column, the returned row is not guaranteed
to be in any specific order.

So this is not a bug.
[22 Jan 2009 22:35] Sveta Smirnova
Please see https://bugs.launchpad.net/mysql-server/+bug/247727 also for additional explanations.
[23 Jan 2009 8:21] Lars Borg
I don't have a huge experience with databases, but I do know that it is not allowed to make an ORDER BY before you have a GROUP BY according to the SQL-spec. If GROUP BY internally first reorders the rows before actually doing the "grouping", it is not possible to trust the output of the query, since you can't trust the input.

If I give a query a "resultset", that resultset should ONLY be changed/modified according to what is specified by the query. Obviously this is not the case with 5.1.30, since it reorders the resultset first thing without the query telling it to do so, it is only told to group, not to sort.

In my opinion THIS IS SERIOUS BUG that prevents simple "business logic" within a query. But it's your database and you do what you like with it.
[23 Jan 2009 8:38] Lars Borg
Wait a second!

Why do I get DIFFERENT results when using a sub-query instead of a view, when the view spec is identical to the sub-query?

That question can't be explained by the talk of GROUP BY is doing an ORDER BY first.

This is inconsequent behavior.
[23 Jan 2009 8:57] Sveta Smirnova
Thank you for the feedback.

> Why do I get DIFFERENT results when using a sub-query instead of a view, when the view
spec is identical to the sub-query?

There is not guaranteed order which optimizer should take in this case, this is why you get different results if use subquery and views.
[23 Jan 2009 9:18] Lars Borg
No, no, no, no, no.... This does not make sense!

If the sub-query don't have a guaranteed order... How on earth can you trust any SQL at all that contains a sub-query? It's just not possible.

So, if the sub-query has a guaranteed order, then what you said about GROUP BY is not correct, because the query does what I tell it to do (no sorting in GROUP BY), as I have shown in the initial report.