Bug #42236 ORDER BY and/or GROUP BY behaves badly together with views.
Submitted: 21 Jan 2009 12:02 Modified: 23 Jan 2009 10:18
Reporter: Lars Borg
Status: Not a Bug
Category:Server: Views Severity:S3 (Non-critical)
Version:5.1.30-community OS:Microsoft Windows
Assigned to: Target Version:

[21 Jan 2009 12:02] Lars Borg
Description:
This behavior do not exist in version 5.0.24a-community-nt

Query:
select
	jf.ID AS ID,
	jf.fileName AS fileName,
	jv.version AS version,
	jv.active AS active,
	ju.userName AS userName
from 
	jarfiledb.jarfile jf
join
	jarfiledb.jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
join
	jarfiledb.jargroup jg on jg.ID = jfhjg.jarGroup_ID
join
	jarfiledb.jaruser ju on jg.ID = ju.jarGroup_ID
left join
	jarfiledb.jarversion jv on jf.ID = jv.jarFile_ID 
order by
	jv.active desc,
	jv.version desc

Returns:
ID	fileName	version	active	userName
1	CoreSource	1.1.4	1	labo10
1	CoreSource	1.1.4	1	pehy01
2	CoreTarget	1.1.3	1	pehy01
3	CoreHandler	1.1.3	1	pehy01
2	CoreTarget	1.1.3	1	labo10
3	CoreHandler	1.1.3	1	labo10
4	EnvDB	1.0.0	1	pehy01
4	EnvDB	1.0.0	1	labo10
1	CoreSource	1.1.3	0	labo10
1	CoreSource	1.1.3	0	pehy01
3	CoreHandler	1.1.2	0	pehy01
2	CoreTarget	1.1.2	0	labo10
3	CoreHandler	1.1.2	0	labo10
1	CoreSource	1.1.2	0	pehy01
1	CoreSource	1.1.2	0	labo10
2	CoreTarget	1.1.2	0	pehy01

Insert this query into a view and it gives the same result.
However... If this view then is used in another SQL-query instead of using this query as
a subquery. Things are going wrong.

Complete query with view:
select
	ID,
	fileName,
	version,
	active,
	userName
from
	theView
group by
	ID,
	userName
order by
	fileName

Gives:
ID	fileName	version active userName
3	CoreHandler	1.1.2	0	labo10
3	CoreHandler	1.1.2	0	pehy01
1	CoreSource	1.1.2	0	pehy01
1	CoreSource	1.1.2	0	labo10
2	CoreTarget	1.1.2	0	labo10
2	CoreTarget	1.1.2	0	pehy01
4	EnvDB	1.0.0	1	pehy01
4	EnvDB	1.0.0	1	labo10

Here you can see that ID:1, fileName:CoreSource gives version 1.1.2 and active 0, when it
should give 1.1.4 and active 1,
since the sub-qurey/view does ORDER BY on active and version.

Complete query with sub-query:
select
	ID,
	fileName,
	version,
	active,
	userName
from (
	select
		jf.ID AS ID,
		jf.fileName AS fileName,
		jv.version AS version,
		jv.active AS active,
		ju.userName AS userName
	from 
		jarfiledb.jarfile jf
	join
		jarfiledb.jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
	join
		jarfiledb.jargroup jg on jg.ID = jfhjg.jarGroup_ID
	join
		jarfiledb.jaruser ju on jg.ID = ju.jarGroup_ID
	left join
		jarfiledb.jarversion jv on jf.ID = jv.jarFile_ID 
	order by
		jv.active desc,
		jv.version desc) as t1
group by
	ID,
	userName
order by
	fileName

Gives:
ID	fileName	version active userName
3	CoreHandler	1.1.3	1	pehy01
3	CoreHandler	1.1.3	1	labo10
1	CoreSource	1.1.4	1	pehy01
1	CoreSource	1.1.4	1	labo10
2	CoreTarget	1.1.3	1	pehy01
2	CoreTarget	1.1.3	1	labo10
4	EnvDB	1.0.0	1	pehy01
4	EnvDB	1.0.0	1	labo10

This is the correct result.

How to repeat:
Create the database using the provided spec (in the private comment).
Try running jarFileOverview and you'll see the problem.
Try running this query (below) and you'll see the correct response.

select
	ID,
	fileName,
	version,
	active,
	userName
from (
	select
		jf.ID AS ID,
		jf.fileName AS fileName,
		jv.version AS version,
		jv.active AS active,
		ju.userName AS userName
	from 
		jarfiledb.jarfile jf
	join
		jarfiledb.jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
	join
		jarfiledb.jargroup jg on jg.ID = jfhjg.jarGroup_ID
	join
		jarfiledb.jaruser ju on jg.ID = ju.jarGroup_ID
	left join
		jarfiledb.jarversion jv on jf.ID = jv.jarFile_ID 
	order by
		jv.active desc,
		jv.version desc) as t1
group by
	ID,
	userName
order by
	fileName
[21 Jan 2009 13:34] Miguel Solorzano
Thank you for the bug report. Could you please print here the output of 5.1.24 and 5.1.30
of both queries with view and without view. Thanks in advance.
[21 Jan 2009 20:38] Lars Borg
Just a clarification...
Did you really mean 5.1.24? I hope you meant 5.0.24.

I'll be back at work in 12 hrs. I'll give you the outputs then for 5.0.24 and 5.1.30.
[21 Jan 2009 22:53] Miguel Solorzano
Thank you for the feedback. Sorry I meant the older version you said has worked and then
it is 5.0.24. Thanks in advance.
[22 Jan 2009 9:09] Lars Borg
Here are the results as you requested.

MySQL version: 5.1.30
mysql> select * from jarfileoverview;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.2   |      0 | pehy01   |
|  3 | CoreHandler | 1.1.2   |      0 | labo10   |
|  1 | CoreSource  | 1.1.2   |      0 | pehy01   |
|  1 | CoreSource  | 1.1.2   |      0 | labo10   |
|  2 | CoreTarget  | 1.1.2   |      0 | labo10   |
|  2 | CoreTarget  | 1.1.2   |      0 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+

mysql> select
    ->  ID,
    ->  fileName,
    ->  version,
    ->  active,
    ->  userName
    -> from (
    ->  select
    ->          jf.ID AS ID,
    ->          jf.fileName AS fileName,
    ->          jv.version AS version,
    ->          jv.active AS active,
    ->          ju.userName AS userName
    ->  from
    ->          jarfile jf
    ->  join
    ->          jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
    ->  join
    ->          jargroup jg on jg.ID = jfhjg.jarGroup_ID
    ->  join
    ->          jaruser ju on jg.ID = ju.jarGroup_ID
    ->  left join
    ->          jarversion jv on jf.ID = jv.jarFile_ID
    ->  order by
    ->          jv.active desc,
    ->          jv.version desc) as t1
    -> group by
    ->  ID,
    ->  userName
    -> order by
    ->  fileName;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.3   |      1 | pehy01   |
|  3 | CoreHandler | 1.1.3   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | pehy01   |
|  1 | CoreSource  | 1.1.4   |      1 | labo10   |
|  2 | CoreTarget  | 1.1.3   |      1 | pehy01   |
|  2 | CoreTarget  | 1.1.3   |      1 | labo10   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+

MySQL version: 5.0.24
mysql> select * from jarfileoverview;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.3   |      1 | pehy01   |
|  3 | CoreHandler | 1.1.3   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | pehy01   |
|  2 | CoreTarget  | 1.1.3   |      1 | labo10   |
|  2 | CoreTarget  | 1.1.3   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+

mysql> select
    ->  ID,
    ->  fileName,
    ->  version,
    ->  active,
    ->  userName
    -> from (
    ->  select
    ->          jf.ID AS ID,
    ->          jf.fileName AS fileName,
    ->          jv.version AS version,
    ->          jv.active AS active,
    ->          ju.userName AS userName
    ->  from
    ->          jarfile jf
    ->  join
    ->          jarfile_has_jargroup jfhjg on jf.ID = jfhjg.jarFile_ID
    ->  join
    ->          jargroup jg on jg.ID = jfhjg.jarGroup_ID
    ->  join
    ->          jaruser ju on jg.ID = ju.jarGroup_ID
    ->  left join
    ->          jarversion jv on jf.ID = jv.jarFile_ID
    ->  order by
    ->          jv.active desc,
    ->          jv.version desc) as t1
    -> group by
    ->  ID,
    ->  userName
    -> order by
    ->  fileName;
+----+-------------+---------+--------+----------+
| ID | fileName    | version | active | userName |
+----+-------------+---------+--------+----------+
|  3 | CoreHandler | 1.1.3   |      1 | pehy01   |
|  3 | CoreHandler | 1.1.3   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | labo10   |
|  1 | CoreSource  | 1.1.4   |      1 | pehy01   |
|  2 | CoreTarget  | 1.1.3   |      1 | labo10   |
|  2 | CoreTarget  | 1.1.3   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | pehy01   |
|  4 | EnvDB       | 1.0.0   |      1 | labo10   |
+----+-------------+---------+--------+----------+
[22 Jan 2009 23:35] Sveta Smirnova
Thank you for the feedback.

This is expected behavior as you have "group by ID, userName" while "order by fileName".

I'll quote explanation from bug #38061 here.

http://dev.mysql.com/doc/refman/5.0/en/select.html

"If you use GROUP BY, output rows are sorted according to the GROUP BY columns as if you
had an ORDER BY for the same columns."

"If ORDER BY occurs within a subquery and also is applied in the outer query, the
outermost ORDER BY takes precedence. "

The above statements mean that you are sorting by subkey, which has duplicates. In case
of
two rows having the same value in the sorting column, the returned row is not guaranteed
to be in any specific order.

So this is not a bug.
[22 Jan 2009 23:35] Sveta Smirnova
Please see https://bugs.launchpad.net/mysql-server/+bug/247727 also for additional
explanations.
[23 Jan 2009 9:21] Lars Borg
I don't have a huge experience with databases, but I do know that it is not allowed to
make an ORDER BY before you have a GROUP BY according to the SQL-spec. If GROUP BY
internally first reorders the rows before actually doing the "grouping", it is not
possible to trust the output of the query, since you can't trust the input.

If I give a query a "resultset", that resultset should ONLY be changed/modified according
to what is specified by the query. Obviously this is not the case with 5.1.30, since it
reorders the resultset first thing without the query telling it to do so, it is only told
to group, not to sort.

In my opinion THIS IS SERIOUS BUG that prevents simple "business logic" within a query.
But it's your database and you do what you like with it.
[23 Jan 2009 9:38] Lars Borg
Wait a second!

Why do I get DIFFERENT results when using a sub-query instead of a view, when the view
spec is identical to the sub-query?

That question can't be explained by the talk of GROUP BY is doing an ORDER BY first.

This is inconsequent behavior.
[23 Jan 2009 9:57] Sveta Smirnova
Thank you for the feedback.

> Why do I get DIFFERENT results when using a sub-query instead of a view, when the view
spec is identical to the sub-query?

There is not guaranteed order which optimizer should take in this case, this is why you
get different results if use subquery and views.
[23 Jan 2009 10:18] Lars Borg
No, no, no, no, no.... This does not make sense!

If the sub-query don't have a guaranteed order... How on earth can you trust any SQL at
all that contains a sub-query? It's just not possible.

So, if the sub-query has a guaranteed order, then what you said about GROUP BY is not
correct, because the query does what I tell it to do (no sorting in GROUP BY), as I have
shown in the initial report.