| Bug #54751 | session variables are evaluated differently in 5.1.24 and 5.1.41/45 in group by | ||
|---|---|---|---|
| Submitted: | 23 Jun 2010 18:56 | Modified: | 29 Jun 2010 19:13 |
| Reporter: | richard xin | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: General | Severity: | S3 (Non-critical) |
| Version: | 5.1.45 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | GROUP BY, session variable | ||
[23 Jun 2010 18:56]
richard xin
[24 Jun 2010 6:47]
Valeriy Kravchuk
Formally this is not a bug. Our manual (http://dev.mysql.com/doc/refman/5.1/en/user-variables.html) clearly says: "The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement. In SELECT @a, @a:=@a+1, ..., you might think that MySQL will evaluate @a first and then do an assignment second. However, changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation." Also, results may depend on indexes added, even in simpler cases: mysql> set @rank=0; Query OK, 0 rows affected (0.00 sec) mysql> select ID, sum(score), @rank:=@rank+1 from test1 group by ID order by sum(score) desc; +----+------------+----------------+ | ID | sum(score) | @rank:=@rank+1 | +----+------------+----------------+ | 6 | 188 | 7 | | 1 | 156 | 1 | | 5 | 140 | 5 | | 12 | 130 | 2 | | 4 | 105 | 4 | | 11 | 100 | 8 | | 3 | 77 | 3 | | 10 | 59 | 6 | | 2 | 33 | 9 | +----+------------+----------------+ 9 rows in set (0.00 sec) mysql> set @rank=0; Query OK, 0 rows affected (0.00 sec) mysql> select ID, sum(score), @rank:=@rank+1 from test1 group by ID; +----+------------+----------------+ | ID | sum(score) | @rank:=@rank+1 | +----+------------+----------------+ | 1 | 156 | 1 | | 2 | 33 | 9 | | 3 | 77 | 3 | | 4 | 105 | 4 | | 5 | 140 | 5 | | 6 | 188 | 7 | | 10 | 59 | 6 | | 11 | 100 | 8 | | 12 | 130 | 2 | +----+------------+----------------+ 9 rows in set (0.00 sec) mysql> alter table test1 add key(id); Query OK, 14 rows affected (0.39 sec) Records: 14 Duplicates: 0 Warnings: 0 mysql> set @rank=0; Query OK, 0 rows affected (0.00 sec) mysql> select ID, sum(score), @rank:=@rank+1 from test1 group by ID; +----+------------+----------------+ | ID | sum(score) | @rank:=@rank+1 | +----+------------+----------------+ | 1 | 156 | 1 | | 2 | 33 | 2 | | 3 | 77 | 3 | | 4 | 105 | 4 | | 5 | 140 | 5 | | 6 | 188 | 6 | | 10 | 59 | 7 | | 11 | 100 | 8 | | 12 | 130 | 9 | +----+------------+----------------+ 9 rows in set (0.00 sec) So, I'd do it like this: mysql> set @rank=0; Query OK, 0 rows affected (0.00 sec) mysql> select *, @rank:=@rank+1 from (select ID, sum(score) score from test1 group by ID order by score desc) a LIMIT 3; +----+-------+----------------+ | ID | score | @rank:=@rank+1 | +----+-------+----------------+ | 6 | 188 | 1 | | 1 | 156 | 2 | | 5 | 140 | 3 | +----+-------+----------------+ 3 rows in set (0.00 sec)
[24 Jun 2010 19:02]
richard xin
Thanks for the quick reply. I saw your examples how index change may result in different result in 5.1.41, but in 5.1.24 the results are consistently "correct" no matter what index is added although your menu said "The order of evaluation for expressions involving user variables is undefined". Does that mean 5.1.24 did the "right thing" accidentally and the 5.1.41 is how it should work now?
[24 Jun 2010 22:59]
richard xin
as far as I see based on the query result without looking into source code: 5.1.41/45 evaluates variables BEFORE Group_By, the ranks in the result are controlled by SELECT stmt’s implicit ORDERBY column(s) such as index etc before Group_By, the ranks will almost certainly never be correct when GROUPBY is present (unless the result sort order is same as SELECT default sort). But 5.1.24 is different, it evaluates session variables AFTER Group_By, that’s the reason why the result will be always correct no matter what indexes added/removed. 5.1.24 logic makes more sense to me. It looks to me that 5.1.41/45 abandoned correct way on how/when the variables get evaluated in 5.1.24, I am surprised that you don't see this as a bug.
[25 Jun 2010 17:08]
richard xin
there is also a potential performance problem for 5.1.41/45. For example, table test1 in my initial example has 1 milion records. For the same query in my initial comment, 5.1.41/45 will have to evaluate @rank 1 million times, while 5.1.24 will only do that 3 times.
[29 Jun 2010 8:46]
Sveta Smirnova
Thank you for the feedback. > Does that mean 5.1.24 did the "right thing" accidentally and the 5.1.41 is how it should work now? Not exactly. This means MySQL does not guarantee how it calculates values in such expressions. This can change based on plan which optimizer chooses to use. So behavior can change again in newer versions and show other results. Closing as "Not a Bug"
[29 Jun 2010 19:13]
richard xin
I am disappointed this is a “feature” that newer Mysql versions such as 5.1.41 and .45 abandoned good logic in 5.1.24 and used logic that could produce un-predictable outputs and potential performance problems. Anyway, that’s your call. I think it’s time for MySQL to have a function (such as row_number) which many modern database has (such as Oracle’s ROWNUM and Sql Server (v.2005+)’s Row_Number()) This is feature request. Not a bug. Thanks.
