Bug #9552 User variables does not work with aggregate functions
Submitted: 1 Apr 2005 7:28 Modified: 14 Apr 2005 16:49
Reporter: Mattias Jiderhamn Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:4.0.21, 4.1.8 OS:Windows (XP, RedHat)
Assigned to: Paul DuBois CPU Architecture:Any

[1 Apr 2005 7:28] Mattias Jiderhamn
Description:
When assigning user variables with the value of an aggregate function, the variable is not set until after the query has finished.

How to repeat:
CREATE TABLE foo (
  id INT,
  value INT
);

INSERT INTO foo (id, value) VALUES (1, 10);
INSERT INTO foo (id, value) VALUES (1, 20);
INSERT INTO foo (id, value) VALUES (2, 100);
INSERT INTO foo (id, value) VALUES (2, 200);

SELECT @id_var:=id, @value_var:=SUM(value), @id_var, @value_var
FROM foo
GROUP BY id;

Expected result:

+-------------+------------------------+---------+------------+
| @id_var:=id | @value_var:=SUM(value) | @id_var | @value_var |
+-------------+------------------------+---------+------------+
|           1 |                     30 |       1 |         30 |
|           2 |                    300 |       2 |        300 |
+-------------+------------------------+---------+------------+

Actual result:

+-------------+------------------------+---------+------------+
| @id_var:=id | @value_var:=SUM(value) | @id_var | @value_var |
+-------------+------------------------+---------+------------+
|           1 |                     30 | 1       | NULL       |
|           2 |                    300 | 2       | NULL       |
+-------------+------------------------+---------+------------+

If run again, the result is:

+-------------+------------------------+---------+------------+
| @id_var:=id | @value_var:=SUM(value) | @id_var | @value_var |
+-------------+------------------------+---------+------------+
|           1 |                     30 |       1 |        300 |
|           2 |                    300 |       2 |        300 |
+-------------+------------------------+---------+------------+

Suggested fix:
Make user variables work with aggregate functions just as with normal columns, so they can be used in per result record calculations (i.e. @value_var * 3 AS triple_value)
[1 Apr 2005 7:56] Mattias Jiderhamn
There are also problems using aggregate functions together with non aggregate user variables. Example, using given table and values:

SELECT @id_var:=id, @id_var * SUM(value) AS stupid_sum
FROM foo
GROUP BY id;

Expected result

+-------------+------------+
| @id_var:=id | stupid_sum |
+-------------+------------+
|           1 |         30 |
|           2 |        600 |
+-------------+------------+

Actual result

+-------------+------------+
| @id_var:=id | stupid_sum |
+-------------+------------+
|           1 |       NULL |
|           2 |        300 |
+-------------+------------+

If run again, the result is:

+-------------+------------+
| @id_var:=id | stupid_sum |
+-------------+------------+
|           1 |         60 |
|           2 |        300 |
+-------------+------------+

So in this case it uses the correct aggregate value, but the user variable value from the previous record.
[5 Apr 2005 4:50] MySQL Verification Team
According with the Manual:  http://dev.mysql.com/doc/mysql/en/variables.html

´´´´´´

If you refer to a variable that has not been initialized, its value is NULL.

<cut>

Note: In a SELECT statement, each expression is evaluated only when sent to the client. This means that in a HAVING, GROUP BY, or ORDER BY clause, you cannot refer to an expression that involves variables that are set in the SELECT list. For example, the following statement does not work as expected:

mysql> SELECT (@aa:=id) AS a, (@aa+3) AS b FROM tbl_name HAVING b=5;

The reference to b in the HAVING clause refers to an alias for an expression in the SELECT list that uses @aa. This does not work as expected: @aa does not contain the value of the current row, but the value of id from the previous selected row.

The general rule is to never assign and use the same variable in the same statement. 

...........................................
[5 Apr 2005 6:40] Mattias Jiderhamn
This might not be a bug, but in that case it needs better documentation.

Please note that I am not using the variable in a HAVING, GROUP BY or ORDER BY clause. Also note that what I try to do works perfectly fine with variables assigned from database columns, but not for variables from aggregate functions. For column values, the variable seems to be assigned "per record" while for aggregate functions it seems to be "per query". I cannot find this documented.

As a side note - proving it works just fine for column values - the following code  assignings "per parent unique" numbers to child records.

SELECT @counter:=0;
SELECT @last_parent_id:='0';
UPDATE children SET children_id=IF(@last_parent_id = parent_id,@counter:=@counter+1,@counter:=1), parent_id=(@last_parent_id:=parent_id);
[11 Apr 2005 11:04] Hartmut Holzgraefe
> Please note that I am not using the variable in a HAVING, GROUP BY 
> or ORDER BY clause.
...
> For column values, the variable seems to be assigned "per record"
> while for aggregate functions it seems to be "per query".

Aggregate function values are calculated per GROUP, so the GROUP BY 
restriction applies here, too.
[11 Apr 2005 11:08] Mattias Jiderhamn
Yes. And that needs better documenting.
[11 Apr 2005 11:28] Mattias Jiderhamn
[Status change]
[14 Apr 2005 16:49] Paul DuBois
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Additional info:

The cases in question are covered by the existing
statement in the manual:

The general rule is to never assign and use the same
variable in the same statement.

The expected result might be obtained in some cases,
but there is no guarantee that this will happen.