Bug #47516 User Defined Variables documentation fuzziness
Submitted: 22 Sep 2009 11:39 Modified: 30 Nov 2009 17:30
Reporter: Shlomi Noach (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S4 (Feature request)
Version:5.0, 5.1 OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: session variables, user defined variables, user variables

[22 Sep 2009 11:39] Shlomi Noach
Description:
The documentation at:
http://dev.mysql.com/doc/refman/5.0/en/user-variables.html

is fuzzy with regard to order of evaluation. The explanation:
"...The general rule is never to assign a value to a user variable in one part of a statement and use the same variable in some other part of the same statement. You might get the results you expect, but this is not guaranteed. ..."

is not clear enough. Just what is "one part" and "other part" of the statement?
For example, is the following considered to be "in the same part" of the statement:
SELECT @a := (@b := 3) + @b FROM DUAL

as it is just one column that's being retrieved?
Is the following considered to be "in the same part" of the statement?
SELECT @a + @b + 0*COALESCE(@a := 3, @b := 4) FROM mysql.user

Also missing from documentation is whether evaluation order is dependent on parenthesis: does the following imply that @a is evaluated before @b?
SELECT @a := (@b := 3) FROM DUAL

How to repeat:
[Documentation request; nothing to repeat]

Suggested fix:
I recommend adding proper documentation for the above issues, to resolve any "undefined" behavior.
[22 Sep 2009 12:32] Valeriy Kravchuk
Thank you for the documentation request.
[22 Sep 2009 15:16] Stefan Hinz
I read this as a request to define what a "part" is, which I guess isn't easy to explain (maybe because it's too trivial). Maybe adding an example (like in the bug report) might be helpful.
[22 Sep 2009 15:40] Shlomi Noach
Stefan,

Yes, indeed. I think a clarification of what a "part" is and what it isn't is in place. 
With proper explanation, it would be decided whether certain variables use cases are reliable (Well defined evaluation order) or unreliable (undefined evaluation order).

Regards
[20 Oct 2009 19:54] Paul DuBois
There's nothing fuzzy here. The manual makes no promise that you can do this in any way and rely on it.
[21 Oct 2009 12:30] Shlomi Noach
Paul,

The "this way" you're referring to is exactly the fuzzy thing.
A very wide-spread use of user variables is to do 
SELECT @counter := @counter+1, ... FROM ...

Does the manual clarify that this is OK? This is being taught in the MySQL courses I deliver. What is OK and what isn't OK?
[28 Oct 2009 14:44] Paul DuBois
The manual makes no promise that you can do this in any way and rely on it.

I know it may work in some situations, but the manual does not (and will not) promise it.
[28 Oct 2009 16:28] Shlomi Noach
Paul,

Thanks for the clarifications.
May suggest, then, that the manual clearly states that one may not update and read the same variable within the same SELECT statement (and drop all the "this part/another part" stuff?

Although I do believe this is taking the shortcut. I do hope and ask for a programmer to review the code and clearly state what is expected.

Regards
[24 Nov 2009 16:57] Paul DuBois
"May suggest, then, that the manual clearly states that one may not update and read the
same variable within the same SELECT statement (and drop all the "this part/another part"
stuff?"

Not exactly. You can do it, but you might or might not get the results you want. You seem to want the manual to guarantee something that it explicitly does not and will not guarantee?
[24 Nov 2009 17:49] Shlomi Noach
Yes, I'd like the documentation to be clear about it.
As a developer, I'm certain that within reasonable effort, the MySQL developers can come up with a strict answer on all the above questions. Ideally, I'd like to have these answers.
If not, I'd like to have the documentation specify more clearly what's known to work, what's known not to work, and what's vague.
Currently, as I understand it, the documentation in itself is vague.
I hope you get my meaning.

Regards
[24 Nov 2009 18:31] Davi Arnaut
Shlomi,

The order of evaluation of a statement depends on the statement itself and on the transformations (optimizations) that are applied upon it. As developers we can't predict with certainly the evaluation order of every single statement, consequently this is considered undefined behavior and discouraged. Anyone will have a hard time determining exactly the evaluation order of a statement -- there are too many dependencies and, what is worse, at runtime. For the sake of example, the optimizer (via heuristics) might decide to eliminate a part of the statement and this could cause a assignment to a variable to be chopped.
[24 Nov 2009 18:42] Davi Arnaut
As with any other computer language, one could rely on the natural order of evaluation of operators (as defined in the SQL standard), but this is something that the user has to understand by himself and for each statement -- by determining how the thing is going to be evaluated and its possible side effects.

Sure, you can assign to a variable at the same time another part of the statement reads it and expect it to work by prediction the order of evaluation and transformations. But if you shoot yourself in the foot, don't blame others.
[24 Nov 2009 20:27] Shlomi Noach
Davi, Paul,

Thanks both for your elaborate comments. I think I have a clearer picture of this issue now.
May I suggest adding to the documentation a paragraph in the spirit of Davi's last two comments?
Otherwise I am content in closing this bug.

Regards
[30 Nov 2009 17:30] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.