Bug #9228 optimizer on alternating function with no ref to columns
Submitted: 16 Mar 2005 18:37 Modified: 26 Apr 2005 21:31
Reporter: Martin Friebe (Gold Quality Contributor) (OCA) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:4.1.10 OS:Any (*)
Assigned to: Sergey Petrunya CPU Architecture:Any

[16 Mar 2005 18:37] Martin Friebe
Description:
distinct does not display all unique values in the query below

the error happens independent from the initial state of the lock, the 2nd query show either one (but always only one) of the 2 possible values.

The error also happens, if "group by 1" instead of distinct is used.

The workaround shows, that if a reference to a column apears in the function (even in a place that doesnt affect the result) the query succeeds.

Seems the optimizer does believe that a fixed-string function, can not alter its value during a query (except, if it contains rand())

How to repeat:
# the following query, works, and shows there are two values in the result set.

select if( is_free_lock("A"),  get_lock('A',1)  , concat("R",release_lock('A')) ) from (select 1 union select 2 union select 3) t; 

+----------------------------------------------------------------------------+
| if( is_free_lock("A"),  get_lock('A',5)  , concat("R",release_lock('A')) ) |
+----------------------------------------------------------------------------+
| 1                                                                          |
| R1                                                                         |
| 1                                                                          |
+----------------------------------------------------------------------------+
3 rows in set (0.00 sec)

# same query with distinct

select distinct if( is_free_lock("A"),  get_lock('A',5)  , concat("R",release_lock('A')) ) from (select 1 union select 2 union select 3) t;

+----------------------------------------------------------------------------+
| if( is_free_lock("A"),  get_lock('A',5)  , concat("R",release_lock('A')) ) |
+----------------------------------------------------------------------------+
| R1                                                                         |
+----------------------------------------------------------------------------+

# workaround
select distinct if( is_free_lock("A"),  get_lock('A',a)  , concat("R",release_lock('A')) ) from (select 1 a union select 2 union select 3) t;

+----------------------------------------------------------------------------+
| if( is_free_lock("A"),  get_lock('A',a)  , concat("R",release_lock('A')) ) |
+----------------------------------------------------------------------------+
| R1                                                                         |
| 1                                                                          |
+----------------------------------------------------------------------------+

Suggested fix:
-
[16 Mar 2005 19:17] Martin Friebe
the same or related error ocurs with a function (no colum referenced, alernating value) in the where clause.

do release_lock("A");
select if( is_free_lock("A"),  get_lock('A',a)  , concat("R",release_lock('A')) ) from (select 1 a union select 2 union select 3) t where  is_free_lock("A");
+----------------------------------------------------------------------------+
| if( is_free_lock("A"),  get_lock('A',a)  , concat("R",release_lock('A')) ) |
+----------------------------------------------------------------------------+
| 1                                                                          |
| R1                                                                         |
| 1                                                                          |
+----------------------------------------------------------------------------+
# the 2nd row shouldnt be there (but the "where" condition was true, pre query executation"

Note: the query might be ok, if only one row, gets returned, since the where is false for the 2nd row, retrieval of the 2nd row is not required, and the where continues to be false.
It seems to be a question of definition, what to expect, except for the 2nd row must not be in the result.

do get_lock("A",1);
select if( is_free_lock("A"),  get_lock('A',a)  , concat("R",release_lock('A')) ) from (select 1 a union select 2 union select 3) t where  is_free_lock("A");
Empty set (0.00 sec)
# this can be treated as correct, depending on the note above

#
[16 Mar 2005 19:22] Martin Friebe
description was wrong, not necessarly related to distinct
[17 Mar 2005 4:02] Jorge del Conde
Thanks for your bug report.

Verified w/4.1.11 from bk
[26 Apr 2005 21:31] Sergey Petrunya
Martin,
  In my reading of the manual, MySQL doesn't give warranties about when and in which order it will evaluate the functions in the query. 
 The presense of side effects in get_lock and release_lock functions makes them an exception. This exception was made for the sake of doing simple things like 
SELECT get_lock(...); 
...
SELECT release_lock(...). 
I don't see how one can get useful functionality when using the above functions in both selection list and the WHERE clause of a non-trivial SELECT (and especially with DISTINCT). 

So I don't consider the above behavior to be a bug. Also I think documentation is adequate.
If you can point to spot where it is not, please reopen the bug.

Thanks for taking time to write to us.
[27 Apr 2005 12:55] Martin Friebe
Ok, I am not sure, if this is a documentation thing, or an execution bug.
On the question of usability, there are possible thoughts (like alternating flags).

On the order of execution, I did not assume that mysql will (depending on various option, like in this case the presence or non presence of a "distinct", execute both of the possible IF result expression, and choose from theire results, but fair enough that is possible)

The bug simple is that adding "distinct" into any query, should not alter the way it works, except for filtering the result to be unique.

The query can be rewritten, not to contain the dependency on order.

 do release_lock("A");
 select  get_lock( if( is_free_lock('A') , 'A', null),0) from (select 1 union select 2 union select 3) t;
+----------------------------------------------+
| get_lock(if( is_free_lock('A') ,'A',null),0) |
+----------------------------------------------+
|                                            1 |
|                                         NULL |
|                                         NULL |
+----------------------------------------------+

distinct will again return only one row. (containing "1").

If this result is expected, the this means that a query can not (on later retrieved rows) refer to the status of any user-lock it has modified before. The user-lock manpage should contain a warning that modifying and retrieving information about the same userlock, is unsafe.

Otherwise, the optimizer(I assume) should consider the outcome of the get_lock, as similiar to rand() and return the correct result for distinct.