Bug #69374 User Variable not updating properly in some cases
Submitted: 1 Jun 2013 6:49 Modified: 31 Aug 2014 5:24
Reporter: GREG WOOLSEY Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Parser Severity:S2 (Serious)
Version:5.6 OS:Any
Assigned to: CPU Architecture:Any

[1 Jun 2013 6:49] GREG WOOLSEY
Description:
I have discovered a case where user variables are not properly updated when evaluating rows in a query.  The below example SQL runs correctly in 5.1 on multiple platforms, and fails consistently on 5.5.27 (the only 5.5 version I have access to at the moment).  I do not see anything in the 5.5.x release notes between 5.5.28 and 5.5.31 (current) that relate to user variables.

I've included both the code that shows the problem and the tweak that provides me a workaround in the "how to repeat" section.

I'm not sure which parts of my test case are key to the bug, I've tried to slim it down to bare essentials.  I can't tell if the variables are interacting poorly with the CASE expression, CONCAT() function, or something else.

The workaround is to reference the variables when setting them, by multiplying the current value by 0 and adding that to the desired new value.  This logical null-op does something that fixes the bug.

How to repeat:
/* example showing the problem */
drop table if exists v_test;
create table v_test (
dsc varchar(20),
start_1 int,
end_1 int,
val int
);
insert into v_test
 values
 ('within', 10, 20, 19)
,('before', 10, 20, 01)
,('within', 10, 20, 19)
,('after', 10, 20, 30)
;

select hsum.dsc, @prior, @during, @after
from (select @rowVal:=-1, @prior:=0.0,@during:=0.0,@after:=0.0 ) x, 
 v_test as hsum
         where ifnull(
                case /* setup */
                     when (@rowVal:=hsum.val) is null
                     then concat(@prior:=0.0,@during:=0.0,@after:=0.0)
                     /* before */
                     when hsum.start_1 >= @rowVal
                     then concat(@prior:=(1.0),@during:=0.0,@after:=0.0)
                     /* within */
                     when hsum.start_1 < @rowVal and hsum.end_1 > @rowVal
                     then concat(@prior:=(0.0),@during:=(1.0),@after:=0.0)
                    /* after */
                     when hsum.end_1 <= @rowVal
                     then concat(@prior:=0.0,@during:=0.0,@after:=(1.0))
                      else concat(@prior:=0,@during:=0,@after:=0)
                end
              ,concat(@prior:=0,@during:=0,@after:=0))
       is not null
;

/* RESULTS:
+--------+--------+---------+--------+
| dsc    | @prior | @during | @after |
+--------+--------+---------+--------+
| within |    0.0 |     1.0 |    0.0 |
| before |    1.0 |     0.0 |    0.0 |
| within |    1.0 |     0.0 |    0.0 | <<<<< WRONG 
| after  |    0.0 |     0.0 |    1.0 |
+--------+--------+---------+--------+
*/

/* QUERY WITH WORKAROUND */
select hsum.dsc, @prior, @during, @after
from (select @rowVal:=-1, @prior:=0.0,@during:=0.0,@after:=0.0 ) x, 
 v_test as hsum
         where ifnull(
                case /* setup */
                     when (@rowVal:=hsum.val) is null
                     then concat(@prior:=(0.0*@prior),@during:=(0.0*@during),@after:=(0.0*@after))
                     /* before */
                     when hsum.start_1 >= @rowVal
                     then concat(@prior:=(0.0*@prior + 1.0),@during:=(0.0*@during),@after:=(0.0*@after))
                     /* within */
                     when hsum.start_1 < @rowVal and hsum.end_1 > @rowVal
                     then concat(@prior:=(0.0*@prior),@during:=(0.0*@during + 1.0),@after:=(0.0*@after))
                    /* after */
                     when hsum.end_1 <= @rowVal
                     then concat(@prior:=(0.0*@prior),@during:=(0.0*@during),@after:=(0.0*@after + 1.0))
                      else concat(@prior:=(0.0*@prior),@during:=(0.0*@during),@after:=(0.0*@after))
                end
              ,concat(@prior:=(0.0*@prior),@during:=(0.0*@during),@after:=(0.0*@after)))
       is not null
;

/* RESULTS (don't care about increasing precision)
+--------+---------+---------+---------+
| dsc    | @prior  | @during | @after  |
+--------+---------+---------+---------+
| within |    0.00 |    1.00 |    0.00 |
| before |   1.000 |   0.000 |   0.000 |
| within |  0.0000 |  1.0000 |  0.0000 | <<<<< CORRECT
| after  | 0.00000 | 0.00000 | 1.00000 |
+--------+---------+---------+---------+
*/

Suggested fix:
The parse/execution engine should correctly update the user variables for each WHERE clause row evaluation, as it does in 5.1.
[1 Jun 2013 15:47] MySQL Verification Team
Thank you for the bug report and good test case.

[miguel@tikal 5.5]$ bin/mysql -uroot test
<cut>
Server version: 5.5.33-debug Source distribution         

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective    
owners.                                                          

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> /* example showing the problem */
<cut>
    ->                 end
    ->               ,concat(@prior:=0,@during:=0,@after:=0))
    ->        is not null
    -> ;
+--------+--------+---------+--------+
| dsc    | @prior | @during | @after |
+--------+--------+---------+--------+
| within |    0.0 |     1.0 |    0.0 |
| before |    1.0 |     0.0 |    0.0 |
| within |    1.0 |     0.0 |    0.0 |
| after  |    0.0 |     0.0 |    1.0 |
+--------+--------+---------+--------+
4 rows in set (0.01 sec)

You can turn off this feature to get a quicker startup with -A    

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1                            
Server version: 5.6.13-debug Source distribution         

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
    ->        is not null
    -> ;

<cut>

+--------+--------+---------+--------+
| dsc    | @prior | @during | @after |
+--------+--------+---------+--------+
| within | 0.0    | 1.0     | 0.0    |
| before | 1.0    | 0.0     | 0.0    |
| within | 1.0    | 0.0     | 0.0    |
| after  | 0.0    | 0.0     | 1.0    |
+--------+--------+---------+--------+
4 rows in set (0.00 sec)

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1                            
Server version: 5.1.71-debug Source distribution         

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective    
owners.                                                          

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table if exists v_test;
Query OK, 0 rows affected (0.00 sec)

mysql> create table v_test (
    -> dsc varchar(20),     
    -> start_1 int,         
    -> end_1 int,           
    -> val int              
    -> );                   
Query OK, 0 rows affected (0.09 sec)

mysql> insert into v_test
    ->  values           
    ->  ('within', 10, 20, 19)
    -> ,('before', 10, 20, 01)
    -> ,('within', 10, 20, 19)
    -> ,('after', 10, 20, 30)
    -> ;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql>
mysql> select hsum.dsc, @prior, @during, @after
    -> from (select @rowVal:=-1, @prior:=0.0,@during:=0.0,@after:=0.0 ) x,
    ->  v_test as hsum
    ->          where ifnull(
    ->                 case /* setup */
    ->                      when (@rowVal:=hsum.val) is null
    ->                      then concat(@prior:=0.0,@during:=0.0,@after:=0.0)
    ->                      /* before */
    ->                      when hsum.start_1 >= @rowVal
    ->                      then concat(@prior:=(1.0),@during:=0.0,@after:=0.0)
    ->                      /* within */
    ->                      when hsum.start_1 < @rowVal and hsum.end_1 > @rowVal
    ->                      then concat(@prior:=(0.0),@during:=(1.0),@after:=0.0)
    ->                     /* after */
    ->                      when hsum.end_1 <= @rowVal
    ->                      then concat(@prior:=0.0,@during:=0.0,@after:=(1.0))
    ->                       else concat(@prior:=0,@during:=0,@after:=0)
    ->                 end
    ->               ,concat(@prior:=0,@during:=0,@after:=0))
    ->        is not null
    -> ;
+--------+--------+---------+--------+
| dsc    | @prior | @during | @after |
+--------+--------+---------+--------+
| within |    0.0 |     1.0 |    0.0 |
| before |    1.0 |     0.0 |    0.0 |
| within |    0.0 |     1.0 |    0.0 |
| after  |    0.0 |     0.0 |    1.0 |
+--------+--------+---------+--------+
4 rows in set (0.00 sec)

mysql>
[3 Jul 2013 5:47] GREG WOOLSEY
Also fails on MariaDB 5.5.30 (tested via Jelastic trial account).
[31 Aug 2014 5:24] GREG WOOLSEY
I ran into this again, and found a much simpler test case that shows exactly where the query optimizer, parser, and engine are failing (aside from actually digging into the code, as C isn't my forte).

This test is self-explanatory - it should label the column values as 'odd' or 'even'.  It works in 5.1, fails in 5.6, showing 3 as 'even'.

select c, @test
  from 
       (select @test:='') v,
       (select 1 as c union all select 2 as c union all select 3 as c union all select 4 as c) x
 where
       IF(mod(c, 2), @test:='odd', @test:='even') is not null
;

To fix this in 5.6, you must either reference the variable or a table column in the expression.

For example, using a no-op by combining NULLIF and IFNULL:

select @test
  from 
       (select @test:='') v,
       (select 1 as c union all select 2 as c union all select 3 as c union all select 4 as c) x
 where
       IF(mod(c, 2), @test:=ifnull(nullif(@test, @test),'odd'), @test:=ifnull(nullif(@test, @test),'even')) is not null
;

It looks like the problem is in some optimization code around SQL variable expression evaluation.  Someone tried to get fancy and track whether an expression evaluated to a constant result, and if so, only update the variable once from that expression.  However, this logic fails to account for queries with Control Flow functions such as IF() and CASE...END.  With these expressions it is expected to have different rows reach different value expressions.  If these different expressions all update the same variable, you get this problem.

Do not tell me the test case can be re-written, of course it can, it is the simplest query that I could find that shows the problem.  i could just put the IF(MOD()...) in the SELECT clause, but that's not what I need to do in real life.  In reality, the value returned by the complex Control Flow statement, either IF() or CASE...END, is computationally expensive over millions of rows, and needed multiple times to evaluate multiple return values in the SELECT clause.  Assigning its result to a variable in the WHERE clause allows for evaluation once per row, as long as the other internal variable expressions are actually re-evaluated properly.

Please revisit the user-variable expression optimization, and refine it to know about multiple assignments from different end points of Control Flow functions.

In the mean time, I'll have to stick with 

IFNULL(NULLIF(column_or_var, same_column_or_var), what_I_really_want)

and remember that every time something goes wrong.