Bug #6172 RAND(a) should only accept constant values as arguments.
Submitted: 20 Oct 2004 8:50 Modified: 12 Feb 2007 18:43
Reporter: Konstantin Osipov (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:All OS:Any (All)
Assigned to: Georgi Kodinov CPU Architecture:Any

[20 Oct 2004 8:50] Konstantin Osipov
Description:
Dlenev writes:
I see one problem here (it also presents in old code).
We call val_int() method for possibly non-const item in fix_fields(),
and I don't think that there is guarantee that it is safe to do this...

Particularly if we have field as argument to RAND function the value will
depend on such factor as state of TABLE::record[0] buffer... Thus
RAND(field) function starts behaving unpredictable for user.

The following example demonstrates problem:

create table rt (i int, j int);

insert into rt values (1,1),(2,2);

select rand(2), rand(i) from rt where j=1;
+------------------+------------------+
| rand(2)          | rand(i)          |
+------------------+------------------+
| 0.65558664654902 | 0.65558664654902 |
+------------------+------------------+

// BTW one can falsely assume that he should get different values here

insert into rt values (3,3);

select rand(2), rand(i) from rt where j=1;
+------------------+------------------+
| rand(2)          | rand(i)          |
+------------------+------------------+
| 0.65558664654902 | 0.90576975597606 |
+------------------+------------------+

Actually looking at the current semantic of RAND(N) function (random seed
is initialized only once per query) I don't think it is sensible to allow
non-constant parameters for it...

So I suggest to add check that will allow only constants during query
execution to be parameters for RAND() function (I think this also will
make semantic of RAND(N) a bit clearer for users).

How to repeat:
See description.
[20 Jul 2005 11:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/27367
[21 Jul 2005 8:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/27411
[6 Sep 2005 11:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/29359
[6 Sep 2005 11:21] Sergei Glukhov
Fixed in 4.1.15
[12 Sep 2005 17:52] Mark Matthews
Unfortunately, this new behavior is incompatible w/ SQLServer, MS Access and DB2 (at least the databases I could find information for a RAND() function).

Here's the documentation from DB2, with regards to the single, optional argument to RAND():

 "If an expression is specified, it is used as the seed value. The expression must be a built-in SMALLINT or INTEGER data type with a value between 0 and 2 147 483 647."

This change happens to break the JDBC compliance testsuite which assumes that you can pass a column in for this argument. I can re-configure the query, but it did end up pointing out an incompatibility w/ the RDBMSs I mentioned above, if not others.

I'm re-opening this, as PeterG notes the following:

"pgulutzan | mark: bug#6172 reads as "this isn't working right, so let's disallow it" instead of "this isn't working right, so let's fix it". besides, it's not a documented limitation, in the section that i read. so i agree, you should re-open it or make a new one."
[15 Nov 2005 16:11] [ name withheld ]
This "fix" broke my application. I was happy with the previous behavior. Now I don't know how to fix my application.
[26 Mar 2006 8:29] Four Shades
Yes, this bug has broken my application too. I can't understand what the problem is with allow non-constant arguments to RAND(). How else can you generate repeatable random numbers based on fields in a recordset? It works fine in MS Access for goodness sake!
[26 Mar 2006 8:55] Konstantin Osipov
The assumption in the above comment is incorrect. The value of a non-constant argument to RAND has never been used (in no version of MySQL). Instead, garbage that happened to be in table->record[] at the moment of fix_fields() was used to initialize the random number generator. We could perhaps allow the syntax for non-constant arguments, but the semantics will be at best constant, at worst undefined.
[26 Mar 2006 9:00] Konstantin Osipov
After musing a little more about it, what we could do in 4.1:
 - allow the old syntax. In case the argument is not a constant, use an internal number (e.g. query_id) to initialize the seed (once per query).
 - in a alpha/beta release: if the argument is not a constant, use its value to reinitialize the seed on every invocation of rand.
[16 Jan 2007 11:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18177

ChangeSet@1.2392, 2007-01-16 13:44:06+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[18 Jan 2007 11:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18325

ChangeSet@1.2392, 2007-01-18 13:25:13+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[18 Jan 2007 11:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18326

ChangeSet@1.2392, 2007-01-18 13:34:45+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[18 Jan 2007 12:07] Konstantin Osipov
Approved the second patch by email.
[22 Jan 2007 9:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18521

ChangeSet@1.2392, 2007-01-22 11:50:21+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[28 Jan 2007 2:08] Igor Babaev
The fix has been pushed into 5.0.36, 5.1.16-beta main trees.
[28 Jan 2007 2:12] Igor Babaev
A correction:
The fix has been pushed into 5.1.16-beta main tree only.
[12 Feb 2007 18:43] Paul DuBois
Noted in 4.1.15, 5.0.13, 5.1.3, 5.1.16 changelogs.

Before 4.1.15/5.0.13/5.1.3:
- RAND() with a constant initializer initializes the seed once when the
query is compiled, prior to query execution.
- The effect of a non-constant initializer is undefined.

As of 4.1.15/5.0.13/5.1.3:
- Only constant initializers are accepted, and they have the same effect
as before.

For 5.1, as of 5.1.16:
- Constant initializers have the same effect as before.
- For non-constant initializers (such as a column name), the seed is
initialized with the initializer for each invocation of RAND().  One
implication of this is that for equal argument values, RAND() will
return the same value each time.