Bug #6172 RAND(a) should only accept constant values as arguments.
Submitted: 20 Oct 2004 10:50 Modified: 12 Feb 2007 19:43
Reporter: Konstantin Osipov
Status: Closed
Category:Server Severity:S3 (Non-critical)
Version:All OS:Any (All)
Assigned to: Georgi Kodinov Target Version:

[20 Oct 2004 10:50] Konstantin Osipov
Description:
Dlenev writes:
I see one problem here (it also presents in old code).
We call val_int() method for possibly non-const item in fix_fields(),
and I don't think that there is guarantee that it is safe to do this...

Particularly if we have field as argument to RAND function the value will
depend on such factor as state of TABLE::record[0] buffer... Thus
RAND(field) function starts behaving unpredictable for user.

The following example demonstrates problem:

create table rt (i int, j int);

insert into rt values (1,1),(2,2);

select rand(2), rand(i) from rt where j=1;
+------------------+------------------+
| rand(2)          | rand(i)          |
+------------------+------------------+
| 0.65558664654902 | 0.65558664654902 |
+------------------+------------------+

// BTW one can falsely assume that he should get different values here

insert into rt values (3,3);

select rand(2), rand(i) from rt where j=1;
+------------------+------------------+
| rand(2)          | rand(i)          |
+------------------+------------------+
| 0.65558664654902 | 0.90576975597606 |
+------------------+------------------+

Actually looking at the current semantic of RAND(N) function (random seed
is initialized only once per query) I don't think it is sensible to allow
non-constant parameters for it...

So I suggest to add check that will allow only constants during query
execution to be parameters for RAND() function (I think this also will
make semantic of RAND(N) a bit clearer for users).

How to repeat:
See description.
[20 Jul 2005 13:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/27367
[21 Jul 2005 10:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/27411
[6 Sep 2005 13:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/29359
[6 Sep 2005 13:21] Sergey Gluhov
Fixed in 4.1.15
[12 Sep 2005 19:52] Mark Matthews
Unfortunately, this new behavior is incompatible w/ SQLServer, MS Access and DB2 (at least
the databases I could find information for a RAND() function).

Here's the documentation from DB2, with regards to the single, optional argument to
RAND():

 "If an expression is specified, it is used as the seed value. The expression must be a
built-in SMALLINT or INTEGER data type with a value between 0 and 2 147 483 647."

This change happens to break the JDBC compliance testsuite which assumes that you can pass
a column in for this argument. I can re-configure the query, but it did end up pointing
out an incompatibility w/ the RDBMSs I mentioned above, if not others.

I'm re-opening this, as PeterG notes the following:

"pgulutzan | mark: bug#6172 reads as "this isn't working right, so let's disallow it"
instead of "this isn't working right, so let's fix it". besides, it's not a documented
limitation, in the section that i read. so i agree, you should re-open it or make a new
one."
[15 Nov 2005 17:11] [ name withheld ]
This "fix" broke my application. I was happy with the previous behavior. Now I don't know
how to fix my application.
[26 Mar 2006 10:29] Four Shades
Yes, this bug has broken my application too. I can't understand what the problem is with
allow non-constant arguments to RAND(). How else can you generate repeatable random
numbers based on fields in a recordset? It works fine in MS Access for goodness sake!
[26 Mar 2006 10:55] Konstantin Osipov
The assumption in the above comment is incorrect. The value of a non-constant argument to
RAND has never been used (in no version of MySQL). Instead, garbage that happened to be in
table->record[] at the moment of fix_fields() was used to initialize the random number
generator. We could perhaps allow the syntax for non-constant arguments, but the semantics
will be at best constant, at worst undefined.
[26 Mar 2006 11:00] Konstantin Osipov
After musing a little more about it, what we could do in 4.1:
 - allow the old syntax. In case the argument is not a constant, use an internal number
(e.g. query_id) to initialize the seed (once per query).
 - in a alpha/beta release: if the argument is not a constant, use its value to
reinitialize the seed on every invocation of rand.
[16 Jan 2007 12:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18177

ChangeSet@1.2392, 2007-01-16 13:44:06+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[18 Jan 2007 12:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18325

ChangeSet@1.2392, 2007-01-18 13:25:13+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[18 Jan 2007 12:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18326

ChangeSet@1.2392, 2007-01-18 13:34:45+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[18 Jan 2007 13:07] Konstantin Osipov
Approved the second patch by email.
[22 Jan 2007 10:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18521

ChangeSet@1.2392, 2007-01-22 11:50:21+02:00, gkodinov@macbook.gmz +4 -0
  Bug #6172: RAND(a) should only accept constant values as arguments
   RAND() must accept scalar expressions regardless of their kind.
   That includes both constant expressions and expressions that 
   depend on column values.
   When the expression is constant the random seed can be initialized
   at compile time.
   However when the expression is not constant the random seed must be
   initialized at each invocation (while it still can be allocated at
   compile time).
   Implemented the above rules by extending Item_func_rand::val_real()
   to initialize the random seed at the correct place.
[28 Jan 2007 3:08] Igor Babaev
The fix has been pushed into 5.0.36, 5.1.16-beta main trees.
[28 Jan 2007 3:12] Igor Babaev
A correction:
The fix has been pushed into 5.1.16-beta main tree only.
[12 Feb 2007 19:43] Paul DuBois
Noted in 4.1.15, 5.0.13, 5.1.3, 5.1.16 changelogs.

Before 4.1.15/5.0.13/5.1.3:
- RAND() with a constant initializer initializes the seed once when the
query is compiled, prior to query execution.
- The effect of a non-constant initializer is undefined.

As of 4.1.15/5.0.13/5.1.3:
- Only constant initializers are accepted, and they have the same effect
as before.

For 5.1, as of 5.1.16:
- Constant initializers have the same effect as before.
- For non-constant initializers (such as a column name), the seed is
initialized with the initializer for each invocation of RAND().  One
implication of this is that for equal argument values, RAND() will
return the same value each time.