Bug #17502 RAND() Function Documentation Error
Submitted: 17 Feb 2006 5:50 Modified: 26 Jan 2007 16:32
Reporter: Ben Birch Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S4 (Feature request)
Version:5.0 and earlier OS:Any (All)
Assigned to: Paul DuBois CPU Architecture:Any

[17 Feb 2006 5:50] Ben Birch
Description:
Rand Documentation states:

Returns a random floating-point value v between 0  and 1 inclusive (that is, in the range 0 <= v  <= 1.0)
....
To obtain a random integer R in the range i <= R <= j, use the expression FLOOR(i + RAND() * (j – i + 1)). For example, to obtain a random integer in the range of 7 to 12 inclusive, you could use the following statement:

If Rand() can actually produce a value of 1, then the function above would occasionally produce a value of j+1.

I think this is a documentation error that either, Rand can not produce a value of 1 or the above function is incorrect.

How to repeat:
Error is self evident from documentation

Suggested fix:
It would be better perhaps if RAND() could not produce a value of 1 but rather produced a range of 0 <= V < 1. Either that, or the documentation needs to be fixed.

Perhaps this function is more accurate
FLOOR(i + MOD(RAND(), 1) * (j – i + 1))

though it produces a slight bias towards a 0 result.
[17 Feb 2006 14:51] Valeriy Kravchuk
Thank you for a problem report. Looks like http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html should be corrected.
[8 Mar 2006 21:25] Kjetil Watnedal
As far as I see http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html is still NOT correct.
Same goes for http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html
[9 Mar 2006 10:14] Valeriy Kravchuk
Thank you for your persistence and additional information. When this problem will be solved, this bug report will be in "Closed" status. Until that time documentation may not appear fixed.
[9 Mar 2006 13:31] Paul DuBois
Thank you for your bug report. This issue has been addressed in the
documentation. The updated documentation will appear on our website
shortly, and will be included in the next release of the relevant
product(s).

Additional info:

Off-by-one error in formula.  No "+1" needed.
[22 Mar 2006 6:29] Ben Birch
The change to the documentation was wrong, or at least very misleading. Off by one error is incorrect, not because the range of results returned is wrong, but because of the extreme bias away from certain values.  

I ran the adjusted function FLOOR(7 + (RAND() * 5))  half a billion times and failed to get a single result of 12. This is because, to produce a result of 12, RAND() would need to produce a perfect 1.0. According to the documentation this is possible, but I've yet to see it actually occur.

The function I provided earlier will however give accurate results, as using the MOD() function eliminates the oh-so-rare chance that you will get a result outside the range you want.

after selecting FLOOR(7 + (5 * RAND())) 500,000,000 times I got these results.
   7: 100003913
   8: 100022517
   9: 99997314
  10: 99986863
  11: 99989393
  12: 0

To summarize; to generate a number between i and j

FLOOR(i + (RAND() * (j - i + 1)))
will give an unbiased spread in the range requested, but can theoretically return j+1

FLOOR(i + (RAND() * (j - i)))
will give a strongly biased spread, because the chance of actually getting j is extremely low

FLOOR(i + (MOD(RAND(), 1) * (j - i +1))
will return all values in the spread with an extremely small bias to i

I understand the reasons behind not wanting to require such a convoluted function just to produce a random integer in a range, so maybe it would be better to bite the bullet and fix rand() so that it can not produce an exact 1.0 result.
[23 Mar 2006 10:08] Valeriy Kravchuk
Let me check again.
[21 Apr 2006 13:50] Valeriy Kravchuk
For me the idea to implement RAND() returning values 0 <= value < 1 sounds like a reasonable feature request. But I am not sure it will be implemented, because it, theoretically, can brake backward compatibility. So, let the developers decide what is the best way to solve the problem: to implement a new version of rand() or to give a more detailed description (like in the last comment) in the manual.
[26 Apr 2006 1:28] Ben Birch
I do not believe that it is possible to break any real world applications with this change. If fact, given that many developers have probably used the incorrect example in the documentation to generate integers, changing rand() such that 0 <= rand() < 1 will actually fix many incorrectly implemented solutions.
[26 Jan 2007 10:53] Georgi Kodinov
Looking at the current mysql-5.0.36-BK it is evident that the RAND() function cannot possibly return 1 : basically its value is calculated like this :

result = (f(<seed>, <previous_value>) % <constant>) / <constant>

This is guaranteed never to return 1 because <constant> is the same number.

So the documentation needs to be updated accordingly.
[26 Jan 2007 16:30] Paul DuBois
Changing category to Server:Documentation because
this has been determined to be a docs issue.
[26 Jan 2007 16:30] Paul DuBois
Changing category to Server:Documentation because
this has been determined to be a docs issue.
[26 Jan 2007 16:32] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

Changed the description for RAND() so that the upper
bound on the rand is non-inclusive (that is, it is <, not
<=).