Bug #33232 Documentation/changelog note of RAND() interaction with DISTINCT changing
Submitted: 14 Dec 2007 0:55 Modified: 8 Jan 2008 16:48
Reporter: Trent Lloyd Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.0.6 OS:Any
Assigned to: Konstantin Osipov CPU Architecture:Any

[14 Dec 2007 0:55] Trent Lloyd
Description:
The release of 5.0.6 changed the behavior of RAND()'s interation with DISTINCT.  This happened between 5.0.4 and 5.0.6

In old case (<5.0.6) it would calculate RAND() after the DISTINCT was applied and return 1 row, now it can return multiple rows.  This behavior exists in 4.0.

It is interesting to note that if you drop the primary key on test_b, then 
the 4.0.26 output matches the 5.0.44 output (in terms of row count, I 
mean, obviously the random numbers are different). Dropping the primary 
key on test_a has no effect.

This I suspect this was a bug fix, but it would be nice if there was some kind of mention of the bug or issue in the docs.  I have attempted to track down exactly what bug/fix from 5.0.4 to 5.0.6 caused this but was not successful.

How to repeat:
Simple Test Case:
CREATE TABLE test_a (
id int(10) unsigned NOT NULL default '0',
PRIMARY KEY (id)
) TYPE=MyISAM;

INSERT INTO test_a VALUES (1);
INSERT INTO test_a VALUES (2);
INSERT INTO test_a VALUES (3);

CREATE TABLE test_b (
id int(10) unsigned NOT NULL default '0',
secondary_id int(10) unsigned NOT NULL default '0',
PRIMARY KEY (id,secondary_id)
) TYPE=MyISAM;

INSERT INTO test_b VALUES (1,1);
INSERT INTO test_b VALUES (2,2);
INSERT INTO test_b VALUES (2,200);
INSERT INTO test_b VALUES (2,205);
INSERT INTO test_b VALUES (3,3);

SELECT DISTINCT test_a.id
FROM test_a, test_b WHERE test_a.id = test_b.id;

SELECT DISTINCT test_a.id, RAND()
FROM test_a, test_b WHERE test_a.id = test_b.id;

Expected Results:
MySQL version '4.0.26-max-log' produces the following output:

+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
3 rows in set (0.00 sec)

+----+-------------------+
| id | RAND() |
+----+-------------------+
| 1 | 0.331300282228086 |
| 2 | 0.632497280493842 |
| 3 | 0.168585586518595 |
+----+-------------------+
3 rows in set (0.00 sec)

MySQL version '5.0.44sp1-enterprise-gpl' produces this output:

+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
3 rows in set (0.00 sec)

+----+-------------------+
| id | RAND() |
+----+-------------------+
| 1 | 0.611777865897657 |
| 2 | 0.605727062193423 |
| 2 | 0.193301744007786 |
| 2 | 0.14932756419231 |
| 3 | 0.166732619671833 |
+----+-------------------+
5 rows in set (0.00 sec)

Suggested fix:
Document this change/bug more clearly.