Bug #59253 implementation of order by rand()
Submitted: 2 Jan 2011 21:16 Modified: 2 May 2012 18:40
Reporter: Harald Reindl Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: General Severity:S4 (Feature request)
Version:5.5.8 OS:Any
Assigned to: CPU Architecture:Any

[2 Jan 2011 21:16] Harald Reindl
Description:
ok, i know that random is not for what a database is designed
but why in the world is a test-table with > 7.000.000 rows where should be fetched 10.000 totally copied in a temp table if there is a primary key instead of fetching 10.000 primary-keys by random and after that the data?
_______________

mysql> select * from mytest order by rand() limit 10000;
Copying to tmp table | select * from mytest order by rand() limit 10000

-rw-rw---- 1 mysql mysql 3,6G 2011-01-02 22:03 #sql_5ae1_0.MYD
-rw-rw---- 1 mysql mysql 1,0K 2011-01-02 22:02 #sql_5ae1_0.MYI

10000 rows in set (3 min 30.92 sec)
10000 rows in set (3 min 29.45 sec)
_______________

CREATE TABLE `mytest` (
  `auto_key` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `test_data` longtext COLLATE latin1_german1_ci NOT NULL,
  PRIMARY KEY (`auto_key`)
) ENGINE=MyISAM AUTO_INCREMENT=7044705 DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci PACK_KEYS=1 CHECKSUM=1 DELAY_KEY_WRITE=1;

How to repeat:
* Create a table with primary-key and data-fields
* fill in really hughe records
* fetch 10.000 records

Suggested fix:
Look if there is a unique-key, fetch random keys as requested and the data after that instead copy some gb into a temp-table
[2 Jan 2011 21:30] Harald Reindl
Better sample to show how weak rand() is implemented

* Updated all records to 'A', so we are  on 200 MB table size
* fetching 5 random records also make a temp-file with all data

select * from mytest order by rand() limit 5;
5 rows in set (40.42 sec)
5 rows in set (14.68 sec)
5 rows in set (14.25 sec)

select * from mytest order by rand() limit 2;
2 rows in set (16.56 sec)
2 rows in set (15.71 sec)
2 rows in set (15.79 sec)

|  1 | SIMPLE      | mytest | ALL  | NULL          | NULL | NULL    | NULL | 7044704 | Using temporary; Using filesort |
[12 Jan 2011 3:42] Alexey Kishkin
as far as I understand it's not a bug but rather feature request
[12 Jan 2011 4:31] Harald Reindl
i agree :-)

in my applications i fetch since years the pri-keys in a array and do the random in php since i noticed that display a random shop-item is running two seconds instead 0.012 seconds whole generate-times

but how many others do not realize that as long tables have only few records and the load is low so nobody notices the temp-tables by luck :-)
[2 May 2012 18:40] Sveta Smirnova
There is verified bug #65177 about very same problem. So closing this one as duplicate of that.