Bug #54312 insert ... select .. order by rand(): difference depending on indexes.
Submitted: 7 Jun 2010 22:19 Modified: 8 Jun 2010 4:50
Reporter: Some One Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.47, 5.1.48-bzr OS:Any
Assigned to: CPU Architecture:Any

[7 Jun 2010 22:19] Some One
Description:
If you run a query to copy rows from one table to another, shuffling them on the fly, the result is different depending on the existence of indexes.

i used these 2 queries for replicating the table and copy shuffled data:

create table table1_copy like table1;
insert table1_copy select * from table1 order by rand();

If table1 has no index, the query will work as expected. Otherwise, the query will simply copy the table1 as is, without mixing the data.

Is this a bug?

How to repeat:
 /usr/local/mysql/bin/mysql -p -uroot
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 20
Server version: 5.1.47 MySQL Community Server (GPL)

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> create table test1 (id int);
Query OK, 0 rows affected (0.00 sec)

mysql> insert test1 values(1),(2),(3),(4),(5);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> create table test1_copy like test1;
Query OK, 0 rows affected (0.00 sec)

mysql> insert test1_copy select * from test1 order by rand();
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from test1_copy;
+------+
| id   |
+------+
|    5 |
|    1 |
|    4 |
|    2 |
|    3 |
+------+
5 rows in set (0.00 sec)

mysql> create table test2 (id int auto_increment,primary key(id));
Query OK, 0 rows affected (0.00 sec)

mysql> insert test2 values(1),(2),(3),(4),(5);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> create table test2_copy like test2;
Query OK, 0 rows affected (0.00 sec)

mysql> insert test2_copy select * from test2 order by rand();
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from test2_copy;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
|  4 |
|  5 |
+----+
5 rows in set (0.01 sec)

mysql>
[8 Jun 2010 4:50] Valeriy Kravchuk
It is easy to repeat on 5.1.48 as well. The difference is explained by the following results to some extent:

mysql> explain select * from test2 order by rand();
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                                        |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------------------------------------------+
|  1 | SIMPLE      | test2 | index | NULL          | PRIMARY | 4       | NULL |    5 | Using index; Using temporary; Using filesort |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------------------------------------------+
1 row in set (0.03 sec)

mysql> explain select * from test1 order by rand();
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | SIMPLE      | test1 | ALL  | NULL          | NULL | NULL    | NULL |    5 | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
1 row in set (0.00 sec)

That is, when you add index it is used to scan all rows instead of the table itself. But as filesort is applied in both cases I'd expect for ORDER BY rand() to still work. So, I'd call this a bug.