Description:
Hi,
Let's create two tables :
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a char(2), b int(2), PRIMARY KEY (a,b));
INSERT INTO t1 VALUES ('a',1),('a',2),('a',3);
DROP TABLE IF EXISTS t2;
CREATE TABLE t2 (a int(2), b int(2), PRIMARY KEY (b));
INSERT INTO t2 VALUES (1,1),(1,2),(2,3);
Now I want to retrieve all the t1.b field where t1.a matches 'a', and the t2.a associated.
SELECT t2.a,t1.b FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' ORDER BY t1.b DESC;
+------+---+
| a | b |
+------+---+
| 2 | 3 |
| 1 | 2 |
| 1 | 1 |
+------+---+
Since t1 has an index on (a,b), the optimizer only needs the index file of t1, and no filesorting is needed since it retrieves the rows in the reverse order of the index.
EXPLAIN confirms this fact :
EXPLAIN SELECT t2.a,t1.b FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' ORDER BY t1.b DESC;
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+--------------------------+
| 1 | SIMPLE | t1 | ref | PRIMARY | PRIMARY | 2 | const | 2 | Using where; Using index |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | test.t1.b | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+--------------------------+
2 rows in set (0.00 sec)
Let's say now I still want the second column in the reverse index order, but I don't want any duplicate in the first column of this query.
SELECT t2.a,MAX(t1.b) FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' GROUP BY t2.a ORDER BY t1.b DESC;
+------+-----------+
| a | MAX(t1.b) |
+------+-----------+
| 2 | 3 |
| 1 | 2 |
+------+-----------+
2 rows in set (0.00 sec)
this returns what I want, and shouldn't use any file sorting since results in the second column are still returned in the reverse index order. Unfortunately in this case, the GROUP BY order seems to take precedence over the ORDER BY internally and thus a filesort is needed. (I assume MySQL fetches all the rows, then do the GROUP BY in a temp table, and at the end do the ORDER BY using filesort).
EXPLAIN SELECT t2.a,MAX(t1.b) FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' GROUP BY t2.a ORDER BY t1.b DESC;
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------------+
| 1 | SIMPLE | t1 | ref | PRIMARY | PRIMARY | 2 | const | 2 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | test.t1.b | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-----------------------------------------------------------+
2 rows in set (0.01 sec)
Is it possible to have a little bit smarter behaviour of MySQL in such case, and detect since we want MAX(t1.b) ... ORDER BY t1.b DESC, no filesorting is needed ? (and same thing for MIN(t1.b) ... ORDER BY t1.b ASC) (I assume it would speed up things a lot, especially when combined with a LIMIT clause).
Thanks !
Jocelyn
How to repeat:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a char(2), b int(2), PRIMARY KEY (a,b));
INSERT INTO t1 VALUES ('a',1),('a',2),('a',3);
DROP TABLE IF EXISTS t2;
CREATE TABLE t2 (a int(2), b int(2), PRIMARY KEY (b));
INSERT INTO t2 VALUES (1,1),(1,2),(2,3);
SELECT t2.a,t1.b FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' ORDER BY t1.b DESC;
EXPLAIN SELECT t2.a,t1.b FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' ORDER BY t1.b DESC;
SELECT t2.a,MAX(t1.b) FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' GROUP BY t2.a ORDER BY t1.b DESC;
EXPLAIN SELECT t2.a,MAX(t1.b) FROM t2,t1 WHERE t2.b=t1.b AND t1.a='a' GROUP BY t2.a ORDER BY t1.b DESC;