Bug #28737 Falcon can not use index for sorting
Submitted: 29 May 2007 8:24 Modified: 26 May 2010 17:49
Reporter: Yuan WANG Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S4 (Feature request)
Version: 6.0.1-alpha-log OS:Linux
Assigned to: CPU Architecture:Any
Tags: F_NEW FEATURE

[29 May 2007 8:24] Yuan WANG
Description:
I have done a simple test and found Falcon can not use index for sorting.

How to repeat:
mysql> create table t(a int primary key, b int, c int) engine = falcon;
mysql> create index idx_t_ba on t(b, a);
mysql> insert into t values(1, 1, 1);
mysql> insert into t values(2, 1, 1);
mysql> explain select * from t where b = 1 order by a desc;
+----+-------------+-------+------+---------------+----------+---------+-------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+----------+---------+-------+------+-----------------------------+
| 1 | SIMPLE | t | ref | idx_t_ba | idx_t_ba | 5 | const | 10 | Using where; Using filesort |
+----+-------------+-------+------+---------------+----------+---------+-------+------+-----------------------------+
mysql> alter table t engine = innodb;
mysql> explain select * from t where b = 1 order by a desc;
+----+-------------+-------+------+---------------+----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+----------+---------+-------+------+-------------+
| 1 | SIMPLE | t | ref | idx_t_ba | idx_t_ba | 5 | const | 1 | Using where |
+----+-------------+-------+------+---------------+----------+---------+-------+------+-------------+

So you can see, InnoDB can use index for sorting but Falcon can't. In practice this should be a major problem, I think.
[29 May 2007 12:12] Hakan Küçükyılmaz
Verified as described.
[29 May 2007 13:07] Hakan Küçükyılmaz
It looks like that Falcon is using the index for sorting and the EXPLAIN output is wrong.

SET @@autocommit = 0;

DELIMITER //
CREATE PROCEDURE p1()
BEGIN
  SET @v = 1;

  WHILE @v < 500 * 1024 DO
    IF mod(@v, 1000) = 0 THEN
      SET @v = @v + 1;
      INSERT INTO t VALUES (@v, 2, @v);
      COMMIT;
      SELECT @v;
    END IF;

    SET @v = @v + 1;
    INSERT INTO t VALUES (@v, 1, @v);
  END WHILE;
END;
//

DELIMITER ;
CALL p1;
COMMIT;

SELECT * FROM t WHERE b = 1 ORDER BY a DESC;
|      3 |    1 |      3 |
|      2 |    1 |      2 |
+--------+------+--------+
511488 rows in set (3.53 sec)

ALTER TABLE t Engine InnoDB;
SELECT * FROM t WHERE b = 1 ORDER BY a DESC;
|      3 |    1 |      3 |
|      2 |    1 |      2 |
+--------+------+--------+
511488 rows in set (5.74 sec)
[29 May 2007 15:06] Ann Harrison
At the moment Falcon never uses an index to avoid a sort.

Generally retrieving records in natural order and sorting them
is faster than retrieving them in index order.  There are
exceptions.  

One exception is when the index order is the storage order
as is the case for InnoDB primary keys.  Because of space
reutilization in Falcon, even when records are stored in
index order (e.g. autonumber fields) their placement is not
guaranteed to be ordered.

The other exception is when the query would normally return
a large number of rows (e.g. no where clause), but contains
a limit clause.  In that case, retrieving records in indexed
order is a good procedure, however, the server doesn't make
that information available to storage engines.
[30 May 2007 3:17] Yuan WANG
So this is not a bug but rather a choice of design or an issue between MySQL server and storage engines? However, I think this would be problematic in real applications if be can not use index to retrieve rows in its order. Retrieving top N rows is quite common. For example, you will see top N blogs, top N emails, top N forum posts in descending order of time, or top N blogs, top N forum posts in descending order of popularity.  These usages can be identified by the following query pattern.
  SELECT ... FROM t WHERE ... ORDER BY ... LIMIT N OFFSET M

There are more complicate situations that there are no LIMIT clause for each table, but LIMIT for the whole query. For example, we can store meta data and content of articles in two tables, and use a join to retrieve top N articles. 
  SELECT ... FROM article, article_content where article.user_id = 1 and article.id = article_content.article_id order by article.publish_time desc limit 10;

The common technique for this example is to create an index on article(user_id, publish_time). However, if we can not use that index to retrieve rows in descending order of publish time, we will have to retrieve all articles and sort, which is slow.

So I think Falcon should provide the ability to retrieve rows in its order, at least as a choice.
[19 Jun 2007 23:43] Calvin Sun
Downgrade to P3 since it is a feature request. We will re-evaluate the design after beta1.
[24 May 2009 3:04] Yuan WANG
In current version of Falcon. Falcon will use index for sorting if a limit is given. So I think this bug has been fixed?