Bug #1517 Feature Request Nearest match
Submitted: 10 Oct 2003 7:24 Modified: 26 Nov 2005 13:15
Reporter: Boyd Gerber Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version: OS:Sun
Assigned to: CPU Architecture:Any

[10 Oct 2003 7:24] Boyd Gerber
Description:
This is a feature request from Anthony Ball Anthony.Ball@ftid.com

Something that would make life much easier would be a way to select the nearest match above or below a requested value in a db.

For instance, I work with financial data, we have factor histories which contain iformation on securities. A table structure may look like:

CREATE TABLE factors (
cusip char(9) not null,
date char(8) not null,
factor char(11),
primary key (cusip,date)
)

I often need to get the most recent factor from a table like this for a list of cusips (that come from another table) based on a date. So I will need the most recent records for a list of cusips on or before 2002-10-31, for instance. Right now the only way I have been able to do this is with a temporary table. It seems that with a B-Tree it would be rather trivial to track down the nearest neighbor in one direction or the other if no direct match was found. I am not a database programmer, however, so I don't know :)

That is just my two cents... or if there is a better way to do it than temporary tables I would like to know it... 

How to repeat:
Feature Request
[26 Nov 2005 13:15] Valeriy Kravchuk
Thank you for a feature request. Sorry, but I think the task can be solved using existing SQL features in MySQL:

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.0.15-nt |
+-----------+
1 row in set (0.02 sec)

mysql> create table factors (
    ->   cusip char(9) not null,
    ->   cdate date not null,    -- use correct data type for dates!
    ->   factor char(11),
    -> primary key (cusip, cdate));
Query OK, 0 rows affected (0.06 sec)

mysql> insert into factors values ('1', '2005-11-10', 'factor a');
Query OK, 1 row affected (0.06 sec)

mysql> insert into factors values ('1', '2005-11-13', 'factor b');
Query OK, 1 row affected (0.03 sec)

mysql> insert into factors values ('1', '2005-11-15', 'factor c');
Query OK, 1 row affected (0.03 sec)

mysql> select * from factors;
+-------+------------+----------+
| cusip | cdate      | factor   |
+-------+------------+----------+
| 1     | 2005-11-10 | factor a |
| 1     | 2005-11-13 | factor b |
| 1     | 2005-11-15 | factor c |
+-------+------------+----------+
3 rows in set (0.03 sec)

Here is how 1 row for the date or the closes date can be retrieved:

mysql> select * from factors where cdate <='2005-11-14' order by cdate desc limit 1;
+-------+------------+----------+
| cusip | cdate      | factor   |
+-------+------------+----------+
| 1     | 2005-11-13 | factor b |
+-------+------------+----------+
1 row in set (0.00 sec)

Yes, it may be inefficient:

mysql> explain select * from factors where cdate <='2005-11-14' order by cdate desc limit 1;
+----+-------------+---------+------+---------------+------+---------+------+---
---+-----------------------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | ro
ws | Extra                       |
+----+-------------+---------+------+---------------+------+---------+------+---
---+-----------------------------+
|  1 | SIMPLE      | factors | ALL  | NULL          | NULL | NULL    | NULL |
 3 | Using where; Using filesort |
+----+-------------+---------+------+---------------+------+---------+------+---
---+-----------------------------+
1 row in set (0.03 sec)

But this problem can be solved by adding another index:

mysql> alter table factors add key(cdate);
Query OK, 3 rows affected (0.29 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> explain select * from factors where cdate <='2005-11-14' order by cdate desc limit 1;
+----+-------------+---------+-------+---------------+-------+---------+------+-
-----+-------------+
| id | select_type | table   | type  | possible_keys | key   | key_len | ref  |
rows | Extra       |
+----+-------------+---------+-------+---------------+-------+---------+------+-
-----+-------------+
|  1 | SIMPLE      | factors | range | cdate         | cdate | 3       | NULL |
   1 | Using where |
+----+-------------+---------+-------+---------------+-------+---------+------+-
-----+-------------+
1 row in set (0.02 sec)

So, I see no need for any new features here.