Description:
This is a bug on the optimizer when using a function in a select query.
I have a table with 10+Million records.
In order to faster the queries based on a date range we store the first id on any given day. Thus, when querying on a datarange we have our boundaries from the primary key id.
Since we moved to Mysql 5.1.49, the optimizer is not using the primary key index when we use a function for the id range. It only uses it when we specifically use a value for the id.
Here's the explain on both queries.
mysql> desc select camp, count(*) from table1 where id >=date2table1id(20110413) group by camp having count(*) > 20 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table1
type: index
possible_keys: NULL
key: INDEX_CAMP_OFFER
key_len: 10
ref: NULL
rows: 13529455
Extra: Using where
1 row in set (0.00 sec)
mysql> select date2table1id(20110413);
+-----------------------+
| date2leadid(20110413) |
+-----------------------+
| 93412229 |
+-----------------------+
1 row in set (0.00 sec)
mysql> desc select camp, count(*) from table1 where id >=93412229 group by camp having count(*) > 20 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: table1
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 84864
Extra: Using where; Using temporary; Using filesort
show create function date2leadid \G
*************************** 1. row ***************************
Function: date2table1id
sql_mode:
Create Function: CREATE DEFINER=`mysqluser`@`%` FUNCTION `date2table1id`(d date) RETURNS int(11)
begin
declare id int;
select table1_id into id from table1_date_index where index_date=d;
return id;
end
character_set_client: latin1
collation_connection: latin1_swedish_ci
Database Collation: latin1_swedish_ci
As you can see the second one is using the proper index parsing 80000+ rows instead of millions.
How to repeat:
CREATE TABLE table1_date_index (index_date date NOT NULL, table1_id int(11) NOT NULL, PRIMARY KEY (`index_date`));
CREATE FUNCTION `date2table1id`(d date) RETURNS int(11)
begin
declare id int;
select table1_id into id from table1_date_index where index_date=d;
return id;
end
create table table1(id int primary key, camp int);
insert into table1_date_index (index_date, table1_id) values ('2011-04-13', 1);
insert into table1 (camp) values (1);
insert into table1 (camp) values (2);
desc select camp, count(*) from table1 where id >=0 group by camp;
desc select camp, count(*) from table1 where id >=date2table1id(20110413);
Suggested fix:
Mysql should operate as in previous version and use the primary index even when the result comes from a function.