Description:
I have tables with the following details
create table gridimages (
ID INT UNSIGNED AUTO_INCREMENT NOT NULL,
GenDate DATETIME NOT NULL,
ForDate DATETIME,
CrDate DATETIME NOT NULL,
AInterval INT UNSIGNED NOT NULL,
Projn INT UNSIGNED NOT NULL,
DataType CHAR(8) NOT NULL,
ErrorType CHAR(1) NOT NULL,
Tag VARCHAR(40) NOT NULL,
Data LONGTEXT NOT NULL,
Resln INT UNSIGNED NOT NULL,
Cappi INT UNSIGNED NOT NULL,
Clutter CHAR(1) NOT NULL,
Occult CHAR(1) NOT NULL,
Atten CHAR(1) NOT NULL,
Hail CHAR(1) NOT NULL,
CalFactor DOUBLE,
User VARCHAR(10) NOT NULL,
primary key (ID),
index gendix (GenDate),
index fordix (ForDate),
index crdix (CrDate),
index typeix (DataType, ErrorType),
index tagix (Tag)
) type = InnoDB;
create table projns (
ID INT UNSIGNED AUTO_INCREMENT NOT NULL,
Hash INT UNSIGNED NOT NULL,
StatnId INT UNSIGNED NOT NULL,
Parameters LONGTEXT NOT NULL,
primary key (ID),
index hashix (Hash)
) type = InnoDB;
The following query uses the wrong index.
select * from gridimages,projns where GenDate>='2003-05-01 00:00:00'
and GenDate<'2003-05-01 01:00:00' and gridimages.Projn=projns.ID and
projns.StatnId=54 and DataType='rainfall' and ErrorType='N' and Tag=''
order by GenDate
Explain reports
+------------+------+---------------------+--------+---------+
| table | type | possible_keys | key | key_len |
+------------+------+---------------------+--------+---------+
| gridimages | ref | gendix,typeix,tagix | typeix | 9 |
| projns | ALL | PRIMARY | NULL | NULL |
+------------+------+---------------------+--------+---------+
+-------------+------+----------------------------------------------+
| ref | rows | Extra
+-------------+------+----------------------------------------------+
| const,const | 90 | Using where; Using temporary; Using filesort |
| NULL | 2 | Using where |
+-------------+------+----------------------------------------------+
The DataType field only contains two values: 'rainfall' and
'rawrefl'. There are equal numbers of records using each type.
The ErrorType field is always 'N' and the Tag field is always ''.
The GenDate field contains a distinct date for each pair of rainfall and
rawrefl records. There is only one relevant record in the projns table.
If I force the use of the date index:
select * from gridimages use index (gendix),projns where GenDate>='2003-05-01 00:00:00'
and GenDate<'2003-05-01 01:00:00' and gridimages.Projn=projns.ID and
projns.StatnId=54 and DataType='rainfall' and ErrorType='N' and Tag=''
order by GenDate
then explain reports:
+------------+-------+---------------+--------+---------+
| table | type | possible_keys | key | key_len |
+------------+-------+---------------+--------+---------+
| projns | ALL | PRIMARY | NULL | NULL |
| gridimages | range | gendix | gendix | 8 |
+------------+-------+---------------+--------+---------+
+------+------+----------------------------------------------+
| ref | rows | Extra
+------+------+----------------------------------------------+
| NULL | 2 | Using where; Using temporary; Using filesort |
| NULL | 11 | Using where
+------+------+----------------------------------------------+
When there are 20000 records, using typeix will result in scanning 10000
records whereas using gendix will result in scanning only a few tens
of records.
If I remove the datatype from the query:
select * from gridimages,projns where GenDate>='2003-05-01 00:00:00'
and GenDate<'2003-05-01 01:00:00' and gridimages.Projn=projns.ID and
projns.StatnId=54 and Tag=''
order by GenDate
+------------+------+---------------+-------+---------+
| table | type | possible_keys | key | key_len |
+------------+------+---------------+-------+---------+
| gridimages | ref | gendix,tagix | tagix | 40 |
| projns | ALL | PRIMARY | NULL | NULL |
+------------+------+---------------+-------+---------+
+-------+------+----------------------------------------------+
| ref | rows | Extra
+-------+------+----------------------------------------------+
| const | 90 | Using where; Using temporary; Using filesort |
| NULL | 2 | Using where |
+-------+------+----------------------------------------------+
The optimiser avoids using the gendix index although it is best.
tagix is the worst possible choice.
How to repeat:
See above
Suggested fix:
Perhaps what is missing is the ability to analyse the tables like with myisam.