Bug #34467 MyISAM and Heap tables should provide index stats without running ANALYZE TABLE
Submitted: 11 Feb 2008 16:32 Modified: 24 Oct 2008 22:02
Reporter: Mark Callaghan Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S4 (Feature request)
Version:5.0.37, 4.1, 5.0, 5.1, 6.0 bzr OS:Any
Assigned to: CPU Architecture:Any
Tags: analyze, indexes, show, statistics

[11 Feb 2008 16:32] Mark Callaghan
Description:
InnoDB collects index cardinality stats for the optimizer without having to run ANALYZE TABLE. MyISAM and Heap do not. They should collect stats without running ANALYZE, either full or sampled, as they are used for temp tables and users frequently forget to run this. Without stats, join orders and query plans are inconsistent and slow.

Note that MyISAM and Heap appear to do some form of sampling for handler::records_in_range. But from the output of SHOW INDEXES, it is clear they don't do the same for index cardinality stats.

How to repeat:
drop table if exists n, st, st2, it, it2;

create table n(i int);
insert into n values (0), (1), (2), (3), (4), (5), (6), (7), (8);
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;

create temporary table tt (i int, p int not null auto_increment, index(i,p), primary key (p));
create table st (i int, p int not null auto_increment, index(i,p), primary key (p));
create table it (i int, p int not null auto_increment, index(i,p), primary key (p)) engine=innodb;

insert into tt select i, null from n;
insert into st select i, null from n;
insert into it select i, null from n;

-- Cardinality is NULL for non-PK columns on temp table
show indexes from tt
--------------
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation       Cardinality     Sub_part        Packed  Null    Index_type      Comment
tt      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
tt      1       i       1       i       A       NULL    NULL    NULL    YES     BTREE
tt      1       i       2       p       A       NULL    NULL    NULL            BTREE
--------------

-- Cardinality values are NULL for non-PK columns on MyISAM table
show indexes from st
--------------

Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation       Cardinality     Sub_part        Packed  Null    Index_type      Comment
st      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
st      1       i       1       i       A       NULL    NULL    NULL    YES     BTREE
st      1       i       2       p       A       NULL    NULL    NULL            BTREE
--------------

-- Cardinality values are correct for non-PK columns on InnoDB table
show indexes from it
--------------

Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation       Cardinality     Sub_part        Packed  Null    Index_type      Comment
it      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
it      1       i       1       i       A       144     NULL    NULL    YES     BTREE
it      1       i       2       p       A       144     NULL    NULL            BTREE

Suggested fix:
Extend whatever sampling is done for handler::records_in_range to make index cardinality stats valid.
[12 Feb 2008 9:15] Sveta Smirnova
Thank you for the report.

I can not repeat described behavior:

=====mysql-5.0.37-linux-i686=====
=====bug34467=====
drop table if exists n, st, st2, it, it2;
Warnings:
Note    1051    Unknown table 'n'
Note    1051    Unknown table 'st'
Note    1051    Unknown table 'st2'
Note    1051    Unknown table 'it'
Note    1051    Unknown table 'it2'
create table n(i int);
insert into n values (0), (1), (2), (3), (4), (5), (6), (7), (8);
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
create temporary table tt (i int, p int not null auto_increment, index(i,p), primary key
(p));
create table st (i int, p int not null auto_increment, index(i,p), primary key (p));
create table it (i int, p int not null auto_increment, index(i,p), primary key (p))
engine=innodb;
insert into tt select i, null from n;
insert into st select i, null from n;
insert into it select i, null from n;
show indexes from tt;
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation       Cardinality     Sub_part        Packed  Null    Index_type      Comment
tt      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
tt      1       i       1       i       A       9       NULL    NULL    YES     BTREE
tt      1       i       2       p       A       144     NULL    NULL            BTREE
show indexes from st;
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation       Cardinality     Sub_part        Packed  Null    Index_type      Comment
st      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
st      1       i       1       i       A       9       NULL    NULL    YES     BTREE
st      1       i       2       p       A       144     NULL    NULL            BTREE
show indexes from it;
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation       Cardinality     Sub_part        Packed  Null    Index_type      Comment
it      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
it      1       i       1       i       A       144     NULL    NULL    YES     BTREE
it      1       i       2       p       A       144     NULL    NULL            BTREE
[12 Feb 2008 11:13] Sergei Golubchik
nevertheless, myisam indeed only collects cardinality data on ANALYZE
[24 Oct 2008 21:51] Mark Callaghan
Sveta -- if you use myisam_max_sort_file_size=0 then you will reproduce this. Index stats are collected when repair by sort is used. They are not collected when repair by keycache is used.

I am very confused by this. I don't think an on-disk temp table is used for this and SHOW STATUS like '%tmp%' shows that no on-disk temp tables are used during the test. Yet MyISAM code is used. Is that because this is a b-tree heap table?
[24 Oct 2008 22:02] Mark Callaghan
You should have a test that confirms that index stats are available and set to reasonable values for myisam and temp tables.
[27 Oct 2008 8:17] Sveta Smirnova
Mark, thank you for the feedback.

Verified as described with all major versions.
[16 Dec 2008 18:59] Sveta Smirnova
Bug #40983 was marked as duplicate of this one.
[7 Nov 2010 22:50] Maxim Deviatov
Patch for MEMORY storage engine is available at

http://lists.mysql.com/internals/38181
[8 Nov 2010 13:10] James Day
Partial workaround: set myisam_max_sort_file_size to the amount of free disk space on the server so that repair by sort, which does currently generate statistics, is used, instead of repair with keycache, which doesn't. This is already best practice because the repair by sort method is much faster than the repair with keycache.
[26 Nov 2010 13:35] Sergey Vojtovich
For further progress on MEMORY engine see BUG#58520 - "Memory tables should provide cardinality for btree indexes".