Bug #34467 MyISAM and Heap tables should provide index stats without running ANALYZE TABLE
Submitted: 11 Feb 2008 17:32 Modified: 25 Oct 2008 0:02
Reporter: Mark Callaghan
Status: Verified
Category:Server: MyISAM Severity:S4 (Feature request)
Version:5.0.37, 4.1, 5.0, 5.1, 6.0 bzr OS:Any
Assigned to: Target Version:
Tags: statistics, indexes, show, analyze
Triage: Triaged: D5 (Feature request)

[11 Feb 2008 17:32] Mark Callaghan
Description:
InnoDB collects index cardinality stats for the optimizer without having to run ANALYZE
TABLE. MyISAM and Heap do not. They should collect stats without running ANALYZE, either
full or sampled, as they are used for temp tables and users frequently forget to run
this. Without stats, join orders and query plans are inconsistent and slow.

Note that MyISAM and Heap appear to do some form of sampling for
handler::records_in_range. But from the output of SHOW INDEXES, it is clear they don't do
the same for index cardinality stats.

How to repeat:
drop table if exists n, st, st2, it, it2;

create table n(i int);
insert into n values (0), (1), (2), (3), (4), (5), (6), (7), (8);
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;

create temporary table tt (i int, p int not null auto_increment, index(i,p), primary key
(p));
create table st (i int, p int not null auto_increment, index(i,p), primary key (p));
create table it (i int, p int not null auto_increment, index(i,p), primary key (p))
engine=innodb;

insert into tt select i, null from n;
insert into st select i, null from n;
insert into it select i, null from n;

-- Cardinality is NULL for non-PK columns on temp table
show indexes from tt
--------------
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation      
Cardinality     Sub_part        Packed  Null    Index_type      Comment
tt      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
tt      1       i       1       i       A       NULL    NULL    NULL    YES     BTREE
tt      1       i       2       p       A       NULL    NULL    NULL            BTREE
--------------

-- Cardinality values are NULL for non-PK columns on MyISAM table
show indexes from st
--------------

Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation      
Cardinality     Sub_part        Packed  Null    Index_type      Comment
st      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
st      1       i       1       i       A       NULL    NULL    NULL    YES     BTREE
st      1       i       2       p       A       NULL    NULL    NULL            BTREE
--------------

-- Cardinality values are correct for non-PK columns on InnoDB table
show indexes from it
--------------

Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation      
Cardinality     Sub_part        Packed  Null    Index_type      Comment
it      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
it      1       i       1       i       A       144     NULL    NULL    YES     BTREE
it      1       i       2       p       A       144     NULL    NULL            BTREE

Suggested fix:
Extend whatever sampling is done for handler::records_in_range to make index cardinality
stats valid.
[12 Feb 2008 10:15] Sveta Smirnova
Thank you for the report.

I can not repeat described behavior:

=====mysql-5.0.37-linux-i686=====
=====bug34467=====
drop table if exists n, st, st2, it, it2;
Warnings:
Note    1051    Unknown table 'n'
Note    1051    Unknown table 'st'
Note    1051    Unknown table 'st2'
Note    1051    Unknown table 'it'
Note    1051    Unknown table 'it2'
create table n(i int);
insert into n values (0), (1), (2), (3), (4), (5), (6), (7), (8);
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
create temporary table tt (i int, p int not null auto_increment, index(i,p), primary key
(p));
create table st (i int, p int not null auto_increment, index(i,p), primary key (p));
create table it (i int, p int not null auto_increment, index(i,p), primary key (p))
engine=innodb;
insert into tt select i, null from n;
insert into st select i, null from n;
insert into it select i, null from n;
show indexes from tt;
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation      
Cardinality     Sub_part        Packed  Null    Index_type      Comment
tt      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
tt      1       i       1       i       A       9       NULL    NULL    YES     BTREE
tt      1       i       2       p       A       144     NULL    NULL            BTREE
show indexes from st;
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation      
Cardinality     Sub_part        Packed  Null    Index_type      Comment
st      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
st      1       i       1       i       A       9       NULL    NULL    YES     BTREE
st      1       i       2       p       A       144     NULL    NULL            BTREE
show indexes from it;
Table   Non_unique      Key_name        Seq_in_index    Column_name     Collation      
Cardinality     Sub_part        Packed  Null    Index_type      Comment
it      0       PRIMARY 1       p       A       144     NULL    NULL            BTREE
it      1       i       1       i       A       144     NULL    NULL    YES     BTREE
it      1       i       2       p       A       144     NULL    NULL            BTREE
[12 Feb 2008 12:13] Sergei Golubchik
nevertheless, myisam indeed only collects cardinality data on ANALYZE
[24 Oct 2008 23:51] Mark Callaghan
Sveta -- if you use myisam_max_sort_file_size=0 then you will reproduce this. Index stats
are collected when repair by sort is used. They are not collected when repair by keycache
is used.

I am very confused by this. I don't think an on-disk temp table is used for this and SHOW
STATUS like '%tmp%' shows that no on-disk temp tables are used during the test. Yet MyISAM
code is used. Is that because this is a b-tree heap table?
[25 Oct 2008 0:02] Mark Callaghan
You should have a test that confirms that index stats are available and set to reasonable
values for myisam and temp tables.
[27 Oct 2008 9:17] Sveta Smirnova
Mark, thank you for the feedback.

Verified as described with all major versions.
[16 Dec 2008 19:59] Sveta Smirnova
Bug #40983 was marked as duplicate of this one.