MySQL Bugs: #54398: Explain output for estimated number of rows varies between runs on InnoDB

Bug #54398	Explain output for estimated number of rows varies between runs on InnoDB
Submitted:	10 Jun 2010 11:38	Modified:	25 Jun 2010 10:00
Reporter:	Olav Sandstå	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	Tests: Server	Severity:	S3 (Non-critical)
Version:	5.5.5, 5.6.99	OS:	Any
Assigned to:	Jimmy Yang	CPU Architecture:	Any
Tags:	pb2

Description:
When running the MTR test innodb_mrr against the mysql-next-mr-opt-backporting tree it fails randomly (about 2 out of 3 runs) with the following error:

explain select * from t2 force index (d) where d < 10;
 id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
-1	SIMPLE	t2	range	d	d	5	NULL	53	Using index condition; Using MRR
+1	SIMPLE	t2	range	d	d	5	NULL	47	Using index condition; Using MRR

The difference is in the explain output where the key_len field varies between being either 47 or 53. 

How to repeat:
1. Check out the source code from mysql-next-mr-opt-backporting

2. Compile, eg.

     cmake . -DWITH_DEBUG=1 -DWITH_INNOBASE_STORAGE_ENGINE=1
     make

3. Run test:

     ./mtr innodb_mrr

The test failure is still present if I disable both MRR and ICP when running the test:

  set session optimizer_switch='mrr=off,index_condition_pushdown=off';

Running the innodb_mrr test from the 6.0 codebase against mysql-next-mr-bugfixing tree also produce the same variation in the key_len field in the explain output:

 explain select * from t2 force index (d) where d < 10;
 id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
-1	SIMPLE	t2	range	d	d	5	NULL	53	Using where
+1	SIMPLE	t2	range	d	d	5	NULL	47	Using where

Updating the MySQL version to include mysql-next-mr.

Simplified test case for reproducing this against mysql-next-mr:

-- source include/have_innodb.inc

create table t1(a int);
insert into t1 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

# Try a very big rowid
create table t2 (a char(100), b char(100), c char(100), d int, 
                 filler char(10), key(d), primary key (a,b,c)) engine= innodb;
insert into t2 select A.a, B.a, B.a, A.a, 'filler' from t1 A, t1 B;
explain select * from t2 force index (d) where d < 10;

drop table t2;
drop table t1;

This bug is also present in 5.5.5 (using source from mysql-trunk-bugfixing) using the simplified test case.

I am not able to reproduce this behavior when using 5.1.48 (based on the latest source from mysql-5.1-bugteam).

A similar difference in explain output is also reported in Bug#45727.

There is a test case in subselect_sj2.test (in mysql-next-mr-opt-backporting)
(soon to be renamed subquery_sj_innodb.inc) that fails in a similar way
if I set all new optimizer switches to off:

set optimizer_switch='semijoin=off,materialization=off,firstmatch=off,loosescan=off,index_condition_pushdown=off,mrr=off';

create table t0 (a int);
insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

# Test overflow to MyISAM:
set max_heap_table_size=16384;
set join_buffer_size= 8192;

create table t3 (
   a int, 
   b int,
   key(b),
   pk1 char(200), pk2 char(200),
   primary key(pk1, pk2)
) engine=innodb;
insert into t3 select 
  A.a + 10*B.a, A.a + 10*B.a, A.a + 10*B.a, A.a + 10*B.a 
from t0 A, t0 B where B.a <5;

# -- [DISABLED Bug#54398]
#explain select * from t3 where b in (select a from t0);
select * from t3 where b in (select A.a+B.a from t0 A, t0 B where B.a<5);

The parameter is called innodb_stats_sample_pages. See http://www.innodb.com/doc/innodb_plugin-1.0/innodb-other-changes.html#innodb-other-changes... for details.

This is not a bug. As mentioned previously, it is due to the #rows comes from statistics.

The number of rows displayed in the explain result comes from a ha_innobase::info call to get innodb's table->stat_n_rows value.

The table->stat_n_rows is normally incremented in row_insert_for_mysql() for each insert. But every once a while, it is adjusted by statistics value. We call dict_update_statistics() during inserts every onces a while:

row_update_statistics_if_needed()
 
        /* Calculate new statistics if 1 / 16 of table has been modified
        since the last time a statistics batch was run, or if
        stat_modified_counter > 2 000 000 000 (to avoid wrap-around).
        We calculate statistics at most every 16th round, since we may have
        a counter table which is very small and updated very often. */

        if (counter > 2000000000
            || ((ib_int64_t)counter > 16 + table->stat_n_rows / 16)) {

                dict_update_statistics(table);
        }          

}

In btr_estimate_number_of_different_key_vals(), it calls btr_cur_open_at_rnd_pos_func() and page_cur_open_on_rnd_user_rec() to position to a randomly chosen position.

page_cur_open_on_rnd_user_rec()
{
            ...
            rnd = (ulint) (page_cur_lcg_prng() % n_recs);
}

So here comes the randomness, and eventually the estimate could be a few rec more or less than the actual rows inserted, and it becomes increasingly inaccurate as pages increase.

Breakpoint 2, dict_update_statistics_low (table=0x9f4db18, has_dict_mutex=0)
    at /home/jy/work/mysql5.5_6/mysql-trunk-innodb/storage/innobase/dict/dict0dict.c:4212
4212		if (table->ibd_file_missing) {
2: table->stat_n_rows = 79

4268		table->stat_n_rows = index->stat_n_diff_key_vals[
2: table->stat_n_rows = 79
(gdb) p index->stat_n_diff_key_vals[2]
$26 = 77

so you see the rows inserted (table->stat_n_rows )gets ret to 77 after dict_update_statistics_low() call.

The test cases should drop selecting on #rows

Thanks
Jimmy