Bug #95507 innodb_stats_method is not honored when innodb_stats_persistent=ON
Submitted: 23 May 21:34 Modified: 24 May 6:12
Reporter: Sergei Petrunia Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0, 8.0.16, 5.7.26 OS:Any
Assigned to: CPU Architecture:Any

[23 May 21:34] Sergei Petrunia
Description:
The value of innodb_stats_method is not honored when innodb_stats_persistent=ON (which is the default). It works as if innodb_stats_method=nulls_equal.

How to repeat:
== Part#1: let's use non-persistent stats and see how things should work ==
set global innodb_stats_persistent=off;
set global innodb_stats_transient_sample_pages=100;
set global innodb_stats_persistent_sample_pages=100;
show variables like 'innodb%stats%';
+--------------------------------------+-------------+   
| Variable_name                        | Value       |   
+--------------------------------------+-------------+   
| innodb_dict_stats_disabled_debug     | OFF         |   
| innodb_stats_auto_recalc             | ON          |   
| innodb_stats_include_delete_marked   | OFF         |   
| innodb_stats_method                  | nulls_equal |   
| innodb_stats_on_metadata             | OFF         |   
| innodb_stats_persistent              | OFF         |   
| innodb_stats_persistent_sample_pages | 100         |   
| innodb_stats_transient_sample_pages  | 100         |   
+--------------------------------------+-------------+   

create table ten(a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table one_k(a int);
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
create table t1 (pk int primary key , a int, key(a)) engine=innodb;
 # 400 K rows with NULLs:
insert into t1 select A.a+1000*B.a, null from one_k A , one_k B where B.a< 400;
# 101K rows with different non-NULL values:
insert into t1
select
  A.a+1000*B.a,A.a+1000*B.a from one_k A, one_k B
where B.a between 400 and 500;
analyze table t1;

mysql> show keys from t1;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| t1    |          0 | PRIMARY  |            1 | pk          | A         |      523081 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| t1    |          1 | a        |            1 | a           | A         |       97346 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

### Cardinality of a = 97K. Close to 100K, good.

set global innodb_stats_method=nulls_unequal;
analyze table t1;

mysql> show keys from t1;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| t1    |          0 | PRIMARY  |            1 | pk          | A         |      482866 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| t1    |          1 | a        |            1 | a           | A         |      482866 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

### Cardinality of a is now 482K, which is close to 500K. Good, it works as
### expected.

==== part#2: Now, let's try the same with persistent InnoDB stats ====
drop table t1;
set global innodb_stats_persistent=on;
set global innodb_stats_method=nulls_equal;

mysql> show variables like 'innodb%stats%';
+--------------------------------------+-------------+
| Variable_name                        | Value       |
+--------------------------------------+-------------+
| innodb_dict_stats_disabled_debug     | OFF         |
| innodb_stats_auto_recalc             | ON          |
| innodb_stats_include_delete_marked   | OFF         |
| innodb_stats_method                  | nulls_equal |
| innodb_stats_on_metadata             | OFF         |
| innodb_stats_persistent              | ON          |
| innodb_stats_persistent_sample_pages | 100         |
| innodb_stats_transient_sample_pages  | 100         |
+--------------------------------------+-------------+

create table t1 (pk int primary key , a int, key(a)) engine=innodb;
insert into t1 select A.a+1000*B.a, null from one_k A , one_k B where B.a< 400;
insert into t1
select
  A.a+1000*B.a,A.a+1000*B.a from one_k A, one_k B
where B.a between 400 and 500;
analyze table t1;

mysql> show keys from t1;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| t1    |          0 | PRIMARY  |            1 | pk          | A         |      518296 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| t1    |          1 | a        |            1 | a           | A         |      100541 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

##### Ok, Cardinality of a is ~100K. Works as expected.

##### Now, let's tell it to consider NULLs unequal.

mysql> set global innodb_stats_method=nulls_unequal;
mysql> analyze table t1;

mysql> show keys from t1;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| t1    |          0 | PRIMARY  |            1 | pk          | A         |      548692 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| t1    |          1 | a        |            1 | a           | A         |      100541 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

### Cardinality (a) is still 100K ?

(It is re-calculated. I can set a different value for
innodb_stats_persistent_sample_pages and re-run the command. I get a slightly
different cardinality number but still, it is around 100K, while it should be around
500K).
[24 May 6:12] Umesh Shastry
Hello Sergei,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[28 Jul 22:40] Jesper wisborg Krogh
Posted by developer:
 
A possible workaround is to use innodb_stats_method = nulls_ignored which is not ignored. This makes InnoDB only consider non-NULL values but still expand the statistics to cover all rows, so the cardinality will will be as all rows had a not-NULL value with the values distributed among the values found during the index statistics collection.