MySQL Bugs: #99924: The record per key value from InnoDB is not suitable when n

Bug #99924	The record per key value from InnoDB is not suitable when n_diff is zero
Submitted:	18 Jun 2020 3:27	Modified:	23 Jun 2020 12:11
Reporter:	Ze Yang (OCA)	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S5 (Performance)
Version:	8.0	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
When server get the dict_index_t statistics from InnoDB, the flag is set HA_STATUS_NO_LOCK.
So, when server read the innodb_rec_per_key, the n_diff value may be 0 which is the value not set and the table->stat_n_rows may be set.

When n_diff is 0, InnoDB set rec_per_key to records(which is table->stat_n_rows).
This is not suitable for index rec_per_key.

```
/** Calculate Record Per Key value. Excludes the NULL value if
innodb_stats_method is set to "nulls_ignored"
@param[in]  index   dict_index_t structure
@param[in]  i   column we are calculating rec per key
@param[in]  records   estimated total records
@return estimated record per key value */
rec_per_key_t innodb_rec_per_key(const dict_index_t *index, ulint i,
                                 ha_rows records) {
  rec_per_key_t rec_per_key;
  ib_uint64_t n_diff;

  ut_a(index->table->stat_initialized);

  ut_ad(i < dict_index_get_n_unique(index));
  ut_ad(!dict_index_is_spatial(index));

  if (records == 0) {
    /* "Records per key" is meaningless for empty tables.
    Return 1.0 because that is most convenient to the Optimizer. */
    return (1.0);
  }

  n_diff = index->stat_n_diff_key_vals[i];

  if (n_diff == 0) {
    rec_per_key = static_cast<rec_per_key_t>(records);
  }
```

In server code, when not rec_per_key value, it assume rec_per_key is 1 or 10. I think this is more suitable for index rec_per_key value.

How to repeat:
n_diff.test

```
set cte_max_recursion_depth=1000000;
CREATE TABLE t1 AS WITH RECURSIVE t(a, b,c, d) AS (     SELECT 1, 1, 1, 1 UNION ALL     SELECT a+1, b+1, c+1, d+1 FROM t WHERE a < 50000 ) SELECT a, b,c ,d FROM t;
ALTER TABLE t1 ADD index i_b(b);
ANALYZE TABLE t1;
SET @@eq_range_index_dive_limit=1;
explain select * from t1  force index (i_b) where b in (1, 2, 3, 4);
update mysql.innodb_index_stats set stat_value=0 where database_name='test' and table_name='t1' and index_name='i_b' and stat_name='n_diff_pfx01';
FLUSH TABLES t1;
explain select * from t1  force index (i_b);
explain select * from t1  force index (i_b) where b in (1, 2, 3, 4);
explain select * from t1  force index (i_b) where b in (1, 2, 3, 4, 5);
DROP TABLE t1;
```

n_diff.result
The records estimated for `explain select * from t1  force index (i_b) where b in (1, 2, 3, 4);` is not suitable.

```
set cte_max_recursion_depth=1000000;
CREATE TABLE t1 AS WITH RECURSIVE t(a, b,c, d) AS (     SELECT 1, 1, 1, 1 UNION ALL     SELECT a+1, b+1, c+1, d+1 FROM t WHERE a < 50000 ) SELECT a, b,c ,d FROM t;
ALTER TABLE t1 ADD index i_b(b);
ANALYZE TABLE t1;
Table Op  Msg_type  Msg_text
test.t1 analyze status  OK
SET @@eq_range_index_dive_limit=1;
explain select * from t1  force index (i_b) where b in (1, 2, 3, 4);
id  select_type table partitions  type  possible_keys key key_len ref rows  filtered  Extra
1 SIMPLE  t1  NULL  range i_b i_b 9 NULL  4 100.00  Using index condition
Warnings:
Note  1003  /* select#1 */ select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c`,`test`.`t1`.`d` AS `d` from `test`.`t1` FORCE INDEX (`i_b`) where (`test`.`t1`.`b` in (1,2,3,4))
update mysql.innodb_index_stats set stat_value=0 where database_name='test' and table_name='t1' and index_name='i_b' and stat_name='n_diff_pfx01';
FLUSH TABLES t1;
explain select * from t1  force index (i_b);
id  select_type table partitions  type  possible_keys key key_len ref rows  filtered  Extra
1 SIMPLE  t1  NULL  ALL NULL  NULL  NULL  NULL  49970 100.00  NULL
Warnings:
Note  1003  /* select#1 */ select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c`,`test`.`t1`.`d` AS `d` from `test`.`t1` FORCE INDEX (`i_b`)
explain select * from t1  force index (i_b) where b in (1, 2, 3, 4);
id  select_type table partitions  type  possible_keys key key_len ref rows  filtered  Extra
1 SIMPLE  t1  NULL  range i_b i_b 9 NULL  199880  100.00  Using index condition
Warnings:
Note  1003  /* select#1 */ select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c`,`test`.`t1`.`d` AS `d` from `test`.`t1` FORCE INDEX (`i_b`) where (`test`.`t1`.`b` in (1,2,3,4))
explain select * from t1  force index (i_b) where b in (1, 2, 3, 4, 5);
id  select_type table partitions  type  possible_keys key key_len ref rows  filtered  Extra
1 SIMPLE  t1  NULL  range i_b i_b 9 NULL  249850  100.00  Using index condition
Warnings:
Note  1003  /* select#1 */ select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c`,`test`.`t1`.`d` AS `d` from `test`.`t1` FORCE INDEX (`i_b`) where (`test`.`t1`.`b` in (1,2,3,4,5))
DROP TABLE t1;
```

Suggested fix:
Like server code  guess_rec_per_key, set rec_per_key to 1 or 10.

rec_per_key_t guess_rec_per_key(const TABLE *const table, const KEY *const key,
                                uint used_keyparts) {
  DBUG_ASSERT(used_keyparts >= 1);
  DBUG_ASSERT(used_keyparts <= key->actual_key_parts);
  DBUG_ASSERT(!key->has_records_per_key(used_keyparts - 1));

  const ha_rows table_rows = table->file->stats.records;

  /*
    Make an estimates for how many records the whole key will match.
    If there exists index statistics for the whole key we use this.
    If not, we assume the whole key matches ten records for a non-unique
    index and 1 record for a unique index.
  */
  rec_per_key_t rec_per_key_all;
  if (key->has_records_per_key(key->user_defined_key_parts - 1))
    rec_per_key_all = key->records_per_key(key->user_defined_key_parts - 1);
  else {
    if (key->actual_flags & HA_NOSAME)
      rec_per_key_all = 1.0f;  // Unique index
    else {
      rec_per_key_all = 10.0f;  // Non-unique index

}

Hi Mr. Yang,

Thank you for your bug report.

Let me see if I understood you correctly. You would like to see a change where number of records per key is 1 or 10 depending on whether index is unique or not.

However, in our optimiser, dives into indices are made which come up which much precise estimates then those. Hence, I do not see a need for this small change.

Also, you have seen this small performance problem in InnoDB SE, while the relevant part of the calculus is performed in the Optimiser.

Can you clear out your report further in order to make it more precise.

Thanks in advance.

Hi,

Only one more question.

With your patch, what were exactly the benefits that you have observed ??? Have you made any measurements ?? What was number of rows matched and total number of rows ????

Sorry that I do not express my point clearly.

This problem will cause optimizer choose full table scan rather than index read with small range.

As server read dict_table statistics from InnoDB not thread safe. It may read stat_n_rows is not zero and n_diff is zero.
When n_diff is zero,  the rec_per_key is stat_n_rows.
This will cause the index read records estimated by optimizer is very large(larger than the table records).

The optimizer will choose full table scan rather index.

explain select * from t1  where b in (1, 2, 3, 4);
id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	t1	NULL	ALL	i_b	NULL	NULL	NULL	49970	100.00	Using where

Hi Mr. Yang,

We still do not understand your report.

You know, it is irrelevant whether number of rows returned is one or 10. Depending on the average record width, it is much faster to scan rows than to search by index on the small tables. Which means if you have only 50 or 100 rows, in many cases scanning is much faster then index search.

Also, will you be so kind and prove to us that when server reads dict_table statistics from InnoDB , those reads are not thread safe. Please use 8.0.20 code base to prove your point.

I will try to summarize my understanding of this:

1. The server call handler::info_low() to get updated statistics. To update its rec_per_key statistics, it sets HA_STATUS_CONST flag. This happens, for example, when a new table object is created. This will access dict_index_t::stat_n_diff_key_vals without any form of thread synchronization. (see innodb_rec_per_key()).

2. When the background thread refreshes its statistics, it starts with setting dict_index_t::stat_n_diff_key_vals to 0. See dict_stats_empty_index() called from dict_stats_analyze_index().

3. In innodb_rec_per_key() if stat_n_diff_key_vals is 0, it will return the total number of records instead.

4. In other words, if a table object is opened during the recalculation of statistics, the rec_per_key for a column/index may be quite misleading. It will be interpreted as all rows have the same value, and the index will probably not be chosen for any non-covering scans.

5. Note also that there is no guarantee that the servers misleading rec_per_key data will be refreshed when the background thread has finished the recalculation. Unless an explicit ANALYZE TABLE or FLUSH TABLE is performed, a refresh basically depends on a new table object being created. If there are cached table objects available for reuse, this may not happen for quite some time.

6. The particular use case in this bug report, is a query with an IN expression with a long list of values. If the size of the list is greater than eq_range_index_dive_limit, rec_per_key statistics will be used instead of index dives. If the number of rows in the table is n_rows, and we have the scenario described above, the optimizer will think that there is n_rows per range, and prefer table scan instead.

7. The suggestion in this request is to use another default than n_rows when info_low() is called and stat_n_diff_key_vals is 0. The server will, if rec_per_key is not provided, use 1 for an unique index and 10 for a non-unique index (See guess_rec_per_key()). The proposal is to do the same when stat_n_diff_key_vals is (temporarily) 0.

Hi Øystein,

I think that you are correct and what you described is exactly my understanding.

However, as I wrote above, there is no proof that this change will render faster execution in all situations, when this part of code has to be executed.

No change will "render faster execution in all situations".  However, when there is lack of information, the optimizer will need to make some assumptions that can cover a large part of the cases.  Assuming that all rows have the same value, is not a good general assumption.

I think the best would be if InnoDB could just set the value REC_PER_KEY_UNKNOWN in such cases.  Then, the optimizer could decide itself how to deal with it.

HI  Øystein Grøvlen,

I agree with you.

However, I do not think that this is a bug, but a performance improvement report.

Verified as performance improvement.