Bug #107145 persistent statistics not update auto
Submitted: 27 Apr 2022 15:23 Modified: 12 May 2022 8:52
Reporter: peng gao Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.28 8.0.23 OS:Any
Assigned to: CPU Architecture:Any

[27 Apr 2022 15:23] peng gao
Description:
Hi :
  recently,we find a table have 2000W rows, but persistent statistics from  not update ,is 1500W from select * from innodb_table_stats
and last_update is 2021.. mysql  variables related to  statistics is default.
  
  in function row_update_statistics_if_needed condition 
  
  counter = table->stat_modified_counter++;
  n_rows = dict_table_get_n_rows(table);
  ..
  counter > n_rows / 10 /* 10% */
  
  
  but table->stat_modified_counter will reset to 0 after flush tables command or table share weed out .

  dict_stats_deinit will set table->stat_initialized = FALSE
  
#0  dict_stats_deinit (table=0x7fffb0408a50) at /newdata/mysql-8.0.23/storage/innobase/include/dict0stats.ic:177
#1  0x00000000053bb48d in dict_table_close (table=0x7fffb0408a50, dict_locked=0, try_drop=0) at /newdata/mysql-8.0.23/storage/innobase/dict/dict0dict.cc:546
#2  0x00000000053da1d6 in dd_table_close (table=0x7fffb0408a50, thd=0x0, mdl=0x0, dict_locked=false) at /newdata/mysql-8.0.23/storage/innobase/dict/dict0dd.cc:1043

  and dict_stats_update will set t->stat_modified_counter = 0;
  
#0  dict_stats_empty_table (table=0xb3d3ca0) at /newdata/mysql-8.0.23/storage/innobase/dict/dict0stats.cc:420
#1  0x000000000541c77b in dict_stats_fetch_from_ps (table=0xb3d3ca0) at /newdata/mysql-8.0.23/storage/innobase/dict/dict0stats.cc:2692
#2  0x000000000541cd5c in dict_stats_update (table=0xb368410, stats_upd_option=DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY) at /newdata/mysql-8.0.23/storage/innobase/dict/dict0stats.cc:2908
#3  0x0000000004f0fcdc in dict_stats_init (table=0xb368410) at /newdata/mysql-8.0.23/storage/innobase/include/dict0stats.ic:164
#4  0x0000000004f1e732 in ha_innobase::open (this=0xabdfb28, name=0xaca6390 "./test0228/mytest", open_flags=2, table_def=0xb3c4e28)
    at /newdata/mysql-8.0.23/storage/innobase/handler/ha_innodb.cc:6904
#5  0x0000000003b70ae8 in handler::ha_open (this=0xabdfb28, table_arg=0xb3c5cd0, name=0xaca6390 "./test0228/mytest", mode=2, test_if_locked=2, table_def=0xb3c4e28)
    at /newdata/mysql-8.0.23/sql/handler.cc:2802
    
can we save this stat_modified_counter in persistent statistics,if read this counter from persistent statistics if need?
thanks!!

How to repeat:
session 1:

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.28    |
+-----------+

mysql> create table test(id int);
Query OK, 0 rows affected (0.34 sec)

mysql> insert into test values(10);
Query OK, 1 row affected (0.00 sec)

mysql> insert into test select * from test;
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> insert into test select * from test;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into test select * from test;
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into test select * from test;
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> insert into test select * from test;
Query OK, 16 rows affected (0.01 sec)
Records: 16  Duplicates: 0  Warnings: 0

mysql> insert into test select * from test;
Query OK, 32 rows affected (0.00 sec)
Records: 32  Duplicates: 0  Warnings: 0

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|       64 |
+----------+
1 row in set (0.01 sec)

session 2:

mysql> select * from innodb_table_stats where table_name='test' and database_name='testup';
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| testup        | test       | 2022-04-27 23:08:11 |     64 |                    1 |                        0 |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
1 row in set (0.00 sec)

session 1:

every 5 rows insert then flush tables; 

mysql> insert into test values(10),(10),(10),(10),(10);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> flush tables;
Query OK, 0 rows affected (0.01 sec)

....

mysql> insert into test values(10),(10),(10),(10),(10);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> flush tables;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test values(10),(10),(10),(10),(10);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> flush tables;
Query OK, 0 rows affected (0.00 sec)

now we have 139 rows.

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
|      139 |
+----------+
1 row in set (0.01 sec)

session 2:

but statistics is not update still n_rows is 64, if the table is 2000w rows, 1/10 is 200W, if every day backup
data ,some backup use FTWRL , so statistics is not update auto.

mysql> select * from innodb_table_stats where table_name='test' and database_name='testup';
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| testup        | test       | 2022-04-27 23:08:11 |     64 |                    1 |                        0 |
+---------------+------------+---------------------+--------+----------------------+--------------------------+

session 1:

mysql> insert into test values(10),(10),(10),(10),(10);
Query OK, 5 rows affected (0.01 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into test values(10),(10),(10),(10),(10);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into test values(10),(10),(10),(10),(10);
Query OK, 5 rows affected (0.01 sec)
Records: 5  Duplicates: 0  Warnings: 0

session 2:

now the statistics is update normal.

mysql>  select * from mysql.innodb_table_stats where table_name='test' and database_name='testup';
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| testup        | test       | 2022-04-27 23:14:47 |    149 |                    1 |                        0 |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
1 row in set (0.00 sec)

Suggested fix:
can we save this stat_modified_counter in persistent statistics,if read this counter from persistent statistics if need?
thanks!!
[28 Apr 2022 2:19] huahua xu
You can recalculate exact statistics on the table by the command 'analyze table test'
[28 Apr 2022 9:52] peng gao
thanks huahua xu,I kown exec "analyze table " can recalculate,But I think it work auto is better.
[28 Apr 2022 13:36] MySQL Verification Team
HI Mr. gao,

Thank you for your bug report.

What you have reported is actually a feature request, so we have changed the severity to reflect true severity of your report.

However, we do not see how we could implement your feature request. First of all, FLUSH and other commands do have to set that field to 0.

Also, the recommendation that you make on the recalculation of the statistics after 10 % of the rows are changed is not feasible. It would demand lots of processing power without much advantage. Requests like yours have been made in the past and have not been accepted for the reasoning that we are expounding here.

Second, InnoDB is a MVCC storage engine, so there is no persistent statistics for the table. Simply, each session has its own view on the tables, including the number of rows.

Hence, we do not see how we can accomodate your request.
[28 Apr 2022 13:49] peng gao
thank you!
[9 May 2022 10:07] Øystein Grøvlen
Check http://oysteing.blogspot.com/2021/09/run-analyze-table-do-not-rely-on.html
[9 May 2022 10:32] peng gao
Thank you, I will read the article carefully.
[9 May 2022 12:15] MySQL Verification Team
Thanks ....

We agree that this article is very useful.
[12 May 2022 8:52] peng gao
Hi  Øystein Grøvlen

 I read your article carefully. In comments 
 "Thank you for a great article. One thing that you did not mention is that the FLUSH TABLES statement clears the table definition cache of the SQL layer (but not in InnoDB). In InnoDB, whenever the table reference count reaches 0, a flag will be cleared so that next time the table is going to be opened, the persistent statistics will be read from the tables. I understood that the intended use of this is to make the DBMS aware of manual changes to the InnoDB statistics tables." is my issue. 
thanks...