Bug #90733 TheMySQL server has hanged up when query on the sys.innodb_buffer_stats_by_table
Submitted: 3 May 2018 9:37 Modified: 11 Jun 2018 14:19
Reporter: sudheer gadipathi Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.7.21 OS:Any
Assigned to: CPU Architecture:Any
Tags: innodb_buffer_stats_by_table

[3 May 2018 9:37] sudheer gadipathi
Description:
The MySQL server has hanged up when query on  sys.innodb_buffer_stats_by_table.

We have 5 similar databases and we work like to find out 4 table's buffer pool utilisation using the following query. 

 SELECT sum(pages)  FROM sys.innodb_buffer_stats_by_table innodb_buffer_stats_by_table  where  
 object_name like 'table1%' or  
 object_name like 'table2%' or 
 object_name like 'table3%' or 
 object_name like 'table4%' 
 where object_schema= database();

we have fired the query parallel on 5 databases then we did not observe response with in 2 minutes  then we killed them. After in process list these queries are in killed state and other query are not running and hanged up and not responding to application.   
 

How to repeat:
If you run the following query in parallel  more then 5 database on high load server then try to kill them. 

 SELECT sum(pages)  FROM sys.innodb_buffer_stats_by_table innodb_buffer_stats_by_table  where  
 object_name like 'table1%' or  
 object_name like 'table2%' or 
 object_name like 'table3%' or 
 object_name like 'table4%' 
 where object_schema= database();
[3 May 2018 9:51] sudheer gadipathi
also we can repeat this issue using this query

SELECT sum(pages)  FROM sys.innodb_buffer_stats_by_table innodb_buffer_stats_by_table  where  object_name like '%bodies%'
[3 May 2018 13:56] MySQL Verification Team
probably not a bug, but it should at least be documented similarly to 

https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-page-table.html

"Querying the INNODB_BUFFER_PAGE table can introduce significant performance overhead. Do not query this table on a production system unless you are aware of the performance impact that your query may have, and have determined it to be acceptable. To avoid impacting performance, reproduce the issue you want to investigate on a test instance and query the INNODB_BUFFER_PAGE table on the test instance."
[11 May 2018 11:52] MySQL Verification Team
Hi,

Thank you for your bug report.

I have tried to repeat it with my tables and schemas and I encountered no problems. Can you tell us how many schemas do you have , how many tables do you have, what sizes, what Storage Engine and other info which would help us reproduce this.

We also need your configuration.

Right now, we are not able to reproduce it.
[14 May 2018 8:34] sudheer gadipathi
We have around 200 tables and each table has 180 hash partitions. I saw other locked threads waited for "Opening file" state.  If you generate huge load and try to fire query on innodb_buffer_stats_by_table table may help to reproduce this issue. 
Evan "kill" did not help for this case.
[14 May 2018 12:39] MySQL Verification Team
Hi,

Are these native InnoDB partitions, or partitions created by older versions, like 5.6. 

Second, how many schemas have you got and how many tables, at average in each schema.

Also, send us the structure of the most common partition table that you have. SHOW CREATE TABLE, would be fine .....
[15 May 2018 10:33] sudheer gadipathi
We have migrated our databases from MySQL 5.6 to MySQL 5.7 in last Dec 2017. we have 5 databases on each MySQL server all databases structures are identical. In each database, we have 200+ tables and each table has 180 hash partitions 

One of the high usage tables is

CREATE TABLE `table1` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `col2` longtext COLLATE utf8_unicode_ci,
  `col3` bigint(20) DEFAULT NULL,
  `col4` bigint(20) DEFAULT NULL,
  `col55` bigint(20) DEFAULT '1',
  `col6` tinyint(1) DEFAULT '0',
  `col42` int(11) DEFAULT '0',
  `col12` tinyint(1) DEFAULT '0',
  `col13` tinyint(1) DEFAULT '0',
  `col9` datetime DEFAULT NULL,
  `col10` datetime DEFAULT NULL,
  `col41` tinyint(1) DEFAULT '0',
  `col5` bigint(20) NOT NULL DEFAULT '0',
  `col16` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `col7` bigint(20) DEFAULT NULL,
  `col15` bigint(20) DEFAULT NULL,
  `col11` bigint(20) DEFAULT NULL,
  `col14` datetime DEFAULT NULL,
  `col17` datetime DEFAULT NULL,
  `col18` tinyint(1) DEFAULT '0',
  `col19` bigint(20) DEFAULT '1',
  `col20` tinyint(1) DEFAULT '0',
  `col21` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `col22` bigint(20) DEFAULT NULL,
  `col23` text COLLATE utf8_unicode_ci,
  `col24` tinyint(1) NOT NULL DEFAULT '1',
  `col8` bigint(20) DEFAULT NULL,
  `col25` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `col26` longtext COLLATE utf8_unicode_ci,
  `col27` bigint(20) DEFAULT NULL,
  `col28` int(11) DEFAULT '0',
  `col29` bigint(20) DEFAULT NULL,
  `col30` bigint(20) DEFAULT NULL,
  `col31` bigint(20) DEFAULT NULL,
  `col32` bigint(20) DEFAULT NULL,
  `col33` bigint(20) DEFAULT NULL,
  `col34` bigint(20) DEFAULT NULL,
  `col35` bigint(20) DEFAULT NULL,
  `col36` bigint(20) DEFAULT NULL,
  `col37` bigint(20) DEFAULT NULL,
  `col38` bigint(20) DEFAULT NULL,
  `col39` bigint(20) DEFAULT NULL,
  `col40` int(11) DEFAULT '0',
  PRIMARY KEY (`id`,`col5`),
  UNIQUE KEY `index_table1_on_col5_and_col7` (`col5`,`col7`),
  UNIQUE KEY `index_table1_on_col5_and_col8` (`col5`,`col8`),
  KEY `index_table1_on_col5_and_col9_and_id` (`col5`,`col9`,`id`),
  KEY `index_table1_on_col5_and_col14_and_id` (`col5`,`col14`,`id`),
  KEY `index_table1_on_col5_and_col10_and_id` (`col5`,`col10`,`id`),
  KEY `index_table1_col5_and_col5` (`col5`,`col55`),
  KEY `index_col5_and_col4_and_col5_col9` (`col5`,`col4`,`col55`,`col9`),
  KEY `index_col5_and_col4_and_col9` (`col5`,`col4`,`col9`),
  KEY `index_col5_col3_col10` (`col5`,`col3`,`col10`),
  KEY `index_table1_on_col5_and_col11` (`col5`,`col11`),
  KEY `idx__col5_col5_col12_col13_col11` (`col5`,`col55`,`col12`,`col13`,`col11`)
) ENGINE=InnoDB AUTO_INCREMENT=13049965771 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
/*!50100 PARTITION BY HASH (col5)
PARTITIONS 180 */
[15 May 2018 14:13] MySQL Verification Team
Hi,

I agree with my colleague Shane Bester.

Your query has to do so much work in order to return the result. Your query has to go through 40.000 partition, using non-mergable views from the SYS tables (thus needing a large amount of temporary tables), scanning all info. 

For this query to finish faster, you would have to have latest Intel processor at high frequency and SSD as an external storage. 

Besides, as this query is only doing SELECTs, it can use only one core or hyper-thread.

However, this all has to be documented properly.

Verified as a documentation bug.
[15 May 2018 14:26] sudheer gadipathi
I agree with you but my worry is not taking more time to complete. The Query is not killed after fired "KILL". I have fired kill command after 3 sec of the query execution started but it did not kill and causes a production problem.  We are using AWS RDS with SSD disks. It is not normal behavior to create  temptable 

I feel there is a bug.

Is kill command responding in your test case?
Why is kill not responding? 
But in MySQL document, we missed this details in "Warning area".
[15 May 2018 16:27] MySQL Verification Team
No, it is not a bug.

I could not kill my query either ..... After 5 minutes, I gave up .....

This is expected behaviour. Query is killable only when it comes into state where there are no temporary tables or temporary files, where memory is released and garbage collection doable.

Hence checks for the signal kill status are done only in the places where a query can be killed safely.

But, this should also be documented.
[11 Jun 2018 14:19] Daniel Price
Posted by developer:
 
The following warning was added to these sys Schema pages:
https://dev.mysql.com/doc/refman/5.7/en/sys-innodb-buffer-stats-by-table.html
https://dev.mysql.com/doc/refman/5.7/en/sys-innodb-buffer-stats-by-schema.html
 
"Querying views that access the INNODB_BUFFER_PAGE table can affect
performance. Do not query these views on a production system unless you
are aware of the performance impact and have determined it to be
acceptable. To avoid impacting performance on a production system,
reproduce the issue you want to investigate and query buffer pool
statistics on a test instance."

The query noted in the bug report uses SUM(), which is a aggregate (GROUP BY) function. 
The KILL Syntax documentation states the following:
 
"When you use KILL, a thread-specific kill flag is set for the thread. In most cases, it might take some time for the thread to die because the kill flag is checked only at specific intervals:

During SELECT operations, for ORDER BY and GROUP BY loops, the flag is checked after reading a block of rows."

https://dev.mysql.com/doc/refman/5.7/en/kill.html
 
Changes should appear online soon.

Thank you for the bug report.