MySQL Bugs: #111420: DB instance restarted with AWS RDS database

Bug #111420	DB instance restarted with AWS RDS database
Submitted:	14 Jun 2023 12:37	Modified:	14 Jun 2023 13:25
Reporter:	Armand Chabot	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server	Severity:	S2 (Serious)
Version:	8.0.32	OS:	Ubuntu
Assigned to:		CPU Architecture:	Other

Description:
I can see the events below and noticed that the DB instance was restarted several times.

2023-06-13 20:38:04 UTC DB instance restarted
2023-06-13 20:08:04 UTC DB instance restarted
2023-06-13 19:37:18 UTC DB instance restarted
2023-06-13 19:30:04 UTC DB instance restarted
2023-06-13 19:28:48 UTC DB instance restarted
2023-06-13 12:59:30 UTC DB instance restarted
2023-06-13 12:58:30 UTC DB instance restarted
2023-06-13 12:57:48 UTC DB instance restarted

Based on the CloudWatch metrics and Enhanced Monitoring metrics at the above timestamps, I did not observe any heavy load on the RDS instance that could have led to the DB restarts.

Diving deeper into the issue, I reviewed the logs you attached and observed the following:

mysqld got signal 11 ;

Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x14cbb3a05000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 14e3b8c84360 thread_stack 0x40000
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x2120dae]
/rdsdbbin/mysql/bin/mysqld(print_fatal_signal(int)+0x343) [0x1067273]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0xa5) [0x1067325]
/lib64/libpthread.so.0(+0x118e0) [0x14e51c7978e0]
/rdsdbbin/mysql/bin/mysqld(JOIN::get_best_combination()+0x1e8) [0xeea7e8]
/rdsdbbin/mysql/bin/mysqld(JOIN::make_join_plan()+0x5e9) [0xefdae9]
/rdsdbbin/mysql/bin/mysqld(JOIN::optimize(bool)+0xf9d) [0xeffdfd]
/rdsdbbin/mysql/bin/mysqld(Query_block::optimize(THD*, bool)+0xae) [0xf60b7e]
/rdsdbbin/mysql/bin/mysqld(Query_expression::optimize(THD*, TABLE*, bool, bool)+0x9e) [0xfd87be]
/rdsdbbin/mysql/bin/mysqld(Sql_cmd_dml::execute_inner(THD*)+0x30) [0xf5f9d0]
/rdsdbbin/mysql/bin/mysqld(Sql_cmd_dml::execute(THD*)+0x16e) [0xf5f11e]
/rdsdbbin/mysql/bin/mysqld(mysql_execute_command(THD*, bool)+0xb6e) [0xf0906e]
/rdsdbbin/mysql/bin/mysqld(dispatch_sql_command(THD*, Parser_state*)+0x494) [0xf0cde4]
/rdsdbbin/mysql/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0xed5) [0xf0e255]
/rdsdbbin/mysql/bin/mysqld(do_command(THD*)+0x227) [0xf105c7]
/rdsdbbin/mysql/bin/mysqld() [0x1058640]
/rdsdbbin/mysql/bin/mysqld() [0x26e8b25]
/lib64/libpthread.so.0(+0x744b) [0x14e51c78d44b]
/lib64/libc.so.6(clone+0x3f) [0x14e51bf7252f]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (14cbb3a682f0): SELECT `e`.*, `cat_index`.`position` AS `cat_index_position`, `perm`.`grant_catalog_category_view`, `perm`.`grant_catalog_product_price`, `perm`.`grant_checkout_items`, `price_index`.`price`, `price_index`.`tax_class_id`, `price_index`.`final_price`, IF(price_index.tier_price IS NOT NULL, LEAST(price_index.min_price, price_index.tier_price), price_index.min_price) AS `minimal_price`, `price_index`.`min_price`, `price_index`.`max_price`, `price_index`.`tier_price`, if (wh_item.current IS NOT NULL, wh_item.current, wh_item.total) AS `is_salable`, `shared_product`.`customer_group_id` FROM `catalog_product_entity` AS `e` INNER JOIN `catalog_category_product_index_store1` AS `cat_index` ON cat_index.product_id=e.entity_id AND cat_index.store_id=1 AND cat_index.visibility IN(3, 2, 4) AND cat_index.category_id=2 LEFT JOIN `magento_catalogpermissions_index_product` AS `perm` ON perm.customer_group_id = '1' AND perm.product_id = cat_index.product_id AND perm.store_id = 1 INNER JOIN `catalog_product_index_price` AS 
Connection ID (thread ID): 310
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.",prd-sm-m24-rds

The above messages appear in the log at 2023-06-13T19:29:25Z, 2023-06-13T19:36:47Z and 2023-06-13T20:07:33Z, each 30 to 40 seconds before a DB restart event (19:30:04, 19:37:18 and 20:08:04 UTC respectively).
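The correlation between the crash messages and the restart events can be checked mechanically. The following is a minimal sketch that uses only the timestamps quoted in this report; the variable and function names are illustrative and not part of any AWS or MySQL tooling.

```python
from datetime import datetime, timezone

# "DB instance restarted" events as listed in the RDS event log (UTC).
RESTART_EVENTS = [
    "2023-06-13 20:38:04", "2023-06-13 20:08:04", "2023-06-13 19:37:18",
    "2023-06-13 19:30:04", "2023-06-13 19:28:48", "2023-06-13 12:59:30",
    "2023-06-13 12:58:30", "2023-06-13 12:57:48",
]

# Times at which "mysqld got signal 11" appears in the error log.
CRASH_MESSAGES = [
    "2023-06-13T19:29:25Z", "2023-06-13T19:36:47Z", "2023-06-13T20:07:33Z",
]


def parse_utc(text, fmt):
    """Parse a timestamp string and mark it as UTC."""
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)


restarts = sorted(parse_utc(t, "%Y-%m-%d %H:%M:%S") for t in RESTART_EVENTS)
crashes = [parse_utc(t, "%Y-%m-%dT%H:%M:%SZ") for t in CRASH_MESSAGES]

# For each crash message, find the first restart event that follows it
# and record the gap in seconds.
delays = []
for crash in crashes:
    next_restart = next(r for r in restarts if r > crash)
    delays.append(int((next_restart - crash).total_seconds()))
    print(f"{crash:%H:%M:%S} crash -> {next_restart:%H:%M:%S} restart "
          f"({delays[-1]} s later)")
```

Each crash message is followed by a restart event within about 40 seconds, which is consistent with the restarts being crash recoveries rather than load-related failovers.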

The 'mysqld got signal 11 ;' message indicates that the mysqld process crashed with a segmentation fault (signal 11 is SIGSEGV). You are running the community edition of MySQL 8.0.32, and it is possible that you are hitting a bug in this version. Please be informed that AWS RDS uses the community version of MySQL without any explicit modifications.
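To see where in the server the fault occurred, the backtrace can be reduced to the first frame below the signal handler. The sketch below assumes the frame layout shown in the log above (`binary(symbol+offset) [address]`); the regular expression and helper name are illustrative, and only the first seven frames of the trace are included.

```python
import re

# First frames of the backtrace, copied from the crash log.
BACKTRACE = """\
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x2120dae]
/rdsdbbin/mysql/bin/mysqld(print_fatal_signal(int)+0x343) [0x1067273]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0xa5) [0x1067325]
/lib64/libpthread.so.0(+0x118e0) [0x14e51c7978e0]
/rdsdbbin/mysql/bin/mysqld(JOIN::get_best_combination()+0x1e8) [0xeea7e8]
/rdsdbbin/mysql/bin/mysqld(JOIN::make_join_plan()+0x5e9) [0xefdae9]
/rdsdbbin/mysql/bin/mysqld(JOIN::optimize(bool)+0xf9d) [0xeffdfd]
"""

# binary(symbol+0xoffset) [0xaddress]; the symbol may be empty.
FRAME = re.compile(
    r"^(?P<binary>[^(]+)\((?P<symbol>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$"
)


def faulting_frame(backtrace):
    """Return the first named frame after handle_fatal_signal, or None."""
    in_handler = True
    for line in backtrace.splitlines():
        match = FRAME.match(line.strip())
        if match is None:
            continue  # frame without a symbol+offset part
        symbol = match.group("symbol")
        if in_handler:
            # Skip the stack-printing and signal-handling frames.
            if symbol.startswith("handle_fatal_signal"):
                in_handler = False
            continue
        if symbol:  # skip the unnamed libpthread trampoline frame
            return symbol
    return None


print(faulting_frame(BACKTRACE))
```

For this trace the helper returns `JOIN::get_best_combination()`, which is called from `JOIN::make_join_plan()` and `JOIN::optimize(bool)`, so the crash happens while the query is being optimized rather than while rows are being read.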

Also, the above error messages indicate that the below query is responsible for crashing the database.

SQL query -
------------

SELECT `e`.*, `cat_index`.`position` AS `cat_index_position`, `perm`.`grant_catalog_category_view`, `perm`.`grant_catalog_product_price`, `perm`.`grant_checkout_items`, `price_index`.`price`, `price_index`.`tax_class_id`, `price_index`.`final_price`, IF(price_index.tier_price IS NOT NULL, LEAST(price_index.min_price, price_index.tier_price), price_index.min_price) AS `minimal_price`, `price_index`.`min_price`, `price_index`.`max_price`, `price_index`.`tier_price`, if (wh_item.current IS NOT NULL, wh_item.current, wh_item.total) AS `is_salable`, `shared_product`.`customer_group_id` FROM `catalog_product_entity` AS `e` INNER JOIN `catalog_category_product_index_store1` AS `cat_index` ON cat_index.product_id=e.entity_id AND cat_index.store_id=1 AND cat_index.visibility IN(3, 2, 4) AND cat_index.category_id=2 LEFT JOIN `magento_catalogpermissions_index_product` AS `perm` ON perm.customer_group_id = '1' AND perm.product_id = cat_index.product_id AND perm.store_id = 1 INNER JOIN `catalog_product_index_price` AS 

How to repeat:
It occurs whenever our website gets busy. Our website currently runs this version, and this has been occurring for a few months now. We also downgraded to v8.0.28 and the issue stopped occurring. AWS Support recommended upgrading the RDS instance to one with 128GB of memory (previously 64GB), hoping this was the issue; however, it is still occurring.

It appears to be a bug in this version. Please advise.

Hi Mr. Chabot,

Thank you for your bug report.

However, we cannot repeat it.

We see the query that crashes the server. However, we cannot run the query without all of the tables involved in it, and we have not received a dump of your tables.

When we get all that we need, we shall try to repeat it with 8.0.33 and our production binary.
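As a rough sketch of what such a dump would need to cover, the table names can be pulled from the part of the query that survived in the crash log and turned into a `mysqldump` invocation. The database name `magento` is a placeholder assumption, as are the helper names. Because the log truncates the query, the tables behind the `wh_item` and `shared_product` aliases cannot be recovered this way and would have to be taken from the full query.

```python
import re

# FROM clause onward, as far as the crash log recorded it (truncated).
QUERY_TAIL = (
    "FROM `catalog_product_entity` AS `e` "
    "INNER JOIN `catalog_category_product_index_store1` AS `cat_index` "
    "ON cat_index.product_id=e.entity_id AND cat_index.store_id=1 "
    "AND cat_index.visibility IN(3, 2, 4) AND cat_index.category_id=2 "
    "LEFT JOIN `magento_catalogpermissions_index_product` AS `perm` "
    "ON perm.customer_group_id = '1' "
    "AND perm.product_id = cat_index.product_id AND perm.store_id = 1 "
    "INNER JOIN `catalog_product_index_price` AS "
)


def referenced_tables(sql):
    """Return backtick-quoted table names after FROM/JOIN, in order."""
    seen = []
    for name in re.findall(r"\b(?:FROM|JOIN)\s+`([^`]+)`", sql):
        if name not in seen:
            seen.append(name)
    return seen


def dump_command(database, tables):
    """Build a mysqldump argument list for the given tables."""
    return ["mysqldump", "--single-transaction", database, *tables]


tables = referenced_tables(QUERY_TAIL)
# "magento" is a hypothetical database name used for illustration only.
print(" ".join(dump_command("magento", tables)))
```

Connection options (host, user, password) are omitted on purpose; they depend on the RDS endpoint and credentials in use.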

Can't repeat.