Bug #109542 Multiple different-looking crashes after upgrade to 8.0.30
Submitted: 6 Jan 2023 1:22 Modified: 13 Mar 2023 18:26
Reporter: Ryan Laughlin Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: JSON Severity:S2 (Serious)
Version:8.0.30 OS:Linux
Assigned to: CPU Architecture:Any

[6 Jan 2023 1:22] Ryan Laughlin
Description:
We've had three unexpected MySQL crashes since upgrading from 8.0.21 to 8.0.30 in late October 2022. Unfortunately the crash reports differ quite a bit, so we've struggled to diagnose the issue or find a related bug report, though it seems likely that there's some common root cause.

More info:

- The crash is intermittent, with several weeks between incidents. We have not been able to reproduce manually. Between crashes, this production DB instance has operated normally, processing 10-20K queries per second without issue.

- We are running on AWS RDS, so unfortunately we don't have full access to the instance. We have engaged with AWS support, and so far they assert that they have found no underlying issues with the hardware or OS.

- The only new-ish feature we are using as of MySQL 8.0.30 is a multi-valued index for a JSON column. Perhaps the crashes are related somehow?

Thanks in advance for any help!

Crash 1:

2022-11-07T15:58:06.397411Z 309771 [ERROR] [MY-013183] [InnoDB] Assertion failure: row0mysql.cc:1007:prebuilt->magic_n == ROW_PREBUILT_ALLOCATED thread 70438006490832
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
15:58:06 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x407b1c717800
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 4010205afe60 thread_stack 0x40000
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x30) [0x1f7bb30]
/rdsdbbin/mysql/bin/mysqld(print_fatal_signal(int)+0x27c) [0xfaa93c]
/rdsdbbin/mysql/bin/mysqld(my_server_abort()+0x98) [0xfaab18]
/rdsdbbin/mysql/bin/mysqld(my_abort()+0x14) [0x1f74694]
/rdsdbbin/mysql/bin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x27c) [0x22a5c7c]
/rdsdbbin/mysql/bin/mysqld(row_update_prebuilt_trx(row_prebuilt_t*, trx_t*)+0xb8) [0x21fcf18]
/rdsdbbin/mysql/bin/mysqld(ha_innobase::update_thd(THD*)+0x64) [0x20d9884]
/rdsdbbin/mysql/bin/mysqld(ha_innobase::external_lock(THD*, int)+0x24) [0x20e2f04]
/rdsdbbin/mysql/bin/mysqld(handler::ha_external_lock(THD*, int)+0x110) [0x10adeb0]
/rdsdbbin/mysql/bin/mysqld(mysql_lock_tables(THD*, TABLE**, unsigned long, unsigned int)+0x1b0) [0x125a4f0]
/rdsdbbin/mysql/bin/mysqld(lock_tables(THD*, TABLE_LIST*, unsigned int, unsigned int)+0x30c) [0xdbae8c]
/rdsdbbin/mysql/bin/mysqld(Sql_cmd_dml::execute(THD*)+0x578) [0xea7878]
/rdsdbbin/mysql/bin/mysqld(mysql_execute_command(THD*, bool)+0xf2c) [0xe5504c]
/rdsdbbin/mysql/bin/mysqld(dispatch_sql_command(THD*, Parser_state*)+0x340) [0xe580c0]
/rdsdbbin/mysql/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x1560) [0xe59b40]
/rdsdbbin/mysql/bin/mysqld(do_command(THD*)+0x1e0) [0xe5ae40]
/rdsdbbin/mysql/bin/mysqld() [0xf9cc88]
/rdsdbbin/mysql/bin/mysqld() [0x2518490]
/lib64/libpthread.so.0(+0x722c) [0x40002c85b22c]
/lib64/libc.so.6(+0xd46dc) [0x40002ca4f6dc]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (407b1c538730): SELECT `users`.* FROM `users` WHERE `users`.`id` = 35309001 LIMIT 1
Connection ID (thread ID): 309771
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Crash 2:

16:38:28 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x407b19f41800
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...

Crash 3:

22:20:45 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x407b5f0a3000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 407b6c43de60 thread_stack 0x40000
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x30) [0x1f7bb30]
/rdsdbbin/mysql/bin/mysqld(print_fatal_signal(int)+0x27c) [0xfaa93c]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0xc4) [0xfaaa64]
linux-vdso.so.1(__kernel_rt_sigreturn+0) [0x400018f2a668]
/rdsdbbin/mysql/bin/mysqld(row_sel_field_store_in_mysql_format_func(unsigned char*, mysql_row_templ_t const*, dict_index_t const*, unsigned char const*, unsigned long)+0x164) [0x2225a64]
/rdsdbbin/mysql/bin/mysqld(row_sel_store_mysql_rec(unsigned char*, row_prebuilt_t*, unsigned char const*, dtuple_t const*, bool, dict_index_t const*, dict_index_t const*, unsigned long const*, bool, lob::undo_vers_t*, mem_block_info_t*&)+0x19c) [0x2228e3c]
/rdsdbbin/mysql/bin/mysqld(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long)+0x1a90) [0x222e410]
/rdsdbbin/mysql/bin/mysqld(ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)+0x378) [0x20de778]
/rdsdbbin/mysql/bin/mysqld(handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)+0x3f8) [0x10aef98]
/rdsdbbin/mysql/bin/mysqld(handler::read_range_first(key_range const*, key_range const*, bool, bool)+0x60) [0x10afa20]
/rdsdbbin/mysql/bin/mysqld(ha_innobase::read_range_first(key_range const*, key_range const*, bool, bool)+0x1c) [0x20a955c]
/rdsdbbin/mysql/bin/mysqld(handler::multi_range_read_next(char**)+0x118) [0x10b02d8]
/rdsdbbin/mysql/bin/mysqld(handler::ha_multi_range_read_next(char**)+0x34) [0x10ac174]
/rdsdbbin/mysql/bin/mysqld(IndexRangeScanIterator::Read()+0x3c) [0x12b951c]
/rdsdbbin/mysql/bin/mysqld(IndexMergeIterator::Init()+0x130) [0x12b88f0]
/rdsdbbin/mysql/bin/mysqld(filesort(THD*, Filesort*, RowIterator*, unsigned long, unsigned long long, Filesort_info*, Sort_result*, unsigned long long*)+0x178) [0x1093e18]
/rdsdbbin/mysql/bin/mysqld(SortingIterator::DoSort()+0x54) [0x1200794]
/rdsdbbin/mysql/bin/mysqld(SortingIterator::Init()+0x1c) [0x120081c]
/rdsdbbin/mysql/bin/mysqld(LimitOffsetIterator::Init()+0x20) [0x135dd40]
/rdsdbbin/mysql/bin/mysqld(Query_expression::ExecuteIteratorQuery(THD*)+0x200) [0xf18160]
/rdsdbbin/mysql/bin/mysqld(Query_expression::execute(THD*)+0x30) [0xf185b0]
/rdsdbbin/mysql/bin/mysqld(Sql_cmd_dml::execute(THD*)+0x158) [0xea7458]
/rdsdbbin/mysql/bin/mysqld(mysql_execute_command(THD*, bool)+0xf2c) [0xe5504c]
/rdsdbbin/mysql/bin/mysqld(dispatch_sql_command(THD*, Parser_state*)+0x340) [0xe580c0]
/rdsdbbin/mysql/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x1560) [0xe59b40]
/rdsdbbin/mysql/bin/mysqld(do_command(THD*)+0x1e0) [0xe5ae40]
/rdsdbbin/mysql/bin/mysqld() [0xf9cc88]
/rdsdbbin/mysql/bin/mysqld() [0x2518490]
/lib64/libpthread.so.0(+0x722c) [0x400018fe622c]
/lib64/libc.so.6(+0xd46dc) [0x4000191da6dc]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (407ad76b1830): SELECT `fort_knox_funds_flows`.* FROM `fort_knox_funds_flows` WHERE (25830440 MEMBER OF(`fort_knox_funds_flows`.`money_movements` ->> "$[*].to_user_id") OR 25830440 MEMBER OF(`fort_knox_funds_flows`.`money_movements` ->> "$[*].from_user_id")) ORDER BY `fort_knox_funds_flows`.`created_at` DESC LIMIT 20
Connection ID (thread ID): 624306
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

How to repeat:
Unfortunately we have been unable to manually reproduce the issue. It has occurred intermittently in production (1-2 times/month). Between crashes, the DB instance has operated normally, processing 10-20K queries per second without issue.
[6 Jan 2023 10:48] Rafal Staniszewski
We're facing the same issue. Problems started after the upgrade to 8.0.30-22 (percona distribution). mysql.err is attached.
[6 Jan 2023 13:22] Max Irgiznov
Hi all.

Which monitoring of DB do you use? In our case, we disable(enabled only for node)  MySQL service monitoring in PMM and the crashes ended.
[6 Jan 2023 15:33] Ryan Laughlin
We use AWS RDS Performance Insights, but not PMM. We can explore disabling Performance Insights to see if it helps.

I'll upload a few files listing our performance_schema.setup% settings, just in case they're useful. I have no idea if that's the actual cause, but I can imagine how a bug in the Performance Schema might cause multiple crashes that don't bear much resemblance to each other.
[6 Jan 2023 15:57] Max Irgiznov
Hi Ryan.

I tried to disable PFS, but it's not helping.

Now in the testing process, the part related to innodb_monitor_disable=all, because is used in monitoring indicators. Everything looks good, but it takes another 3-4 days to stabilize the result.
[9 Jan 2023 12:50] MySQL Verification Team
Hi Mr. Laughlin,

Thank you very much for your bug report.

We have tried to upgrade from MySQL 8.0.21 to 8.0.31, with our test database, and it all went like a charm. There are several upgrade bugs that are fixed in 8.0.31, as described in the relevant Release Note.

Hence , we recommend you to read all Release notes from 8.0.22 to 8.0.31 for all incompatible changes. You might find a cause of your problems in those changes .......

Next, since we can not repeat the problems that you encounter, we need a fully repeatable test case, including the table / tables involved in those crashes. 

After we are able to repeat the behaviour, on 8.0.31 and 8.0.32 (when it is published), then we shall change the status of this report.
[12 Jan 2023 13:47] Ryan Laughlin
Makes total sense – it's obviously hard for the MySQL team to diagnose a bug without clear reproduction steps. If we're able to isolate a clear reproducible test case, we'll let you know immediately.

For anyone else stumbling across this bug report: our best guess right now is that this crash is related to our use of a multi-valued index on a JSON column, which is a feature that we only started using after this upgrade. Of our 4 total crashes, 2 have mentioned the following query (with different integer values):

SELECT `fort_knox_funds_flows`.* FROM `fort_knox_funds_flows` WHERE (47769504 MEMBER OF(`fort_knox_funds_flows`.`money_movements` ->> "$[*].to_user_id") OR 47769504 MEMBER OF(`fort_knox_funds_flows`.`money_movements` ->> "$[*].from_user_id"))

That table is defined as follows:

CREATE TABLE `fort_knox_funds_flows` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `money_movements` json NOT NULL,
  `enrollment_identifier` binary(16) NOT NULL,
  `created_at` datetime(6) NOT NULL,
  `updated_at` datetime(6) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_fort_knox_funds_flows_on_enrollment_identifier` (`enrollment_identifier`),
  KEY `index_fort_knox_funds_flows_on_updated_at` (`updated_at`),
  KEY `money_movement_from_user_ids` ((cast(json_unquote(json_extract(`money_movements`,'$[*].from_user_id')) as signed array))),
  KEY `money_movement_to_user_ids` ((cast(json_unquote(json_extract(`money_movements`,'$[*].to_user_id')) as signed array)))
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci

The `money_movements` JSON data has the following format:

[
  {
    "from_user_id": <integer>,
    "to_user_id": <integer>,
    // plus additional key-value pairs
  },
  // plus additional objects in the above format
]

We suspect that either the `money_movement_from_user_ids` index or the `money_movement_to_user_ids` index is causing the problem somehow. As a prospective fix, we are updating our DB schema and our application logic in order to eliminate the indexes.

--------

Below is the crash log of our 4th and latest crash, which looked somewhat similar to our 3rd crash previously reported above.

07:54:49 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x407b28bf4000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 407afa3d1e60 thread_stack 0x40000
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x30) [0x1f7bb30]
/rdsdbbin/mysql/bin/mysqld(print_fatal_signal(int)+0x27c) [0xfaa93c]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0xc4) [0xfaaa64]
linux-vdso.so.1(__kernel_rt_sigreturn+0) [0x4000092c3668]
/rdsdbbin/mysql/bin/mysqld(row_sel_field_store_in_mysql_format_func(unsigned char*, mysql_row_templ_t const*, dict_index_t const*, unsigned char const*, unsigned long)+0x160) [0x2225a60]
/rdsdbbin/mysql/bin/mysqld(row_sel_store_mysql_rec(unsigned char*, row_prebuilt_t*, unsigned char const*, dtuple_t const*, bool, dict_index_t const*, dict_index_t const*, unsigned long const*, bool, lob::undo_vers_t*, mem_block_info_t*&)+0x19c) [0x2228e3c]
/rdsdbbin/mysql/bin/mysqld(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long)+0x1a90) [0x222e410]
/rdsdbbin/mysql/bin/mysqld(ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)+0x378) [0x20de778]
/rdsdbbin/mysql/bin/mysqld(handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)+0x3f8) [0x10aef98]
/rdsdbbin/mysql/bin/mysqld(handler::read_range_first(key_range const*, key_range const*, bool, bool)+0x60) [0x10afa20]
/rdsdbbin/mysql/bin/mysqld(ha_innobase::read_range_first(key_range const*, key_range const*, bool, bool)+0x1c) [0x20a955c]
/rdsdbbin/mysql/bin/mysqld(handler::multi_range_read_next(char**)+0x118) [0x10b02d8]
/rdsdbbin/mysql/bin/mysqld(handler::ha_multi_range_read_next(char**)+0x34) [0x10ac174]
/rdsdbbin/mysql/bin/mysqld(IndexRangeScanIterator::Read()+0x3c) [0x12b951c]
/rdsdbbin/mysql/bin/mysqld(IndexMergeIterator::Init()+0x130) [0x12b88f0]
/rdsdbbin/mysql/bin/mysqld(Query_expression::ExecuteIteratorQuery(THD*)+0x200) [0xf18160]
/rdsdbbin/mysql/bin/mysqld(Query_expression::execute(THD*)+0x30) [0xf185b0]
/rdsdbbin/mysql/bin/mysqld(Sql_cmd_dml::execute(THD*)+0x158) [0xea7458]
/rdsdbbin/mysql/bin/mysqld(mysql_execute_command(THD*, bool)+0xf2c) [0xe5504c]
/rdsdbbin/mysql/bin/mysqld(dispatch_sql_command(THD*, Parser_state*)+0x340) [0xe580c0]
/rdsdbbin/mysql/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x1560) [0xe59b40]
/rdsdbbin/mysql/bin/mysqld(do_command(THD*)+0x1e0) [0xe5ae40]
/rdsdbbin/mysql/bin/mysqld() [0xf9cc88]
/rdsdbbin/mysql/bin/mysqld() [0x2518490]
/lib64/libpthread.so.0(+0x722c) [0x40000937f22c]
/lib64/libc.so.6(+0xd46dc) [0x4000095736dc]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (407af7828030): /*context:Api::V3Controller#get_main_data dyno:web.4*/ SELECT `fort_knox_funds_flows`.* FROM `fort_knox_funds_flows` WHERE (47769504 MEMBER OF(`fort_knox_funds_flows`.`money_movements` ->> "$[*].to_user_id") OR 47769504 MEMBER OF(`fort_knox_funds_flows`.`money_movements` ->> "$[*].from_user_id"))
Connection ID (thread ID): 683209
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[12 Jan 2023 14:00] MySQL Verification Team
HI,

Do note that your definition of the multi-valued index on JSON is incorrect.

This is described in our Reference Manual.
[12 Jan 2023 14:21] Ryan Laughlin
The table definition for `fort_knox_funds_flows` posted above is the output of the `SHOW CREATE TABLE` command, which prints out the multi-valued index in a different format than it was originally created. The original commands to create the indexes match the documentation correctly, as best I can tell:

CREATE INDEX money_movement_from_user_ids ON fort_knox_funds_flows ( (CAST(money_movements->>'$[*].from_user_id' AS SIGNED ARRAY)) );

CREATE INDEX money_movement_to_user_ids ON fort_knox_funds_flows ( (CAST(money_movements->>'$[*].to_user_id' AS SIGNED ARRAY)) );

We observed this behavior at the time and thought it was unusual.

Or are you referring to the usage of '$[*].to_user_id', as opposed to something like '$.to_user_ids'?
[12 Jan 2023 14:27] MySQL Verification Team
Hi,

Just try to define the indices  as described in our Reference Manual.

Also try 8.0.31 or 8.0.32 (soon to be out), since those have many bugs fixed in that area.
[8 Feb 2023 9:09] Rafal Staniszewski
We've upgraded 8.0.30 (percona ed.) to 8.0.32 (oracle ed.) around 2 weeks ago. Since the upgrade our database is stable. We also don't see any negative side effects.
[8 Feb 2023 13:01] MySQL Verification Team
Hi,

We are very happy to hear the good news.
[13 Mar 2023 18:26] Ryan Laughlin
For posterity (or for anyone else who finds this thread in the future): removing the multi-valued JSON index does appear to have solved our crash. We removed the index at the end of January, and have continued to run MySQL 8.0.30 without any additional crashes or restarts since then.

Sorry that we were unable to provide more specific repro steps, but hopefully this may help someone else in their debugging in the future!
[14 Mar 2023 12:46] MySQL Verification Team
There is no need to apologise for anything ......
[12 Feb 16:09] Saverio Miroddi
For everybody reading this bug, do *not* upgrade MySQL. Recent versions of MySQL have a critical bug that can cause frequent crashes (it happened to us). MVI indexes must not be used at all, until their stability has been proven for a long enough time; currently, they have very serious bugs.

@Ryan I have a couple of questions. We've had a very similar error (actually two, but one was reproducible), so I'm trying to restrict the scope if possible.

1. are you running on an ARM (Graviton) RDS instance? if the bug you've encountered is the same as ours, this may be a bug of the ARM version of MySQL.

2. are you able to inspect the logs and check if an error "CORRUPTION_IN_INNODB_TABLESPACE MYSQL_CORRUPTION_DETECTED HostManager" is logged corresponding to the time of the crash?

My conclusions until now are:

- the bug has been introduced between 8.0.28 and 8.0.30.

- the bug is related to MVI indexes

- the bug may be caused by a pointer-related issue that corrupts the tablespace

- the bug may be present only in the AWS ARM port of MySQL.
[19 Feb 11:18] MySQL Verification Team
Hi,

This is just to inform you that we have found four different internal bug reports that look similar to yours, but not exactly the same.

It is not yet known when these bugs will be scheduled, but since they are internal-only, you can not even view them.

It is also possible that all these bugs are just similar in the stacktrace, so that fixing them does not mean that it would fix this problem.

In any case, we can not do anything without a fully repeatable test case.