MySQL Bugs: #112714: Assertion failure: clone0api.cc:1159:ib::fatal in cloning when undo_x

Bug #112714	Assertion failure: clone0api.cc:1159:ib::fatal in cloning when undo_x_trunc.log
Submitted:	13 Oct 2023 12:45	Modified:	13 Oct 2023 18:11
Reporter:	Przemyslaw Malkowski	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: Clone Plugin	Severity:	S3 (Non-critical)
Version:	8.0.34	OS:	Any
Assigned to:	MySQL Verification Team	CPU Architecture:	Any
Tags:	clone plugin

Description:
When, for any reason, there is any undo space truncate log left on the receiver side, the clone plugin crashes with:

2023-10-13T12:14:21.726667Z 1 [Note] [MY-012902] [InnoDB] Undo tablespace number 1 was being truncated when mysqld quit.
2023-10-13T12:14:21.726776Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'.
2023-10-13T12:14:21.726803Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'.
2023-10-13T12:14:21.727405Z 1 [Note] [MY-013040] [InnoDB] Will create 1 new undo tablespaces.
2023-10-13T12:14:21.727419Z 1 [Note] [MY-012910] [InnoDB] Opened 1 existing undo tablespaces.
2023-10-13T12:14:21.727444Z 1 [Note] [MY-012915] [InnoDB] Created 1 undo tablespaces.
...
2023-10-13T12:14:21.785064Z 1 [ERROR] [MY-011976] [InnoDB] [FATAL] Clone File Roll Forward: Invalid File State: 10
2023-10-13T12:14:21.785078Z 1 [ERROR] [MY-013183] [InnoDB] Assertion failure: clone0api.cc:1109:ib::fatal triggered thread 139997270882048
...
2023-10-13T12:14:22.196241Z 0 [System] [MY-010116] [Server] /opt/mysql/8.0.34/bin/mysqld (mysqld 8.0.34) starting as process 2396808
2
...
2023-10-13T12:14:22.212119Z 1 [ERROR] [MY-011976] [InnoDB] [FATAL] Clone File Roll Back: Invalid File State: 10
2023-10-13T12:14:22.212136Z 1 [ERROR] [MY-013183] [InnoDB] Assertion failure: clone0api.cc:1159:ib::fatal triggered thread 139877242709760
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
2023-10-13T12:14:22Z UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
BuildID[sha1]=5b2d1735b5a07faf6ab972dffe3353817e2712d0
Thread pointer: 0x884f0a0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f37b55d1ba0 thread_stack 0x100000
/opt/mysql/8.0.34/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2e) [0x207beee]
/opt/mysql/8.0.34/bin/mysqld(print_fatal_signal(int)+0x35f) [0xfc353f]
/opt/mysql/8.0.34/bin/mysqld(my_server_abort()+0x6e) [0xfc367e]
/opt/mysql/8.0.34/bin/mysqld(my_abort()+0xa) [0x20765da]
/opt/mysql/8.0.34/bin/mysqld(ut_dbg_assertion_failed(char const*, char const*, unsigned long)+0x312) [0x2285ba2]
/opt/mysql/8.0.34/bin/mysqld() [0x2288382]
/opt/mysql/8.0.34/bin/mysqld() [0x230f947]
/opt/mysql/8.0.34/bin/mysqld() [0x2310e03]
/opt/mysql/8.0.34/bin/mysqld() [0x2311016]
/opt/mysql/8.0.34/bin/mysqld(clone_files_error()+0x11e) [0x23111be]
/opt/mysql/8.0.34/bin/mysqld(clone_files_recovery(bool)+0x905) [0x2312695]
/opt/mysql/8.0.34/bin/mysqld(clone_init()+0x3f) [0x23129bf]
/opt/mysql/8.0.34/bin/mysqld(srv_start(bool)+0x515) [0x2242825]
/opt/mysql/8.0.34/bin/mysqld() [0x20bf0d5]
/opt/mysql/8.0.34/bin/mysqld(dd::bootstrap::DDSE_dict_init(THD*, dict_init_mode_t, unsigned int)+0x7c) [0x1e2ed6c]
/opt/mysql/8.0.34/bin/mysqld(dd::upgrade_57::do_pre_checks_and_initialize_dd(THD*)+0x190) [0x204fe60]
/opt/mysql/8.0.34/bin/mysqld() [0x107ff46]
/opt/mysql/8.0.34/bin/mysqld() [0x26e7775]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f37c2dd2609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f37c200f133]

Older MySQL 8.0.26 is failing a bit differently btw:

2023-10-13T11:49:35.345592Z 1 [ERROR] [MY-011976] [InnoDB] [FATAL] Clone File Roll Forward: Invalid File State: 10
2023-10-13T11:49:35.345780Z 1 [ERROR] [MY-013183] [InnoDB] Assertion failure: ut0ut.cc:557 thread 140464858920704

How to repeat:
I am not quite sure how to trigger a situation when the truncation logs are not removed, but when such happens at any point, the clone plugin stops working.
To reproduce, just copy such an example undo_1_trunc.log file to the receiver's datadir and then try to run the clone operation.

In my receiver's case:
$ ls -lh rsandbox_8_0_34/node1/data/undo*
-rw-r----- 1 przemek przemek 16M Oct 13 14:14 rsandbox_8_0_34/node1/data/undo_001
-rw-r----- 1 przemek przemek 16M Oct 13 14:14 rsandbox_8_0_34/node1/data/undo_002
-rw-r----- 1 przemek przemek 16K Oct 13 14:12 rsandbox_8_0_34/node1/data/undo_1_trunc.log

And after the failed clone attempt:
$ ls -lh rsandbox_8_0_34/node1/data/undo*
-rw-r----- 1 przemek przemek 16M Oct 13 14:14 rsandbox_8_0_34/node1/data/undo_001.#clone_save
-rw-r----- 1 przemek przemek 16M Oct 13 14:14 rsandbox_8_0_34/node1/data/undo_002
-rw-r----- 1 przemek przemek 16M Oct 13 14:14 rsandbox_8_0_34/node1/data/undo_002.#clone_save
-rw-r----- 1 przemek przemek 16K Oct 13 14:13 rsandbox_8_0_34/node1/data/undo_1_trunc.log

Suggested fix:
I don't see why some leftover files on the receiver side should break the restoring data capability via the clone plugin. The receiver's data are all removed anyway.

HI Mr. Malkowski,

Thank you for your bug report.

However we are not able to repeat it.

We have  tried clone operation and it worked just fine. We also do not have any similar log files that remained.

Hence, please provide step by step instructions on how to repeat the test case.

Can't repeat.

Hi MySQL Verification Team!

In order to trigger the failure, the trunc.log file must be in the receiver's data directory. Please try to use the file I attached and re-try the cloning procedure.

The fact that such an undo_1_trunc.log file remained in the datadir for some reason is another separate issue. I am sure you are aware that such a file can exist.
Regardless of the reason, though, would you agree that the existence of such a log, should not crash the receiver? The goal of the clone procedure is to re-create the whole data directory from the donor, isn't it?

Dear MySQL Verification Team,

Please find another report that should help to understand why the undo truncation logs are there on the replica's disk:
https://bugs.mysql.com/bug.php?id=112717

Hi Mr. Malkowski,

We have tried many times and were unable to create that undo log file.

We have then used your file in the datadir and tried clone operation, but it again worked just fine.

I am able to reproduce it on every test attempt. As long as I end up with undo_x_trunc.log files on a replica that runs with super_read_only=1, a later clone attempt leads to a crash. I do the clone command like this:

slave2 [localhost:22437] {msandbox} ((none)) > set global super_read_only=0; CLONE INSTANCE FROM 'donor_clone_user'@'localhost':22435 IDENTIFIED BY 'password';
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (6.16 sec)

Receiver error log:
2023-10-13T18:06:28.971362Z 1 [Note] [MY-012902] [InnoDB] Undo tablespace number 1 was being truncated when mysqld quit.
2023-10-13T18:06:28.971557Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_001'.
2023-10-13T18:06:28.971640Z 1 [Note] [MY-012902] [InnoDB] Undo tablespace number 2 was being truncated when mysqld quit.
2023-10-13T18:06:28.971738Z 1 [Note] [MY-013252] [InnoDB] Using undo tablespace './undo_002'.
2023-10-13T18:06:28.972187Z 1 [Note] [MY-013040] [InnoDB] Will create 2 new undo tablespaces.
...
2023-10-13T18:06:29.027993Z 1 [ERROR] [MY-011976] [InnoDB] [FATAL] Clone File Roll Forward: Invalid File State: 10
2023-10-13T18:06:29.028005Z 1 [ERROR] [MY-013183] [InnoDB] Assertion failure: clone0api.cc:1109:ib::fatal triggered thread 139932532709120