Description:
Performing a CLONE INSTANCE FROM, or joining an empty member into a group replication cluster, I'm experiencing what looks like success until the MySQL service is restarted.
Once restarted, all of the CLONE log rolling forward operations occur, and we reach the step where all ib_logfile* files are created. Then, all disk I/O activity stops in my data directory, and mysqld appears idle. An strace on the process shows `futex(0x7f3ac5b179d0, FUTEX_WAIT, 30683, NULL` (though that's likely not a reliable indicator of inactivity)
I'll upload the output of the mysqld.log after the clone process completes and mysqld is started for the first time.
Killing the process will ultimately result in an unstartable mysqld service:
2022-10-25T22:35:42.976290Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.28) starting as process 32386
2022-10-25T22:35:42.981377Z 0 [Note] [MY-012366] [InnoDB] Using Linux native AIO
2022-10-25T22:35:42.981529Z 0 [Warning] [MY-012358] [InnoDB] Option innodb_dedicated_server is ignored for innodb_buffer_pool_size because innodb_buffer_pool_size=87882203136 is specified explicitly.
2022-10-25T22:35:42.982275Z 0 [Note] [MY-010747] [Server] Plugin 'FEDERATED' is disabled.
2022-10-25T22:35:42.983695Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2022-10-25T22:35:42.983752Z 1 [Note] [MY-013546] [InnoDB] Atomic write enabled
2022-10-25T22:35:42.983798Z 1 [Note] [MY-012932] [InnoDB] PUNCH HOLE support available
2022-10-25T22:35:42.983827Z 1 [Note] [MY-012944] [InnoDB] Uses event mutexes
2022-10-25T22:35:42.983852Z 1 [Note] [MY-012945] [InnoDB] GCC builtin __atomic_thread_fence() is used for memory barrier
2022-10-25T22:35:42.983872Z 1 [Note] [MY-012948] [InnoDB] Compressed tables use zlib 1.2.11
2022-10-25T22:35:42.985689Z 1 [Note] [MY-013251] [InnoDB] Number of pools: 1
2022-10-25T22:35:42.985804Z 1 [Note] [MY-012951] [InnoDB] Using hardware accelerated crc32 and polynomial multiplication.
2022-10-25T22:35:42.986233Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2022-10-25T22:35:42.986372Z 1 [ERROR] [MY-011976] [InnoDB] [FATAL] Clone File Roll Forward: Invalid File State: 0
2022-10-25T22:35:42.986397Z 1 [ERROR] [MY-013183] [InnoDB] Assertion failure: clone0api.cc:1090:ib::fatal triggered thread 140632836507392
How to repeat:
In my case, this is just one instance out of a fleet of multiple Group Replication clusters (all running 8.0.28).
I haven't been able to repeat this in any other clusters, but my process has been either:
Method 1: On an existing MySQL server in a GR topology
======================================================
> stop group_replication;
> reset master;
> start group_replication; -- triggers clone operation
Method 2: On an empty server
=============================
> CLONE INSTANCE FROM 'replicationuser'@'donorhost':3306 IDENTIFIED BY '[redacted-password]';
In both cases I receive the same result.