Bug #113516 rejoin cluster failure as clone plugin is waiting for table metadata lock
Submitted: 27 Dec 2023 3:00
Modified: 27 Dec 2023 5:12
Reporter: Zhejun Cai
Status: Verified
Category: MySQL Server: Clone Plugin
Severity: S3 (Non-critical)
Version: 8.0.32, 8.0.35
OS: Any
Assigned to:
CPU Architecture: Any

[27 Dec 2023 3:00] Zhejun Cai
Description:
Running START GROUP_REPLICATION to rejoin the cluster fails because the clone plugin is stuck waiting for a table metadata lock:

46	mysql.session	localhost	NULL	Query	71	Waiting for table metadata lock	PLUGIN: DROP TABLE `test`.`t1`

TABLE   test    t1      NULL    139876890984720 EXCLUSIVE       TRANSACTION     PENDING sql_parse.cc:6093       88      138596
TABLE   test    t1      NULL    139876817266448 SHARED_WRITE    TRANSACTION     GRANTED mdl.cc:3694     57      4513

How to repeat:
1. See the steps in the uploaded test case gr_xa_clone_rejoin_failure.test and run it.
2. Check the output of performance_schema.metadata_locks in the log file.

./mtr gr_xa_clone_rejoin_failure.test
Logging: ./mtr  gr_xa_clone_rejoin_failure.test
MySQL Version 8.0.32
Checking supported features
Using 'all' suites
Collecting tests
Checking leftover processes
Removing old var directory

gr_xa_clone_rejoin_failure.log shows:
**** SHOW PROCESSLIST on server_2 ****
46      mysql.session   localhost       NULL    Query   70      Waiting for table metadata lock PLUGIN: DROP TABLE `test`.`t1`

select * from performance_schema.metadata_locks;
TABLE   test    t1      NULL    140632676629552 EXCLUSIVE       TRANSACTION     PENDING sql_parse.cc:6093       88      138690
TABLE   test    t1      NULL    140632328857360 SHARED_WRITE    TRANSACTION     GRANTED mdl.cc:3694     57      4511

**** SELECT * FROM performance_schema.threads on server_2 ****
88      thread/group_rpl/THD_plugin_server_session      FOREGROUND      46      root    localhost       NULL    Query   70      Waiting for table metadata lock NULL    87      NULL    YES     YES     Plugin  1337840 SYS_default     PRIMARY 16720   29056   3276218 3366295

**** replication_group_members on server_2 ****
MEMBER_STATE    RECOVERING
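
The blocking relationship shown in the log excerpts above can also be read off with a single query. The following is a minimal sketch and is not part of the uploaded test case. It assumes the metadata lock instrument (wait/lock/metadata/sql/mdl) is enabled and uses only the standard columns of performance_schema.metadata_locks and performance_schema.threads; the column aliases are illustrative.

```sql
-- Pair each pending metadata lock with the locks already granted
-- on the same object to other threads.
SELECT
  pending.OBJECT_SCHEMA,
  pending.OBJECT_NAME,
  pending.LOCK_TYPE        AS waiting_lock,
  pending.OWNER_THREAD_ID  AS waiting_thread,
  wt.NAME                  AS waiting_thread_name,
  wt.PROCESSLIST_INFO      AS waiting_statement,
  granted.LOCK_TYPE        AS blocking_lock,
  granted.OWNER_THREAD_ID  AS blocking_thread
FROM performance_schema.metadata_locks AS pending
JOIN performance_schema.metadata_locks AS granted
  ON  granted.OBJECT_TYPE     = pending.OBJECT_TYPE
  AND granted.OBJECT_SCHEMA   = pending.OBJECT_SCHEMA
  AND granted.OBJECT_NAME     = pending.OBJECT_NAME
  AND granted.LOCK_STATUS     = 'GRANTED'
  AND granted.OWNER_THREAD_ID <> pending.OWNER_THREAD_ID
-- LEFT JOIN so the row is kept even if no thread row is found.
LEFT JOIN performance_schema.threads AS wt
  ON wt.THREAD_ID = pending.OWNER_THREAD_ID
WHERE pending.LOCK_STATUS = 'PENDING';
```

Applied to the rows in the log above, this would pair the pending EXCLUSIVE lock on test.t1 owned by thread 88 with the granted SHARED_WRITE lock owned by thread 57.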

Suggested fix:
WL#9335: Enable MDL Locking for Recovered and Detached Prepared XA Transactions
https://dev.mysql.com/worklog/task/?id=9335

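According to WL#9335 and the uploaded test case, detached prepared XA transactions hold a SHARED_WRITE MDL lock on the table, while clone needs an EXCLUSIVE MDL lock on the same table to drop it when the node rejoins the cluster. The pending DROP TABLE therefore waits until the XA transaction is resolved.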
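
To check whether a prepared XA transaction is present on the joining member, the standard XA RECOVER statement can be used. This is an illustrative check, not a step from the uploaded test case, and in MySQL 8.0 it requires the XA_RECOVER_ADMIN privilege. Whether committing or rolling back such a transaction is an acceptable workaround is not established in this report.

```sql
-- List XA transactions that are in the PREPARED state on this server.
-- CONVERT XID prints the xid data in hexadecimal form.
XA RECOVER CONVERT XID;
```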

Is this a bug?
If not, the documentation should state that this is a limitation of combining XA transactions with clone, although that would make the feature very inconvenient to use.

[27 Dec 2023 5:12] MySQL Verification Team
Hello Zhejun Cai,

Thank you for the report and test case.
Verified as described.

regards,
Umesh