Bug #111858 mysql.db's MDL acquired by XA recover then blocked initialization of ACL
Submitted: 24 Jul 2023 8:29
Modified: 24 Jul 2023 13:24
Reporter: Fan Lyu
Status: Unsupported
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 8.0.33
OS: Any
Assigned to:
CPU Architecture: Any

[24 Jul 2023 8:29] Fan Lyu
1. Perform operations that acquire an MDL on the system table mysql.db.
2. Use kill -9 to simulate an abnormal crash.
3. Restart; the logs report that there is an XA prepared transaction.
4. acl_init hangs when acquiring the table MDL on mysql.db.

We inspected the stack and confirmed that, in the MDL_context of acl_init, the MDL request for mysql.db is in the waiting list. It wants to acquire a lock of type [shared_read_only].

We then inspected the global MDL_context_backup_manager variable, in particular the corresponding m_ticket_store.m_durations[1].m_ticket_list, and found that a lock had already been acquired during the XA recovery process, which runs before acl_init.
The MDL already acquired on the same table, mysql.db, is [shared write].

[shared write] and [shared_read_only] are mutually exclusive, so acl_init hangs.
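
The conflict described above can be illustrated with a minimal sketch. It is a simplified model restricted to three table-level MDL types, not the server's actual implementation; the names INCOMPATIBLE, can_grant and held_by_xa_recovery are hypothetical and exist only for this illustration.

```python
# Simplified model of MDL compatibility, limited to three lock types.
# Assumption: only the relationships among these three types are modelled;
# the real server tracks more lock types and uses different data structures.
INCOMPATIBLE = {
    "SHARED_READ": set(),
    "SHARED_WRITE": {"SHARED_READ_ONLY"},
    "SHARED_READ_ONLY": {"SHARED_WRITE"},
}


def can_grant(requested, granted):
    """Return True if `requested` conflicts with none of the granted locks."""
    return all(g not in INCOMPATIBLE[requested] for g in granted)


# Lock reported as held on mysql.db after XA recovery (hypothetical list).
held_by_xa_recovery = ["SHARED_WRITE"]

# acl_init's request as described in the report: it cannot be granted.
print(can_grant("SHARED_READ_ONLY", held_by_xa_recovery))
# A plain shared read would not conflict with the held lock in this model.
print(can_grant("SHARED_READ", held_by_xa_recovery))
```

In this model the [shared_read_only] request stays in the waiting list for as long as the [shared write] ticket is held, which matches the hang observed in acl_init.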

How to repeat:
Include an MDL on mysql.db in a prepared XA transaction, then crash mysqld with kill -9.
Restart, and make sure mysql.db is included in the prepared XA transactions found during recovery.
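
One possible statement sequence for the first step is sketched below. The report does not say which statement was used, so the INSERT, the xid, and the helper name xa_repro_statements are assumptions made for illustration; any write to mysql.db inside the XA transaction would presumably serve the same purpose. The helper only builds and prints the statements, it does not connect to a server.

```python
def xa_repro_statements(xid="mdl_test"):
    """Build the SQL statements to run, in order, before killing mysqld."""
    return [
        f"XA START '{xid}'",
        # Hypothetical DML: a write to mysql.db, chosen only as an example of
        # a statement that touches the table inside the XA transaction.
        "INSERT INTO mysql.db (Host, Db, User) "
        "VALUES ('localhost', 'mdl_test_db', 'mdl_test_user')",
        f"XA END '{xid}'",
        # After PREPARE the transaction should survive a crash and be
        # picked up again by XA recovery at the next startup.
        f"XA PREPARE '{xid}'",
    ]


for stmt in xa_repro_statements():
    print(stmt + ";")
```

After running these statements in a client session, mysqld would be killed with kill -9 and restarted, and the error log checked for the recovered XA prepared transaction.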
[24 Jul 2023 13:21] MySQL Verification Team
Hi Mr. Lyu,

Thank you for your bug report.

However, we are not able to repeat your problem.

We also have an idea of why we cannot repeat it.

We are using a full ACID configuration, which prevents any corruption of data with kill -9 PID.

For example, caching of the disk partition is disabled, caching of the filesystem is disabled, caching of the files is disabled, and the InnoDB SE is set to the 100% ACID-safe settings, as described in our Reference Manual.

Hence, with our setup, which is the one we recommend in our Reference Manual, there is no loss of data with kill -9.

Can't repeat.
[24 Jul 2023 13:24] MySQL Verification Team

We also have to inform you that using MDL commands on the system tablespaces is not supported, as explained in the Reference Manual.

Hence, we had to change the status of this report to the correct one.
