Bug #116323 Waiting for MDL lock during COM_CHANGE_USER may cause connection leaks
Submitted: 9 Oct 2024 13:32 Modified: 30 Nov 2024 16:02
Reporter: Ruyi Zhang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Connection Handling Severity:S2 (Serious)
Version:8.0.39, 8.0.40 OS:Any
Assigned to: CPU Architecture:Any

[9 Oct 2024 13:32] Ruyi Zhang
Description:
Our team (ByteNDB) recently encountered a Too many connections error. 

Even if we closed all connections, MySQL still refused the client connection due to Too many connections. 

This is caused by complex lock waiting. 

If mysql_change_user(COM_CHANGE_USER) cannot obtain the MDL lock of the database immediately, it will hold the THD::LOCK_thd_security_ctx lock for a long time. 

If show processlist is run at this time, show processlist will be blocked by THD::LOCK_thd_security_ctx, and it holds Global_THD_manager::LOCK_thd_remove lock, which prevents the connection from being disconnected (Global_THD_manager::remove_thd).

Causes thread_connected to leak.

How to repeat:
We can reproduce this problem with the help of MTR and additional commands, MTR:

```
connect(con1, localhost, root,,);
connect(con2, localhost, root,,);
connect(con3, localhost, root,,);

--connection default
CREATE USER user2;
CREATE DATABASE my_db;
CREATE TABLE my_db.a(id int primary key);
GRANT ALL ON my_db.* TO user2;
--connection con1
begin;
select * from my_db.a;
--connection default
send drop database my_db;
sleep 2;
--connection con2
--change_user user2,,my_db

```

When the above MTR is run to ```--change_user user
2,,my_db``` (waiting because of MDL lock), we connect to mysqld1 and execute show processlist (expected to be blocked by mysql_change_user). 

After that, we used another client to frequently(more than 8000 times) connect and exit mysqld1. Then executing ```show global status like 'Threads_connected'```, the value will continue to grow and leak;

Suggested fix:
There are requests for other locks in ```acl_authenticate``` that may cannot be obtained immediately, such as Database MDL locks. 

Maybe we should reduce the scope of THD::LOCK_thd_security_ctx in ```acl_authenticate```.
[30 Nov 2024 16:02] MySQL Verification Team
Hello Zhang Ruyi,

Thank you for the report and test case.

regards,
Umesh