Bug #110494 Deadlock between FLUSH STATUS, COM_CHANGE_USER and SELECT FROM I_S.PROCESSLIST
Submitted: 24 Mar 2023 13:04 Modified: 18 May 2023 11:09
Reporter: Dmitry Lenev (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[24 Mar 2023 13:04] Dmitry Lenev
Description:
Concurrent execution of FLUSH STATUS (or COM_STATISTICS), mysql_change_user and SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST (or old-style SHOW PROCESSLIST) sometimes leads to deadlock. This deadlock not only affects connections executing these commands, but other connections as well, it also becomes impossible to shutdown server properly.

AFAICS this deadlock is result of regression from fix for bug#32320541
"RACE CONDITION ON SECURITY_CONTEXT::M_USER". After this patch
COM_STATISTICS/FLUSH STATUS, COM_CHANGE_USER and SHOW PROCESSLIST/
SELECT ... FROM I_S.PROCESSLIST has started to acquire same locks in
different order.
    
In particular:
  1) Code responsible for changing user for connection has started to acquire
     THD::LOCK_thd_security_ctx mutex and then acquires LOCK_status mutex during
     call to THD::cleanup_connection(), without releasing the former.
  2) Implementations of COM_STATISTICS and FLUSH STATUS commands acquire
     LOCK_status mutex and then during iteration through all connections
     LOCK_thd_remove mutexes without releasing the former.
  3) Finally, SHOW PROCESSLIST/I_S.PROCESSLIST implementation acquires
     LOCK_thd_remove mutexes and then THD::LOCK_thd_security_ctx mutex
     during copying information about particular connection, without
     releasing the former.

Naturally, THD::LOCK_thd_security_ctx -> LOCK_status -> LOCK_thd_remove ->
THD::LOCK_thd_security_ctx dependency loop occasionally results in deadlock.

How to repeat:
The below test case for mysqltest suite reproduces the problem 80% of times on my laptop:

--echo # The original problem reported was that concurrent execution of
--echo # COM_STATISTICS, COM_CHANGE_USER commands and SHOW FULL PROCESSLIST
--echo # statements sometimes led to deadlock. This test uses FLUSH STATUS
--echo # statement instead of the first command and SELECT ... FROM
--echo # I_S.PROCESSLIST instead of the latter. They acquire the same
--echo # locks and were affected by the same problem.
--echo # Doing 3000 concurrent runs of each statement was enough to reproduce
--echo # the deadlock with 80% probability on my machine.

--delimiter |

CREATE PROCEDURE p_flush_status()
BEGIN
  DECLARE x INT DEFAULT 3000;
  WHILE x DO
    SET x = x-1;
    FLUSH STATUS;
  END WHILE;
END |

CREATE PROCEDURE p_processlist()
BEGIN
  DECLARE x INT DEFAULT 3000;
  WHILE x DO
    SET x = x-1;
    SELECT COUNT(*) INTO @a FROM information_schema.processlist;
  END WHILE;
END |

--delimiter ;

--enable_connect_log
--connect (con1, localhost, root,,)
--echo # Send:
--send CALL p_flush_status()

--echo # Send:
--connect (con2, localhost, root,,)
--send CALL p_processlist()

--connection default

--echo # Execute COM_CHANGE_USER command 3000 times.
let $i = 3000;
while ($i)
{
  dec $i;
--change_user
}

--connection con1
--echo # Reap p_flush_status().
--reap
--disconnect con1
--source include/wait_until_disconnected.inc

--connection con2
--echo # Reap p_processlist().
--reap
--disconnect con2
--source include/wait_until_disconnected.inc

--connection default
--disable_connect_log
DROP PROCEDURE p_flush_status;
DROP PROCEDURE p_processlist;

Suggested fix:
I am planning to contribute straightforward patch addressing this issue.
[24 Mar 2023 14:05] MySQL Verification Team
Hi Mr. Lenev,

Thank you for your bug report.

We have managed to repeat it on the latest 8.0.

This report is now a verified bug.
[24 Mar 2023 14:06] MySQL Verification Team
Hi Mr. Lenev,

We are eagerly waiting on your patch.

Many thanks in advance.
[24 Mar 2023 17:51] Dmitry Lenev
A simple fix for 110494 problem.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: ps-8683-bug110494-contrib.patch (text/x-patch), 7.08 KiB.

[27 Mar 2023 12:28] MySQL Verification Team
Thank you, Mr. Lenev, for your contribution.
[18 May 2023 11:09] Edward Gilmore
Posted by developer:
 
Added the following note to the MySQL Server 8.0.34 and 8.1.0 release notes:

Concurrent execution of FLUSH STATUS, COM_CHANGE_USER, and SELECT FROM I_S.PROCESSLIST could result in a deadlock. 
A similar issue was observed for concurrent execution of COM_STATISTICS, COM_CHANGE_USER, and SHOW PROCESSLIST.

Our thanks to Dmitry Lenev for the contribution.