Bug #106222 Install/uninstall plugin concurrent with new connections may deadlock
Submitted: 20 Jan 2022 8:28
Modified: 21 Jan 2022 14:18
Reporter: zkong kong
Status: Can't repeat
Category: MySQL Server: Connection Handling
Severity: S3 (Non-critical)
Version: 5.7.37, 8.0.28
OS: Linux
Assigned to:
CPU Architecture: ARM

[20 Jan 2022 8:28] zkong kong
Description:
Installing or uninstalling a plugin may block the creation of new connections; this seems more likely to happen on the ARM platform. After analyzing the stacks, I found the following deadlock cycle:

thd 1: in THD::init, holds LOCK_global_system_variables and tries to acquire LOCK_plugin

void plugin_thdvar_init(THD *thd, bool enable_plugins) {
  ... ...
  
  mysql_mutex_lock(&LOCK_global_system_variables);
  
  ... ...

  if (enable_plugins) {
    mysql_mutex_lock(&LOCK_plugin);

thd 2: in uninstall plugin (install is the same):
       holds LOCK_plugin and tries to acquire LOCK_system_variables_hash

static void reap_plugins(void) {
   ... ...
   mysql_mutex_lock(&LOCK_plugin);

  while ((plugin = *(--reap))) plugin_del(plugin); // lock LOCK_system_variables_hash

static void plugin_del(st_plugin_int *plugin) {
  ... ...
  mysql_rwlock_wrlock(&LOCK_system_variables_hash);

thd 3: holds LOCK_system_variables_hash and tries to acquire LOCK_global_system_variables
       thd_prepare_connection
       ->prepare_new_connection_state
       ---> alloc_and_copy_thd_dynamic_variables

void alloc_and_copy_thd_dynamic_variables(THD *thd, bool global_lock) {
  mysql_rwlock_rdlock(&LOCK_system_variables_hash);

  if (global_lock) mysql_mutex_lock(&LOCK_global_system_variables);
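
The three hold/wait pairs above can be checked mechanically. The following is a minimal standalone sketch, not MySQL code: it treats each "holds A, waits for B" pair as an edge in a lock-order graph and looks for a cycle with a depth-first search. The helper names (build_lock_graph, has_lock_cycle, reported_edges) are hypothetical. The read lock taken by thd 3 is modelled as an ordinary edge, because a held read lock still blocks the write lock that plugin_del requests.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Edge "A -> B": some thread holds lock A while waiting for lock B.
using LockGraph = std::map<std::string, std::set<std::string>>;
using LockEdges = std::vector<std::pair<std::string, std::string>>;

// Builds the graph from (held, wanted) pairs.
LockGraph build_lock_graph(const LockEdges &edges) {
  LockGraph g;
  for (const auto &e : edges) {
    g[e.first].insert(e.second);
    g[e.second];  // make sure the wanted lock is also a node
  }
  return g;
}

// Depth-first search; state 0 = unvisited, 1 = on current path, 2 = done.
static bool dfs(const LockGraph &g, const std::string &node,
                std::map<std::string, int> &state) {
  assert(g.count(node) == 1);
  state[node] = 1;
  for (const auto &next : g.at(node)) {
    if (state[next] == 1) return true;  // back edge: lock cycle
    if (state[next] == 0 && dfs(g, next, state)) return true;
  }
  state[node] = 2;
  return false;
}

// Returns true if the hold/wait graph contains a cycle.
bool has_lock_cycle(const LockGraph &g) {
  std::map<std::string, int> state;
  for (const auto &kv : g) state[kv.first] = 0;
  for (const auto &kv : g) {
    if (state[kv.first] == 0 && dfs(g, kv.first, state)) return true;
  }
  return false;
}

// The three hold/wait pairs described in this report.
LockEdges reported_edges() {
  return {
      {"LOCK_global_system_variables", "LOCK_plugin"},                  // thd 1
      {"LOCK_plugin", "LOCK_system_variables_hash"},                    // thd 2
      {"LOCK_system_variables_hash", "LOCK_global_system_variables"}};  // thd 3
}
```

Calling has_lock_cycle(build_lock_graph(reported_edges())) reports a cycle; dropping any one of the three pairs removes it, which is why a consistent acquisition order for these locks would avoid the hang.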

  

How to repeat:
read the source code
[20 Jan 2022 13:51] MySQL Verification Team
Hi Mr. kong,

Thank you for your bug report.

However, your report is not complete.

First of all, you claim that loading a plugin prevents new connections from being established. This is expected behaviour. While a plugin is loading or unloading, new connections cannot be established.

However, further on, you claim that a deadlock occurs. That would mean that many threads, or the entire server, are blocked in the deadlock.

In order to verify that possibility, we need a fully repeatable test case so that we can witness the deadlock in vivo. Next, please explain why it can only happen on ARM! Also, can it happen on macOS ARM or only on Linux ARM?

We are expecting answers for all of our questions.
[21 Jan 2022 13:55] zkong kong
First of all, you claim that loading a plugin prevents new connections from being established. This is expected behaviour. While a plugin is loading or unloading, new connections cannot be established.

However, further on, you claim that a deadlock occurs. That would mean that many threads, or the entire server, are blocked in the deadlock.

----> Yes. The block did not last for just a short while: the client could not log in at all, so I suspected a deadlock, reviewed the stacks, and found the lock cycle above. It appears that any new connection then waits in THD::init.

In order to verify that possibility, we need a fully repeatable test case so that we can witness the deadlock in vivo. Next, please explain why it can only happen on ARM! Also, can it happen on macOS ARM or only on Linux ARM?

----> The lock cycle comes from the stacks we captured in our test environment, which runs Linux on ARM. For now I can only show the lock cycle; reproducing it reliably would require adding some sync points.
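
What such sync points would have to enforce is that each thread already holds its first lock before any thread requests its second. The following is a minimal standalone sketch of that interleaving, not MySQL code and not a server test case. It assumes three std::timed_mutex objects standing in for the real locks (the rwlock is simplified to a plain mutex), atomic counters acting as the sync points, and a hypothetical function name count_blocked_threads. Timed lock attempts are used so that the program reports the blocked threads instead of hanging.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Stand-ins for the three server locks (assumption: rwlock modelled as mutex).
enum LockId { GLOBAL_SYSVARS = 0, PLUGIN = 1, SYSVARS_HASH = 2 };

// Runs the three-thread interleaving and returns how many threads could not
// get their second lock while every thread still held its first one.
int count_blocked_threads() {
  std::array<std::timed_mutex, 3> locks;
  std::atomic<int> holding{0}, finished{0}, blocked{0};

  auto worker = [&](int first, int second) {
    assert(first != second);
    locks[first].lock();
    ++holding;
    // Sync point 1: wait until all three threads hold their first lock.
    while (holding.load() < 3) std::this_thread::yield();

    if (locks[second].try_lock_for(std::chrono::milliseconds(50))) {
      locks[second].unlock();
    } else {
      ++blocked;  // with a plain lock() this thread would wait forever
    }

    ++finished;
    // Sync point 2: keep the first lock until every attempt has completed.
    while (finished.load() < 3) std::this_thread::yield();
    locks[first].unlock();
  };

  std::vector<std::thread> threads;
  threads.emplace_back(worker, GLOBAL_SYSVARS, PLUGIN);        // thd 1
  threads.emplace_back(worker, PLUGIN, SYSVARS_HASH);          // thd 2
  threads.emplace_back(worker, SYSVARS_HASH, GLOBAL_SYSVARS);  // thd 3
  for (auto &t : threads) t.join();
  return blocked.load();
}
```

With the interleaving forced this way, all three timed attempts fail, which corresponds to the three threads waiting on each other indefinitely when blocking lock calls are used.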
[21 Jan 2022 14:18] MySQL Verification Team
Hi,

Thank you for your answer.

Please separate your answers from the text you are quoting, as your comments are quite unreadable in this format.

We shall wait for your fully reproducible test case in order to proceed with the processing of this report.

Do note that we asked you some other questions to which we did not receive any answers.

For the time being, we can't repeat your report.