Bug #112302 | Concurrent PROCESSLIST, PURGE BINLOG Blocks New Connections | ||
---|---|---|---|
Submitted: | 9 Sep 2023 0:38 | Modified: | 21 Sep 2023 2:31 |
Reporter: | Jervin Real | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 5.7.40 | OS: | Any |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[9 Sep 2023 0:38]
Jervin Real
[12 Sep 2023 14:09]
MySQL Verification Team
Hi, I cannot reproduce this behavior no matter what I try.
[20 Sep 2023 1:31]
Jervin Real
We've managed to reproduce this with the following, perhaps this can be simplified. We're using an m5.16xlarge instance. Create a sysbench workload from one screen session. /usr/local/bin/sysbench --mysql-host=<MYSQL_HOST> --mysql-port=3306 --mysql-user=username --mysql-password=password \ --tables=32 --table-size=160000 --mysql-db=test --threads=512 \ --events=0 --time=5 --report-interval=1 --histogram=on --thread-init-timeout=3600 \ --db-ps-mode=disable --rate=1000 oltp_write_only prepare while true ; do ( /usr/local/bin/sysbench --mysql-host=<MYSQL_HOST> --mysql-port=3306 --mysql-user=username --mysql-password=password \ --tables=32 --table-size=160000 --mysql-db=test --threads=512 \ --events=0 --time=5 --report-interval=1 --histogram=on --thread-init-timeout=3600 \ --db-ps-mode=disable --rate=1000 oltp_write_only run & ) ; sleep $((1 + RANDOM % 8)) done On another screen window, run mysqlslap - I managed to used the test case from https://bugs.mysql.com/bug.php?id=89101 successfully. Once the mysqlslap workload starts, execute one or more `SHOW PROCESSLIST` from another session. This will stall until the mysqlslap connections are established. Theoretically, if you overlap multiple mysqlslap sessions continously, the engine will show as frozen until it is not able to accept new connections. We have tried the same test case on 5.7.42 and it is not repeatable, perhaps the fix on 5.7.41 below fixed this issue given the mutex on LOCK_thd_data. Bug #34857411 : regression - slow connections/telnet block many other statements for connect_timeout seconds, causing pileups Description: ------------ This is a regression caused due to the fix made for the Bug 34094706. When a connection somehow stalls/blocks during the authentication phase, where a mutex is held, the other connections that are executing queries on I_S and P_S are blocked until the first connection release the mutex. Fix: ---- Instead of using the mutex and checking the thd->active_vio, we now check the value of net.vio type in the is_secure_transport() check. Change-Id: Ic301a3a69b2bb828f2323441bdd0232e5a0ecef4
[20 Sep 2023 1:38]
Jervin Real
Custom parameters: +------------------------------+-----------------------------------------------+ | variable_name | variable_value | |------------------------------+-----------------------------------------------| | innodb_io_capacity | 1024 | | innodb_lock_wait_timeout | 120 | | innodb_log_buffer_size | 16777216 | | innodb_log_file_size | 4294967296 | | innodb_print_all_deadlocks | 1 | | long_query_time | 3 | | max_allowed_packet | 104857600 | | max_connections | 12000 | | max_user_connections | 6000 | | performance_schema | 1 | | slave_parallel_workers | 8 | | slow_query_log | 1 | | table_open_cache_instances | 16 | | gtid-mode | OFF_PERMISSIVE | | innodb_buffer_pool_instances | 16 | | slave_parallel_type | LOGICAL_CLOCK | | slave_preserve_commit_order | 1 | +------------------------------+-----------------------------------------------+
[20 Sep 2023 1:42]
Jervin Real
Just want to mention https://bugs.mysql.com/bug.php?id=73168 since the weight of holding LOCK_thd_data is likely to have an impact here as well.
[20 Sep 2023 11:40]
MySQL Verification Team
Hi, thanks for the update. While I'm retrying this - the "show processlist" is the actual thing that blocks? Have you tried querying I_S directly? https://dev.mysql.com/doc/refman/8.0/en/information-schema-processlist-table.html
[20 Sep 2023 17:56]
Jervin Real
Hey thanks for looking. SHOW PROCESSLIST is one of the queries that blocks, but not only that - this instance is monitored by Percona Monitoring and Management so there are other PROCESSLIST queries including directly to I_S.PROCESSLIST. In some cases, I can throw in PURGE BINARY LOGS and that gets blocked too as shown in the pmps.
[20 Sep 2023 18:31]
Jervin Real
This is also repeatable in 8.0.31 but not in 8.0.34.
[20 Sep 2023 19:45]
MySQL Verification Team
Hi, I managed to reproduce with 5.7 and not with 8.0 nor 8.1 and only with show processlist, I did not manage to reproduce select from i_s to stall. I'm not sure it is the same issue as we did have instances of show processlist stall on "heavy loaded server" and that was declared as not-a-bug by the dev team. I'll verify this but since I cannot reproduce this on 8 not sure if the patch is small enough that it can go to 5.7
[20 Sep 2023 20:37]
Jervin Real
Can you confirm if you reproduced this on 5.7.40? How about 5.7.42/43?
[20 Sep 2023 21:02]
MySQL Verification Team
I reproduced on 5.7.40, I did not reproduce on any 8.x (8.1.0, 8.0.34). I am testing it as we speak with 5.7.43, did not reproduce yet, if I do not manage to reproduce with 5.7.43 I will close the bug as solved.
[21 Sep 2023 2:31]
MySQL Verification Team
Hi, I cannot reproduce on any modern version of MySQL so I'm setting this bug to can't reproduce. I believe this issue is solved in latest versions of MySQL
[11 Jun 7:34]
lei yue
I followed the above method and the problem did not reoccur in 8.0.24. The execution order of the three windows is as follows: 1. sysbench run 2. mysqlslap run 3. show processlist
[11 Jun 7:57]
lei yue
Sorry, closing the thread pool can reproduce