Bug #115399 performance deterioration caused by incorrect cpu usage statistics
Submitted: 21 Jun 2024 3:40 Modified: 30 Oct 2024 1:35
Reporter: yuanyue Zheng
Status: Closed
Category: MySQL Server: InnoDB storage engine Severity: S5 (Performance)
Version: 8.x, 8.0.37, 8.4.0 OS: Any
Assigned to: CPU Architecture: Any

[21 Jun 2024 3:40] yuanyue Zheng
Description:
In the srv_update_cpu_usage function, MAX_CPU_N is set to 128. If mysqld runs only on processors with processor_id >= 128, srv_cpu_usage.utime_pct is not updated. This renders the log_max_spins_when_waiting_in_user_thread function ineffective, so threads always spin, even when the CPU is busy.
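
A minimal standalone sketch of the failure mode, not the InnoDB code itself. It assumes a glibc cpu_set_t affinity mask and a counting loop of the same shape as the one in srv_update_cpu_usage. The names count_cpus_below and make_mask are hypothetical helpers introduced only for illustration.

```cpp
#include <sched.h>  // cpu_set_t, CPU_ZERO, CPU_SET, CPU_ISSET, CPU_SETSIZE (glibc)
#include <cassert>

// Count the CPUs set in `cs`, inspecting only ids in [0, limit).
// `limit` plays the role of MAX_CPU_N from the report.
int count_cpus_below(const cpu_set_t &cs, int limit) {
  assert(limit >= 0 && limit <= CPU_SETSIZE);
  int n_cpu = 0;
  for (int i = 0; i < limit; ++i) {
    if (CPU_ISSET(i, &cs)) {
      ++n_cpu;
    }
  }
  return n_cpu;
}

// Hypothetical helper: build a mask equivalent to `taskset -c first-last`.
cpu_set_t make_mask(int first, int last) {
  cpu_set_t cs;
  CPU_ZERO(&cs);
  for (int i = first; i <= last; ++i) {
    CPU_SET(i, &cs);
  }
  return cs;
}
```

For a mask built with make_mask(128, 255), count_cpus_below(cs, 128) sees no CPUs at all, while a limit of 256 counts all 128 of them. A count of zero is presumably what keeps the usage statistics from being refreshed.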

How to repeat:
1. Run mysqld with taskset, binding it to processors with processor_id >= 128:
    taskset -c 128-255 mysqld --defaults-file=/etc/my.cnf &
2. Run sysbench/tpcc.
3. perf top shows `log_write_up_to` using the most CPU time, and performance is poor.

Suggested fix:
diff --git a/storage/innobase/srv/srv0srv.cc.bak b/storage/innobase/srv/srv0srv.cc
index 40cea6d..4fab2ae 100644
--- a/storage/innobase/srv/srv0srv.cc.bak
+++ b/storage/innobase/srv/srv0srv.cc
@@ -2105,7 +2105,7 @@ static void srv_update_cpu_usage() {
   }

   int n_cpu = 0;
-  constexpr int MAX_CPU_N = 128;
+  const long int MAX_CPU_N = sysconf(_SC_NPROCESSORS_CONF);
   for (int i = 0; i < MAX_CPU_N; ++i) {
     if (CPU_ISSET(i, &cs)) {
       ++n_cpu;
[24 Jun 2024 14:09] MySQL Verification Team
Hello yuanyue Zheng,

Thank you for the report and feedback.

regards,
umesh
[25 Jun 2024 10:05] Magnus Blåudd
sched.h has a CPU_COUNT() macro; it might be better to use that if it's portable.
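
A rough sketch of what the CPU_COUNT() approach could look like, assuming Linux with glibc (CPU_COUNT is a glibc extension, not POSIX, so the portability question stands). The function names are hypothetical, and reading the mask with sched_getaffinity is an assumption about how the mask is obtained.

```cpp
#include <sched.h>  // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT (glibc)

// Count the CPUs set in `cs` without any hand-written loop bound.
int count_with_cpu_count(const cpu_set_t &cs) {
  return CPU_COUNT(&cs);
}

// Number of CPUs the calling thread may run on,
// or -1 if the affinity mask cannot be read.
int allowed_cpu_count() {
  cpu_set_t cs;
  CPU_ZERO(&cs);
  if (sched_getaffinity(0, sizeof(cs), &cs) != 0) {
    return -1;
  }
  return count_with_cpu_count(cs);
}
```

Because CPU_COUNT() inspects the whole set, a mask containing only processor ids 128-255 yields 128 rather than 0.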
[5 Sep 2024 8:36] Jakub Lopuszanski
Posted by developer:
 
Shouldn't we rather use CPU_SETSIZE?
https://www.gnu.org/software/libc/manual/html_node/CPU-Affinity.html
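
For comparison, a sketch of the CPU_SETSIZE variant, assuming glibc, where CPU_SETSIZE is a compile-time constant (1024) giving the maximum number of CPUs a cpu_set_t can describe. The function name count_cpus is hypothetical. This is one possible shape of the change, not necessarily the fix that was committed.

```cpp
#include <sched.h>  // cpu_set_t, CPU_ISSET, CPU_SETSIZE (glibc)

// The bound stays constexpr, but now covers every bit a cpu_set_t can hold.
constexpr int MAX_CPU_N = CPU_SETSIZE;

// Count the CPUs set in `cs`, scanning the full fixed-size set.
int count_cpus(const cpu_set_t &cs) {
  int n_cpu = 0;
  for (int i = 0; i < MAX_CPU_N; ++i) {
    if (CPU_ISSET(i, &cs)) {
      ++n_cpu;
    }
  }
  return n_cpu;
}
```

Unlike sysconf(_SC_NPROCESSORS_CONF), this bound does not depend on the machine the server runs on, at the cost of always scanning the full set.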
[30 Oct 2024 1:35] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Server 8.0.41, 8.4.4, and 9.2.0 releases, and here's the proposed changelog entry from the documentation team:

CPU usage statistics did not account for a processor count over 128,
which could degrade performance on these larger systems.

Thank you for the bug report.