Bug #111594 | Some Scalability Issues in Specific Scenarios | ||
---|---|---|---|
Submitted: | 28 Jun 2023 8:21 | Modified: | 20 Sep 2023 9:14 |
Reporter: | changyan liu (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[28 Jun 2023 8:21]
changyan liu
[28 Jun 2023 9:11]
changyan liu
Simplify synopsis
[28 Jun 2023 9:11]
changyan liu
Simplify synopsis
[28 Jun 2023 13:17]
MySQL Verification Team
Hi Mr. Liu, Thank you for your bug report. However, this is not a bug. Actually, reaching 220 threads without a slowdown is a miracle in itself. If you had a machine with 256 cores, that would also be expected result, since there are so many locks, mutex barriers, atomic waits and waits on many other conditions that further improvements would not be possible. If you manage to get the same results with a machine containing 512 cores, then we would need from you the output from the InnoDB status, at the peak of the benchmark, so that we might find a bottleneck. Not a bug.
[29 Jun 2023 5:55]
changyan liu
Hello, MySQL Verification Team Glad to receive your response, but based on this question, we has found the root cause is false sharing problem: https://en.wikipedia.org/wiki/False_sharing. Be brief: If multiple variables belong to the same cacheline and are modified at the same time in a concurrent environment, only one thread can operate the cacheline at the same time due to write barriers and memory consistency protocols, and the performance will decrease because of competition. Especially when these variables in one struct, and there is strategy to separate them: for example, add a pading between these two variables,like this patch:https://github.com/mysql/mysql-server/compare/8.0...liuchangyan:mysql-server:Add-pading-to... After adding this patch, we could see the benchmark result is: Threads count: 20 average result: 40378.93 Threads count: 40 average result: 49117.21 Threads count: 60 average result: 53685.71 Threads count: 80 average result: 54501.64 Threads count: 100 average result: 53333.08 Threads count: 120 average result: 55605.55 Threads count: 140 average result: 61526.80 Threads count: 160 average result: 62951.74 Threads count: 180 average result: 64188.84 Threads count: 200 average result: 65926.29 Threads count: 220 average result: 65609.99 Threads count: 224 average result: 65593.10
[29 Jun 2023 6:00]
changyan liu
Of course, We try our best to keep the above performance benchmark results accurate, but there are still some differences in the results of each benchmark, hope you could understand:).
[29 Jun 2023 8:48]
changyan liu
In my view, this is a bug, two field A and B in one struct are stored in one cacheline, and field A is accessed frequently but when A is being modified,other field have to wait. As the number of threads increase, this cacheline occupy problem will get worse. Add pading may apart them to different cachelines in order to improve performance as threads increase. The performance comparison result is attached to the file.
[29 Jun 2023 8:49]
changyan liu
This is performance comparison result after add pading
Attachment: pc.png (image/png, text), 395.82 KiB.
[29 Jun 2023 8:49]
changyan liu
This is performance comparison result after add pading
Attachment: pc.png (image/png, text), 395.82 KiB.
[29 Jun 2023 12:41]
MySQL Verification Team
Hi Mr. liu, We can not control how CPU caching and locking work. So, different CPU's behave differently in many respects, including this one. That is not our bug. That is not a part of our code. We do not make hardware on which our server is run. Cache padding is not defined in our source code, nor could it be defined. Even if that was possible, we would have to write a separate code for each different make and model of CPU that we support. And we support a good number of them. ' Padding of the structure members is completely in the control of C and C++ compilers. We do not write our server in the assembly. Not a bug.
[30 Jun 2023 11:08]
changyan liu
Hello, MySQL Verification Team Thanks for your reply:) We've got your point, this may not be a bug in mysql innoDB engine, could this question be considered a feature so that we can have the opportunity to improve the performance of the innoDB storage engine? We've found that currently mysql has adapted to different cacheline sizes: https://bugs.mysql.com/bug.php?id=98499 and our cacheline padding will support different architecture of CPU. Do you have any suggestions on this part? thks
[30 Jun 2023 13:44]
MySQL Verification Team
Hi Mr. liu, We fully understand what you are referring to. This could indeed be a feature request. But , in that case, please refer to the bug that you mentioned on what should a feature request look like. You have to provide an example of code that would demonstrate, on one example only, how would that be accomplished. At this moment, it is not known to us how to make different paddings for different cachelines. Especially, for the number of CPU's that we support. Last, but not least, 8.0 is in the maintenance mode only. That means that (fully and precisely defined) feature request, with code examples, can be accepted for 8.1, once when it comes out. We do not know yet what will that happen.
[6 Jul 2023 3:02]
changyan liu
Hello, MySQL Verification Team Thanks for your reply. Sure, before we submit this patch, we will describe the function of our code in detail, the relationship with the size of the cacheline in different CPU architecture and how that improve performance in the case of increasing concurrency.
[6 Jul 2023 12:19]
MySQL Verification Team
Hi Mr. liu, We are eagerly waiting on your patch ......
[28 Aug 2023 2:38]
changyan liu
Hi, this is my patch: add pading to solve false sharing problem
Attachment: Bug#111594.patch (application/octet-stream, text), 1.42 KiB.
[28 Aug 2023 2:39]
changyan liu
(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
[12 Sep 2023 3:39]
changyan liu
Hello, MySQL Verification Team I would like to know if there is any problem with my patch currently? The results of this PR https://github.com/mysql/mysql-server/pull/493 is based on mysql 8.1.0 version. If there are some problems of my patch, please contact me anytime. sincerely changyan
[18 Sep 2023 10:37]
MySQL Verification Team
Thank you Mr. liu, Your patch is fine and this bug report is now verified.
[18 Sep 2023 12:53]
Marc ALFF
Thanks for: - the original bug report, - the benchmarks, - the analysis, - identifying a root cause in innodb, - the patch contributed. Based on the patch, the false sharing issue found is related to the innodb storage engine. Changing the bug category from performance schema to innodb.
[20 Sep 2023 9:14]
changyan liu
Hello, MySQL Verification Team This is my first time to submit a patch to the mysql community. I would like to know if this patch could be accepted or if there is anything wrong with it. Looking forward to your reply. regards, Changyan
[7 Nov 2024 14:44]
Bin Wang
From the tests, it seems that low-concurrency performance has actually decreased. Low-concurrency performance is more important than high-concurrency performance. It's surprising that this line has such a significant impact on low-concurrency performance.
[7 Nov 2024 14:46]
Bin Wang
We can't just look at the data—do you have the perf analysis from that time? A decrease in low-concurrency performance alongside an increase in high-concurrency performance isn't ideal and could very likely be affected by other bottlenecks.
[7 Nov 2024 14:48]
MySQL Verification Team
Thank you, Mr. Wang, Your thoughts will be forwarded to our internal bug record.