Bug #111538 | MySQL 8.0.33 is slower than MySQL 8.0.28 with queries using JOINS | ||
---|---|---|---|
Submitted: | 23 Jun 2023 0:17 | Modified: | 24 Jul 22:10 |
Reporter: | Marcos Albe (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S5 (Performance) |
Version: | 8.0.33 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | performance degradation |
[23 Jun 2023 0:17]
Marcos Albe
[23 Jun 2023 0:50]
Marcos Albe
I actually tested with much simpler query, and still visible: 8.0.28 +----------+------------+--------------------------------------------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+--------------------------------------------------------------------------------------------+ | 21 | 0.67210475 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 22 | 0.73221925 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 23 | 0.64774975 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 24 | 0.67075575 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 25 | 0.67332025 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 26 | 0.66537300 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 27 | 0.67452900 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 28 | 0.72930675 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 29 | 0.71833350 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 30 | 0.65857525 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 31 | 0.65274575 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 32 | 0.69768975 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 33 | 0.69682700 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 34 | 0.72477825 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 35 | 0.64526125 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | +----------+------------+--------------------------------------------------------------------------------------------+ +----------------------------+--------+ | Variable_name | Value | +----------------------------+--------+ | Handler_read_key | 105218 | | Handler_read_next | 999999 | | Handler_read_rnd_next | 105217 | +----------------------------+--------+ +----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+ | 1 | SIMPLE | e | NULL | ALL | NULL | NULL | NULL | NULL | 299069 | 100.00 | NULL | | 1 | SIMPLE | s | NULL | ref | PRIMARY | PRIMARY | 4 | employees.e.emp_no | 9 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+ 8.0.33 +----------+------------+--------------------------------------------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+--------------------------------------------------------------------------------------------+ | 21 | 0.85898000 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 22 | 0.79197600 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 23 | 0.83340700 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 24 | 0.77812325 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 25 | 0.77719400 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 26 | 0.81970475 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 27 | 0.76522925 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 28 | 0.79419200 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 29 | 0.80916500 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 30 | 0.82678150 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 31 | 0.86895875 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 32 | 0.83041300 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 33 | 0.80798750 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 34 | 0.81469850 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | | 35 | 0.79329575 | SELECT * FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 | +----------+------------+--------------------------------------------------------------------------------------------+ +----------------------------+--------+ | Variable_name | Value | +----------------------------+--------+ | Handler_read_key | 105218 | | Handler_read_next | 999999 | | Handler_read_rnd_next | 105217 | +----------------------------+--------+ +----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+ | 1 | SIMPLE | e | NULL | ALL | NULL | NULL | NULL | NULL | 292025 | 100.00 | NULL | | 1 | SIMPLE | s | NULL | ref | PRIMARY | PRIMARY | 4 | employees.e.emp_no | 9 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+ So, same explain, same number of rows read but 8.0.28 averaged 0.6839712833 while 8.0.33 average was 0.8113404167, so again 18% more
[23 Jun 2023 0:53]
Marcos Albe
Run on employees database
Attachment: repro.sql (application/octet-stream, text), 1.74 KiB.
[23 Jun 2023 1:05]
Marcos Albe
Flamegraph for 8.0.28
Attachment: 8.0.28.markus.svg (image/svg+xml, text), 59.48 KiB.
[23 Jun 2023 1:05]
Marcos Albe
Flamegraph for 8.0.33
Attachment: 8.0.33.markus.svg (image/svg+xml, text), 65.53 KiB.
[23 Jun 2023 9:12]
MySQL Verification Team
Hello Marcos, Thank you for the report and feedback. Thanks, Umesh
[23 Jun 2023 11:44]
Marcos Albe
Seems is not related to joins after all, as single-table queries also see like 20% drop in performance: 8.0.28: mysql [localhost:8028] {msandbox} (employees) > select sum(salary) from salaries; +--------------+ | sum(salary) | +--------------+ | 181480757419 | +--------------+ 1 row in set (0.43 sec) mysql [localhost:8028] {msandbox} (employees) > select sum(salary) from salaries; +--------------+ | sum(salary) | +--------------+ | 181480757419 | +--------------+ 1 row in set (0.43 sec) mysql [localhost:8028] {msandbox} (employees) > select sum(salary) from salaries; +--------------+ | sum(salary) | +--------------+ | 181480757419 | +--------------+ 1 row in set (0.43 sec) .vs. 8.0.33: mysql [localhost:8033] {msandbox} (employees) > select sum(salary) from salaries; +--------------+ | sum(salary) | +--------------+ | 181480757419 | +--------------+ 1 row in set (0.53 sec) mysql [localhost:8033] {msandbox} (employees) > select sum(salary) from salaries; +--------------+ | sum(salary) | +--------------+ | 181480757419 | +--------------+ 1 row in set (0.52 sec) mysql [localhost:8033] {msandbox} (employees) > select sum(salary) from salaries; +--------------+ | sum(salary) | +--------------+ | 181480757419 | +--------------+ 1 row in set (0.52 sec)
[21 Nov 2023 10:34]
Satya Bodapati
I debugged this issue as part of https://jira.percona.com/browse/PS-8822 I found that the regression causing commit is https://github.com/mysql/mysql-server/commit/b11a175924194d574238f42068f09b15924ae2f8 Bug #16739204 IMPROVE THE INNODB HASH FUNCTION Bug #23584861 INNODB ADAPTIVE HASH INDEX USES A BAD PARTITIONING ALGORITHM FOR THE REAL WORLD The InnoDB hash and random generator functions were improved in most important properties. The hash methods are all changed to work on 64bit values instead of platform dependent type. Some were based on ideas from Tomek Czajka's blog https://sortingsearching.com/2020/05/21/hashing.html The old `ut_hash_ulint` was calculating hash using only one xor with a 32bit value, so the hashes of values of multiple of 2^n had all the lowest bits always the same. It is replaced with a method that calculates 64bit out of 64bit value. The Tabulation Hashing is used as the main hashing algorithm. It has good random distribution of common input sequences, a property that is guarded by a new unit tests. The `ut_rnd_gen_ulint` and its helper `ut_rnd_gen_next_ulint` were providing numbers using 32bit constants and many, many operations. This generator provided random values in a cycle of just 85000. It was replaced by a simple function that provides fully valuable 64bits of hash with a new method to hash 64bit integers, resulting in 2^64 long cycle. Hashing of two 64bit integers was improved in distribution for common sequences like <i,i>, <i,constant>, <constant, i>. Property guarded by a new unit test. ut_fold_binary was renamed to `ut::hash_binary_ib` and used only for InnoDB's page checksum calculation for a non-default checksum algorithm. All other usages switched to a new method `ut::hash_binary` which bases on method for hashing two 64bit integers. For buffers longer than 15 bytes it switches to calculate hash using crc32 (and doing some more hashing to get 64bit result), which after recent improvements by Jakub Lopuszanski using native CPU instructions and execution pipeline parallelism to achieve very high speeds (multiple bytes per core per each single cycle). Overall the new method is much faster than old one. The hash index calculation from fold value is done using a new utility class `fast_modulo_t` designed and implemented by Jakub. It does more costly precalculations once, and then allow fast modulo operations without using very slow integer division CPU instructions. It's a little more than 3 times faster and calculating it with division and only 2 times slower than GCC's compile-time optimized modulo by a constant. It also has a wrapper that allows calculations while the modulo value is changed concurrently using lock-free `Seq_lock` also implemented by Jakub. A new `ut0math.h` is added with strictly math-related functions. The old `ut_find_prime` is extracted to it, and two new `divide_128` and `multiply_uint64` added. It also adds a new utility class `fast_modulo_t` and `mt_fast_modulo_t`. The `multiply_uint64` which calculated 128bit result of 64bit multiplication is executed very frequently, so it has intrinsic methods for x86 implemented. Additional or more detailed changes: hash_create and hash_table_free were deleted. The hash initialization was moved to `hash_table_t` constructor. The hash destruction is done automatically using default destructor that clears `cells` array as it is `unique_ptr` now. `hast_table_t` get new field `n_cells_fast_modulo` to allow fast hash index from fold calculation. `btr_get_search_slot` is added to be responsible for the AHI part index calculation using global fast modulo structure `btr_ahi_parts_fast_modulo`. `ut_rnd_interval` did return results excluding the value of `high`, contrary to the documentation. New unit tests were added to test many new or modified hashing and random generating methods, assuring long generator cycle and good random distributions of hashes. RB#27454 Change-Id: Icbc62d9e2ca44a702c18ea66e2ca048d9240494a
[21 Nov 2023 13:08]
MySQL Verification Team
This reminds me of the internal bug I filed which directly related to the hashing algorithm by adaptive hash index. Bug 35449386 - particular query performs much worse in 8.0 than 5.7 which has got a patch coming in 8.0.36. So please wait and check if that solves it.
[21 Nov 2023 13:20]
Satya Bodapati
Thanks for the reply. I see that non AHI paths are also affected. This is because of the hashing algorithm used by buffer pool is also changed as part of the above mentioned commit.
[24 Nov 2023 21:24]
Zongzhi Chen
I thought and I can confirm that this regression cause by the Feature Instant INSTANT ADD and DROP Column(s). This commit turns some hot patch function from inline to non-inline. And add some function that is not inline, even the function is define as inline. I have change the code in 8.2.0 version, and it get almost the some performance as 8.0.27. since I just want to confirm the regression cause by inline function, this is just draft code. You guys can test it. let me explain the code that I changed The first: I change the code in 8.2.0 which is non-inline but is inline in 8.0.27 to inline, actually I just copy the code from 8.0.27. 8.0.27 rec_get_nth_field => inline rec_get_nth_field_offs => inline rec_init_offsets_comp_ordinary => inline rec_offs_nth_extern => inline 8.2.0 rec_get_nth_field => non-inline rec_get_nth_field_offs => non-inline rec_init_offsets_comp_ordinary => non-inline rec_offs_nth_extern => non-inline The second: Ignore that overhead by function get_rec_insert_state which is define as inline, but actually is not inline. get_rec_insert_state which is define inline, actually this function is not inline. in Marcos Albe 8.0.33.markus.svg, we can fint this function is still in the flame diagram, since if function is actully inline, the function won't exist in program's symbol table, then it won't exist in the flame diagram, I guess this is cause by adding too many if/else branch in the function. Then even this function is define as inline, However, the compiler thinks this function has too complex so it won't make it as inline. so in rec_init_null_and_len_comp() it will not call get_rec_insert_state() function. Then the overhead cause by get_rec_insert_state will disappear. The third: Another fix is change the switch case in function rec_init_offsets_comp_ordinary to if/else. And using likely to tell the compiler which branch is more likely to hit.
[24 Nov 2023 21:25]
Zongzhi Chen
baotiao.diff
Attachment: baotiao.diff (application/octet-stream, text), 21.02 KiB.
[24 Nov 2023 21:31]
Satya Bodapati
I did see some improvements in 8.0.30 but not so much in 8.0.33 but I believe this adds another 3-4% drop in performance. I have a patch that converted the entire rem0wrec.cc to rem0wrec.ic. May be lets create a separate bug for this. There are two reasons why the performance went down. 1. inline to non-inline of rec_* functions 2. costly hashing algorithm that affects both AHI & non-AHI (buffer pool hash)
[24 Nov 2023 21:36]
Satya Bodapati
"get_rec_insert_state which is define inline, actually this function is not inline. in Marcos Albe 8.0.33.markus.svg, we can fint this function is still in the flame diagram, since if function is actully inline, the function won't exist in the program's symbol table, then it won't exist in the flame diagram, I guess this is cause by adding too many if/else branch in the function. Then even this function is define as inline, However, the compiler thinks this function has too complex so it won't make it as inline." this is right, we can see the inline decisions by using to compiler options. -fdump-ipa-inline. The max stack size reaches for get_rec_insert_state(). and compiler decides to not inline.
[16 Jan 3:57]
Ramakrishnan Kamalakannan
Posted by developer: Hi all, I notice that the dataset size is <1GB. Can you please reproduce this with a bigger dataset (at-least 10GB)? A smaller dataset may give spurious results as it plays more with CPU cache than RAM or storage. I too shall try with a larger dataset
[21 Jan 19:43]
Zongzhi Chen
Hello Ramakrishnan I do not use the same test case with Marcos, I use my own test case which data size is larger than 10G. the query that need back to the primary index can reproduce the bottleneck easily.
[18 Mar 9:30]
Aaditya Dubey
Hi Ramakrishnan, Could you please update if this issue is fixed by MySQL 8.0.36?
[20 Mar 3:36]
Ramakrishnan Kamalakannan
Posted by developer: Hi, this is not fixed in 8.0.36. I am still working on the fix. Sorry about the delay
[4 Jun 13:54]
ANIL JOSHI
Hi, I see some fixes done with respect to the hash algorithm in [8.0.36] "InnoDB: The hash function used by the adaptive hash index (AHI) was improved to increase performance. (Bug #35449386)" Is that have some linkage with the below-running concerns? 1)inline to non-inline bug 2)performance degradation of the Hash Algorithm If that patch/fix not related with this ongoing performance issue , then can you pleas specify the timelines when this going to be fixed ?
[5 Jun 14:42]
ANIL JOSHI
Well, I performed some tests with older and new [8.0.37], and it seems that performance improves a bit in 8.0.37. Is by any chance that bug/issue fixed in this latest release[8.0.37]? Although, in bug fix documentation I didn't see any mentioning about that. I am adding my test results in a file shortly...
[5 Jun 14:43]
ANIL JOSHI
MySQL 8 performance test in different versions
Attachment: MySQL Benchmark test (application/octet-stream, text), 7.75 KiB.
[6 Jun 10:10]
Aaditya Dubey
Hi Ramakrishnan, Could you please update us on when we will receive the fix? Any ETA would be much appreciated.
[12 Jun 4:33]
Ramakrishnan Kamalakannan
Posted by developer: Hi, The costly hash function is fixed in version 8.0.36. The inline functions fix is under investigation. I've found that `rec_init_offsets_comp_ordinary` is hot on CPU and I'm trying to improve this function. Meanwhile, the patch for inline functions is in review.
[3 Jul 8:35]
Aaditya Dubey
Hi Ramakrishnan, Thank you for the updates. We will wait for more updates on the fix.
[24 Jul 22:10]
Jon Stephens
There are actually 2 fixes made here, one in MySQL 8.0.36, and one in MySQL 8.0.40/8.4.3/9.1.0. *1st part*: Documented in the MySQL 8.0.36 changelog as follows: The hashing function used by the InnoDB adaptive hash index became more expensive as an unintended result of refactoring done in MySQL 8.0.30. *2nd part*: Documented in the MySQL 8.0.40, 8.4.3, and 9.1.0 changelogs as follows: Several functions internal to InnoDB which were defined as inline in MySQL 8.0.28 were found to be no longer inline in MySQL 8.0.33, due in part to refactoring which accompanied improvements made in MySQL 8.0.30 to improve the InnoDB adaptive hash index. This had an adverse effect on queries using joins on InnoDB tables. Closed.
[25 Jul 7:25]
zongyi chen
Hi, Jon Stephens.Is part2 merge? I do not see in branch 8.0.40 yet.