Bug #111538 MySQL 8.0.33 is slower than MySQL 8.0.28 with queries using JOINS
Submitted: 23 Jun 2023 0:17 Modified: 24 Jul 2024 22:10
Reporter: Marcos Albe (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.33 OS:Any
Assigned to: CPU Architecture:Any
Tags: performance degradation

[23 Jun 2023 0:17] Marcos Albe
Description:
mysql [localhost:8028] {msandbox} (employees) > SELECT *       FROM employees AS e LEFT JOIN salaries  AS s        ON e.emp_no = s.emp_no LEFT JOIN dept_manager AS dm        ON e.emp_no = dm.emp_no LEFT JOIN titles AS t        ON e.emp_no = t.emp_no      JOIN dept_emp AS de        ON de.emp_no = e.emp_no      JOIN departments AS d        ON d.dept_no = de.dept_no WHERE t.title='Senior Engineer' OR de.dept_no='d002' OR gender='M' LIMIT 300000;
81fb21c9ef43a5820f74741694b5d0f9  -
300000 rows in set (1.00 sec)

further runs:
+----------+------------+-------------------------------------------------------------+
| Query_ID | Duration   | Query                                                       |
+----------+------------+-------------------------------------------------------------+
|       25 | 0.99907400 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       26 | 1.02636325 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       27 | 1.00469100 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       28 | 1.00176800 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       29 | 0.99846125 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       30 | 0.99761500 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       31 | 0.99803350 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       32 | 0.99647050 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       33 | 0.99251800 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       34 | 1.00336950 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       35 | 1.00860275 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       36 | 1.00331600 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       37 | 0.99613800 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       38 | 0.99846850 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       39 | 1.00184000 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
+----------+------------+-------------------------------------------------------------+
 

Running the exact same thing on 8.0.32 yields this other profile:
+----------+------------+-------------------------------------------------------------+
| Query_ID | Duration   | Query                                                       |
+----------+------------+-------------------------------------------------------------+
|       15 | 1.06721400 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       16 | 1.06410475 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       17 | 1.06400100 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       18 | 1.06553250 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       19 | 1.07420650 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       20 | 1.08381275 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       21 | 1.06726650 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       22 | 1.06335300 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       23 | 1.06272425 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       24 | 1.07936525 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       25 | 1.06796425 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       26 | 1.06650575 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       27 | 1.07216800 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       28 | 1.06543000 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
|       29 | 1.06800475 | SELECT * FROM employees AS e LEFT JOIN salaries  AS s   ... |
+----------+------------+-------------------------------------------------------------+

So on 8.0.32 we can see an instant 6% - 8% regression. And it's even worse in 8.0.33, where we see 17% - 18% regression:
+----------+------------+-------------------------------------------------------------+
| Query_ID | Duration   | Query                                                       |
+----------+------------+-------------------------------------------------------------+
|        1 | 1.27622225 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        2 | 1.17281275 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        3 | 1.17511200 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        4 | 1.17141100 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        5 | 1.18441375 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        6 | 1.17105550 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        7 | 1.16941600 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        8 | 1.17797625 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|        9 | 1.17390800 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|       10 | 1.16769900 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|       11 | 1.17499600 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|       12 | 1.18346875 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|       13 | 1.16873475 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
|       14 | 1.17847925 | SELECT *  FROM employees AS e LEFT JOIN salaries  AS s  ... |
+----------+------------+-------------------------------------------------------------+

How to repeat:
dbdeployer deploy single 8.0.28;
dbdeployer deploy single 8.0.32;
dbdeployer deploy single 8.0.33;

dbdeployer data-load get employees msb_8_0_28;
dbdeployer data-load get employees msb_8_0_32;
dbdeployer data-load get employees msb_8_0_33;

Then run this on each sandbox:

SET profiling=1;
PAGER md5sum;
SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LEFT JOIN dept_manager AS dm ON e.emp_no = dm.emp_no LEFT JOIN titles AS t ON e.emp_no = t.emp_no JOIN dept_emp AS de ON de.emp_no = e.emp_no JOIN departments AS d ON d.dept_no = de.dept_no WHERE t.title='Senior Engineer' OR de.dept_no='d002' OR gender='M' LIMIT 300000;
-- repeat query 10 - 20 times 
NOPAGER;
SHOW PROFILES;
SET profiling=0;

Suggested fix:
No clue; This looks closely related to previous one which Satya filed: https://bugs.mysql.com/bug.php?id=109361  But it's not exactly the same.
[23 Jun 2023 0:50] Marcos Albe
I actually tested with much simpler query, and still visible:

8.0.28
+----------+------------+--------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                      |
+----------+------------+--------------------------------------------------------------------------------------------+
|       21 | 0.67210475 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       22 | 0.73221925 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       23 | 0.64774975 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       24 | 0.67075575 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       25 | 0.67332025 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       26 | 0.66537300 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       27 | 0.67452900 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       28 | 0.72930675 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       29 | 0.71833350 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       30 | 0.65857525 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       31 | 0.65274575 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       32 | 0.69768975 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       33 | 0.69682700 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       34 | 0.72477825 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       35 | 0.64526125 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
+----------+------------+--------------------------------------------------------------------------------------------+

+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_read_key           | 105218 |
| Handler_read_next          | 999999 |
| Handler_read_rnd_next      | 105217 |
+----------------------------+--------+

+----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref                | rows   | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+
|  1 | SIMPLE      | e     | NULL       | ALL  | NULL          | NULL    | NULL    | NULL               | 299069 |   100.00 | NULL  |
|  1 | SIMPLE      | s     | NULL       | ref  | PRIMARY       | PRIMARY | 4       | employees.e.emp_no |      9 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+

8.0.33
+----------+------------+--------------------------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                                      |
+----------+------------+--------------------------------------------------------------------------------------------+
|       21 | 0.85898000 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       22 | 0.79197600 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       23 | 0.83340700 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       24 | 0.77812325 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       25 | 0.77719400 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       26 | 0.81970475 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       27 | 0.76522925 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       28 | 0.79419200 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       29 | 0.80916500 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       30 | 0.82678150 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       31 | 0.86895875 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       32 | 0.83041300 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       33 | 0.80798750 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       34 | 0.81469850 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
|       35 | 0.79329575 | SELECT *  FROM employees AS e LEFT JOIN salaries AS s ON e.emp_no = s.emp_no LIMIT 1000000 |
+----------+------------+--------------------------------------------------------------------------------------------+

+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_read_key           | 105218 |
| Handler_read_next          | 999999 |
| Handler_read_rnd_next      | 105217 |
+----------------------------+--------+

+----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref                | rows   | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+
|  1 | SIMPLE      | e     | NULL       | ALL  | NULL          | NULL    | NULL    | NULL               | 292025 |   100.00 | NULL  |
|  1 | SIMPLE      | s     | NULL       | ref  | PRIMARY       | PRIMARY | 4       | employees.e.emp_no |      9 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+---------+---------+--------------------+--------+----------+-------+

So, same explain, same number of rows read but 8.0.28 averaged 0.6839712833 while 8.0.33 average was 0.8113404167, so again 18% more
[23 Jun 2023 0:53] Marcos Albe
Run on employees database

Attachment: repro.sql (application/octet-stream, text), 1.74 KiB.

[23 Jun 2023 1:05] Marcos Albe
Flamegraph for 8.0.28

Attachment: 8.0.28.markus.svg (image/svg+xml, text), 59.48 KiB.

[23 Jun 2023 1:05] Marcos Albe
Flamegraph for 8.0.33

Attachment: 8.0.33.markus.svg (image/svg+xml, text), 65.53 KiB.

[23 Jun 2023 9:12] MySQL Verification Team
Hello Marcos,

Thank you for the report and feedback.

Thanks,
Umesh
[23 Jun 2023 11:44] Marcos Albe
Seems is not related to joins after all, as single-table queries also see like 20% drop in performance:
8.0.28:

mysql [localhost:8028] {msandbox} (employees) > select sum(salary) from salaries;
+--------------+
| sum(salary)  |
+--------------+
| 181480757419 |
+--------------+
1 row in set (0.43 sec)

mysql [localhost:8028] {msandbox} (employees) > select sum(salary) from salaries;
+--------------+
| sum(salary)  |
+--------------+
| 181480757419 |
+--------------+
1 row in set (0.43 sec)

mysql [localhost:8028] {msandbox} (employees) > select sum(salary) from salaries;
+--------------+
| sum(salary)  |
+--------------+
| 181480757419 |
+--------------+
1 row in set (0.43 sec)

.vs.

8.0.33:
mysql [localhost:8033] {msandbox} (employees) > select sum(salary) from salaries;
+--------------+
| sum(salary)  |
+--------------+
| 181480757419 |
+--------------+
1 row in set (0.53 sec)

mysql [localhost:8033] {msandbox} (employees) > select sum(salary) from salaries;
+--------------+
| sum(salary)  |
+--------------+
| 181480757419 |
+--------------+
1 row in set (0.52 sec)

mysql [localhost:8033] {msandbox} (employees) > select sum(salary) from salaries;
+--------------+
| sum(salary)  |
+--------------+
| 181480757419 |
+--------------+
1 row in set (0.52 sec)
[21 Nov 2023 10:34] Satya Bodapati
I debugged this issue as part of https://jira.percona.com/browse/PS-8822

I found that the regression causing commit is https://github.com/mysql/mysql-server/commit/b11a175924194d574238f42068f09b15924ae2f8

      Bug #16739204 IMPROVE THE INNODB HASH FUNCTION
    

    Bug #23584861 	INNODB ADAPTIVE HASH INDEX USES A BAD PARTITIONING ALGORITHM FOR THE REAL WORLD

The InnoDB hash and random generator functions were improved in most important properties.
The hash methods are all changed to work on 64bit values instead of platform dependent type. Some were based on ideas from Tomek Czajka's blog https://sortingsearching.com/2020/05/21/hashing.html
The old `ut_hash_ulint` was calculating hash using only one xor with a 32bit value, so the hashes of values of multiple of 2^n had all the lowest bits always the same. It is replaced with a method that calculates 64bit out of 64bit value. The Tabulation Hashing is used as the main hashing algorithm. It has good random distribution of common input sequences, a property that is guarded by a new unit tests.
The `ut_rnd_gen_ulint` and its helper `ut_rnd_gen_next_ulint` were providing numbers using 32bit constants and many, many operations. This generator provided random values in a cycle of just 85000. It was replaced by a simple function that provides fully valuable 64bits of hash with a new method to hash 64bit integers, resulting in 2^64 long cycle.
Hashing of two 64bit integers was improved in distribution for common sequences like <i,i>, <i,constant>, <constant, i>. Property guarded by a new unit test.
ut_fold_binary was renamed to `ut::hash_binary_ib` and used only for InnoDB's page checksum calculation for a non-default checksum algorithm. All other usages switched to a new method `ut::hash_binary` which bases on method for hashing two 64bit integers. For buffers longer than 15 bytes it switches to calculate hash using crc32 (and doing some more hashing to get 64bit result), which after recent improvements by Jakub Lopuszanski using native CPU instructions and execution pipeline parallelism to achieve very high speeds (multiple bytes per core per each single cycle). Overall the new method is much faster than old one.
The hash index calculation from fold value is done using a new utility class `fast_modulo_t` designed and implemented by Jakub. It does more costly precalculations once, and then allow fast modulo operations without using very slow integer division CPU instructions. It's a little more than 3 times faster and calculating it with division and only 2 times slower than GCC's compile-time optimized modulo by a constant. It also has a wrapper that allows calculations while the modulo value is changed concurrently using lock-free `Seq_lock` also implemented by Jakub.
A new `ut0math.h` is added with strictly math-related functions. The old `ut_find_prime` is extracted to it, and two new `divide_128` and `multiply_uint64` added. It also adds a new utility class `fast_modulo_t` and `mt_fast_modulo_t`.
The `multiply_uint64` which calculated 128bit result of 64bit multiplication is executed very frequently, so it has intrinsic methods for x86 implemented.

Additional or more detailed changes:
hash_create and hash_table_free were deleted. The hash initialization was moved to `hash_table_t` constructor. The hash destruction is done automatically using default destructor that clears `cells` array as it is `unique_ptr` now.
`hast_table_t` get new field `n_cells_fast_modulo` to allow fast hash index from fold calculation.
`btr_get_search_slot` is added to be responsible for the AHI part index calculation using global fast modulo structure `btr_ahi_parts_fast_modulo`.
`ut_rnd_interval` did return results excluding the value of `high`, contrary to the documentation.
New unit tests were added to test many new or modified hashing and random generating methods, assuring long generator cycle and good random distributions of hashes.

RB#27454

Change-Id: Icbc62d9e2ca44a702c18ea66e2ca048d9240494a
[21 Nov 2023 13:08] MySQL Verification Team
This reminds me of the internal bug I filed which directly related to the hashing algorithm by adaptive hash index.

Bug 35449386 - particular query performs much worse in 8.0 than 5.7 

which has got a patch coming in 8.0.36.

So please wait and check if that solves it.
[21 Nov 2023 13:20] Satya Bodapati
Thanks for the reply.

I see that non AHI paths are also affected. This is because of the hashing algorithm  used by buffer pool is also changed as part of the above mentioned commit.
[24 Nov 2023 21:24] Zongzhi Chen
I thought and I can confirm that this regression cause by the Feature Instant INSTANT ADD and DROP Column(s).

This commit turns  some hot patch function from inline to non-inline. And add some function that is not inline, even the function is define as inline.

I have change the code in 8.2.0 version, and it get almost the some performance as 8.0.27.

since I just want to confirm the regression cause by inline function, this is just draft code. You guys can test it.

let me explain the code that I changed

The first:

I change the code in 8.2.0 which is non-inline but is inline in 8.0.27 to inline, actually I just copy the code from 8.0.27.

8.0.27

rec_get_nth_field => inline

rec_get_nth_field_offs => inline

rec_init_offsets_comp_ordinary => inline

rec_offs_nth_extern => inline

8.2.0

rec_get_nth_field  => non-inline

rec_get_nth_field_offs  => non-inline

rec_init_offsets_comp_ordinary => non-inline

rec_offs_nth_extern => non-inline

The second:

Ignore that overhead by function get_rec_insert_state which is define as inline, but actually is not inline.

get_rec_insert_state  which is define inline, actually this function is not inline. in Marcos Albe 8.0.33.markus.svg, we can fint this function is still in the flame diagram, since if function is actully inline, the function won't exist in program's symbol table, then it won't exist in the flame diagram, I guess this is cause by adding too many if/else branch in the function. Then even this function is define as inline, However, the compiler thinks this function has too complex so it won't make it as inline.

so in rec_init_null_and_len_comp() it will not call get_rec_insert_state() function. Then the overhead cause by get_rec_insert_state  will disappear.

The third:

Another fix is change the switch case in function rec_init_offsets_comp_ordinary to if/else. And using likely to tell the compiler which branch is more likely to hit.
[24 Nov 2023 21:25] Zongzhi Chen
baotiao.diff

Attachment: baotiao.diff (application/octet-stream, text), 21.02 KiB.

[24 Nov 2023 21:31] Satya Bodapati
I did see some improvements in 8.0.30 but not so much in 8.0.33 but I believe this adds another 3-4% drop in performance.

I have a patch that converted the entire rem0wrec.cc to rem0wrec.ic. May be lets create a separate bug for this.

There are two reasons why the performance went down.
1. inline to non-inline of rec_* functions
2. costly hashing algorithm that affects both AHI & non-AHI (buffer pool hash)
[24 Nov 2023 21:36] Satya Bodapati
"get_rec_insert_state  which is define inline, actually this function is not inline. in Marcos Albe 8.0.33.markus.svg, we can fint this function is still in the flame diagram, since if function is actully inline, the function won't exist in the program's symbol table, then it won't exist in the flame diagram, I guess this is cause by adding too many if/else branch in the function. Then even this function is define as inline, However, the compiler thinks this function has too complex so it won't make it as inline."

this is right, we can see the inline decisions by using to compiler options. -fdump-ipa-inline. The max stack size reaches for get_rec_insert_state(). and compiler decides to not inline.
[16 Jan 2024 3:57] Ramakrishnan Kamalakannan
Posted by developer:
 
Hi all, I notice that the dataset size is <1GB.

Can you please reproduce this with a bigger dataset (at-least 10GB)? A smaller dataset may give spurious results as it plays more with CPU cache than RAM or storage. I too shall try with a larger dataset
[21 Jan 2024 19:43] Zongzhi Chen
Hello Ramakrishnan

I do not use the same test case with Marcos, I use my own test case which data size is larger than 10G.

the query that need back to the primary index can reproduce the bottleneck easily.
[18 Mar 2024 9:30] Aaditya Dubey
Hi Ramakrishnan,

Could you please update if this issue is fixed by MySQL 8.0.36?
[20 Mar 2024 3:36] Ramakrishnan Kamalakannan
Posted by developer:
 
Hi, this is not fixed in 8.0.36. I am still working on the fix. Sorry about the delay
[4 Jun 2024 13:54] ANIL JOSHI
Hi, I see some fixes done with respect to the hash algorithm in [8.0.36]

"InnoDB: The hash function used by the adaptive hash index (AHI) was improved to increase performance. (Bug #35449386)"

Is that have some linkage with the below-running concerns?  

1)inline to non-inline bug
2)performance degradation of the Hash Algorithm

If that patch/fix not related with this ongoing performance issue , then can you pleas specify the timelines when this going to be fixed ?
[5 Jun 2024 14:42] ANIL JOSHI
Well, I performed some tests with older and new [8.0.37], and it seems that performance improves a bit in 8.0.37. 

Is by any chance that bug/issue fixed in this latest release[8.0.37]? Although, in bug fix documentation I didn't see any mentioning about that.

I am adding my test results in a file shortly...
[5 Jun 2024 14:43] ANIL JOSHI
MySQL 8 performance test in different versions

Attachment: MySQL Benchmark test (application/octet-stream, text), 7.75 KiB.

[6 Jun 2024 10:10] Aaditya Dubey
Hi Ramakrishnan,

Could you please update us on when we will receive the fix? Any ETA would be much appreciated.
[12 Jun 2024 4:33] Ramakrishnan Kamalakannan
Posted by developer:
 
Hi,

The costly hash function is fixed in version 8.0.36.

The inline functions fix is under investigation. I've found that `rec_init_offsets_comp_ordinary` is hot on CPU and I'm trying to improve this function.

Meanwhile, the patch for inline functions is in review.
[3 Jul 2024 8:35] Aaditya Dubey
Hi Ramakrishnan,

Thank you for the updates. We will wait for more updates on the fix.
[24 Jul 2024 22:10] Jon Stephens
There are actually 2 fixes made here, one in MySQL 8.0.36, and one in MySQL 8.0.40/8.4.3/9.1.0.

*1st part*: Documented in the MySQL 8.0.36 changelog as follows:

    The hashing function used by the InnoDB adaptive hash index
    became more expensive as an unintended result of refactoring
    done in MySQL 8.0.30.

*2nd part*: Documented in the MySQL 8.0.40, 8.4.3, and 9.1.0 changelogs as follows:

    Several functions internal to InnoDB which were defined as inline 
    in MySQL 8.0.28 were found to be no longer inline in MySQL 8.0.33, 
    due in part to refactoring which accompanied improvements made in 
    MySQL 8.0.30 to improve the InnoDB adaptive hash index. This had 
    an adverse effect on queries using joins on InnoDB tables.

Closed.
[25 Jul 2024 7:25] zongyi chen
Hi, Jon Stephens.Is part2 merge? I do not see in branch 8.0.40 yet.
[21 Oct 2024 18:54] Mark Callaghan
This bug has been closed yet the regressions remain
https://smalldatum.blogspot.com/2024/10/mysql-8040-does-not-fix-regressions-i.html
[21 Oct 2024 19:17] Satya Bodapati
Hmm, I found the commit that inlined the rec_* functions.

https://github.com/mysql/mysql-server/commit/dfecee6d0802a5140de07753b1d5ac912a658ef7

Introduced in 8.0.40.

May be there is some more to the story.
[5 Nov 2024 5:31] Bin Wang
This is an interesting performance issue that has persisted for quite some time. Based on my testing, performance has still dropped significantly. Thanks for the discussion here—I'll look into this issue myself when I have some time.
[5 Nov 2024 6:13] Bin Wang
Based on our tests under identical conditions, MySQL 8.0.27 outperforms 8.0.40 by 15% with standard compilation. After applying PGO optimization to both, 8.0.27 still shows about 6% higher performance than 8.0.40.
[5 Nov 2024 6:44] Bin Wang
The conclusion above is based on a single SQL test result. Further testing shows that the results vary with different dataset sizes. The larger the data volume, the smaller the performance gap after applying PGO, with 8.0.40 even outperforming 8.0.27 in some cases.
[5 Nov 2024 6:48] Bin Wang
Test results for the PGO-optimized MySQL 8.0.27 version:

# Time: 2024-11-05T06:21:30.817988-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    14
# Query_time: 14.059849  Lock_time: 0.000424 Rows_sent: 5000000  Rows_examined: 6087915
SET timestamp=1730787676;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 5000000;
# Time: 2024-11-05T06:23:06.621859-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    15
# Query_time: 3.232393  Lock_time: 0.000442 Rows_sent: 1000000  Rows_examined: 6027876
SET timestamp=1730787783;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 1000000;
# Time: 2024-11-05T06:25:30.805732-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    17
# Query_time: 0.354701  Lock_time: 0.000439 Rows_sent: 100000  Rows_examined: 845869
SET timestamp=1730787930;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 100000;
# Time: 2024-11-05T06:35:52.084712-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    18
# Query_time: 27.478906  Lock_time: 0.000439 Rows_sent: 10000000  Rows_examined: 6171959
SET timestamp=1730788524;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 10000000;

Test results for the PGO-optimized MySQL 8.0.40 version:
# Time: 2024-11-05T06:20:40.957722-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    10
# Query_time: 13.167529  Lock_time: 0.000005 Rows_sent: 5000000  Rows_examined: 6087949
SET timestamp=1730787627;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 5000000;
# Time: 2024-11-05T06:22:55.078840-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    11
# Query_time: 3.281110  Lock_time: 0.000005 Rows_sent: 1000000  Rows_examined: 6027791
SET timestamp=1730787771;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 1000000;
# Time: 2024-11-05T06:25:23.946116-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    13
# Query_time: 0.373206  Lock_time: 0.000005 Rows_sent: 100000  Rows_examined: 845468
SET timestamp=1730787923;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 100000;
# Time: 2024-11-05T06:27:55.863055-00:00
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    14
# Query_time: 25.397629  Lock_time: 0.000005 Rows_sent: 10000000  Rows_examined: 6172168
SET timestamp=1730788050;
SELECT *  FROM bmsql_order_line AS e LEFT JOIN bmsql_stock AS s ON e.ol_i_id = s.s_i_id LIMIT 10000000;
[7 Nov 2024 14:58] Bin Wang
After a long period of exploration, we’ve nearly identified the root cause of the performance degradation—it’s much more complex than just an inline issue. If you're interested, you can test our open-source version here: https://github.com/advancedmysql/mysql-8.0.40 (install from the source code, as the release version isn’t ready yet). Check if there’s an improvement in join performance, though this fix will require extensive testing.
[7 Nov 2024 15:08] Bin Wang
I primarily tested with BenchmarkSQL, focusing on the join operations within the BenchmarkSQL database. The performance difference between versions 8.0.27 and 8.0.40 was about 12%, but after optimization, it has been reduced to around 1%.
[8 Nov 2024 15:09] Satya Bodapati
@BinWang,

Right, with inline, it could be around 3-4% max improvement.

It looks like you fixed the problem at a higher level and improved join performance.

https://github.com/advancedmysql/mysql-8.0.40/commit/d347cdb9ce8861003b3ae8b8a63277e858a8f...

I have not entirely looked into the patch. Are you fixing the cache of EQ_REF type? This reminds me of https://bugs.mysql.com/bug.php?id=109361

One concern is related to the new hashing algorithm. It is not optimized when the access is sequential. The new hashing algorithm is also used for buffer pool page hash.

But I think some more commits have tried to address this hashing algorithm issue.
[9 Nov 2024 3:28] Bin Wang
Overall, I identified four areas that could impact performance, and made simplifications without affecting the original logic, leading to incremental improvements. It's quite complex to explain in detail, but one optimization I disagree with from the official release caused a performance degradation in joins, so I reverted that optimization.
[9 Nov 2024 3:31] Bin Wang
The first step is crucial — identifying the problem. The most challenging part is the second step: optimizing without affecting the original logic. This requires extensive trial, which consumed a significant amount of time.
[9 Nov 2024 22:02] Mark Callaghan
The results from the fix in Advanced MySQL 8.0.40 are great.
https://smalldatum.blogspot.com/2024/11/fixing-some-of-innodb-scan-perf.html