Bug #106712 | innodb adaptive hash index slows single query/OLAP workload | ||
---|---|---|---|
Submitted: | 12 Mar 2022 5:50 | Modified: | 16 Mar 2022 21:25 |
Reporter: | Justin Swanhart | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S4 (Feature request) |
Version: | 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[12 Mar 2022 5:50]
Justin Swanhart
[12 Mar 2022 5:52]
Justin Swanhart
the entire schema is available in the other bug, I just gave the lineorder table here so that you could see the indexes I have defined. as long as one secondary index is on the table, this problem will happen (hash join is used otherwise)
[12 Mar 2022 6:06]
Justin Swanhart
note: the query has to run for quite some time to fill up the buffer pool but once the buffer pool is full the slowdown from the AHI is most pronounced.
[14 Mar 2022 14:44]
MySQL Verification Team
Hi Mr. Swanhart, Thank you for your bug report. However, we need additional info. Can you please let us know how much of the buffer pool is occupied with AHI cache, once when you start experiencing slowdown. Next, have you tried limiting AHI to certain amount ??? Does it change performance in any way ??? It is very well known phenomena that AHI can take a large part of the buffer pool, thus inducing a slowdown. There are a number of configuration variable for AHI. Have you tried all of them with different combinations ???? We would like to hear from you in that respect. Needless to say, you can completely turn the AHI off, if you wish so. So far, it does look like expected behaviour, but we would like to learn more. Waiting on your feedback.
[14 Mar 2022 14:57]
MySQL Verification Team
Hi, Only a small additional help on the AHI. You can get a total memory usage of AHI with : SHOW ENGINE INNODB STATUS\G
[14 Mar 2022 19:00]
Justin Swanhart
I have not tuned the AHI but this is a single threaded complex query that is using nested loops against a table who's index does not fit into the AHI. This is the show engine innodb status with the AHI on: ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 0, seg size 2, 0 merges merged operations: insert 0, delete mark 0, delete 0 discarded operations: insert 0, delete mark 0, delete 0 Hash table size 5312557, node heap has 143110 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 1269 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) 46537.72 hash searches/s, 14187.34 non-hash searches/s --- LOG --- Log sequence number 85522901653 Log buffer assigned up to 85522901653 Log buffer completed up to 85522901653 Log written up to 85522901653 Log flushed up to 85522901653 Added dirty pages up to 85522901653 Pages flushed up to 85522901653 Last checkpoint at 85522901653 424953 log i/o's done, 0.00 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total large memory allocated 0 Dictionary memory allocated 481221 Buffer pool size 1310605 Free buffers 0 Database pages 1166226 Old database pages 430337 Modified db pages 0 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 2393116, not young 466599 2165.98 youngs/s, 17.77 non-youngs/s Pages read 1146867, created 298982, written 672155 4.48 reads/s, 0.00 creates/s, 0.00 writes/s Buffer pool hit rate 1000 / 1000, young-making rate 15 / 1000 not 0 / 1000 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s LRU len: 1166226, unzip_LRU len: 0 I/O sum[2288]:cur[16], unzip sum[0]:cur[0] -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue 1 read views open inside InnoDB Process ID=22755, Main thread ID=140280157083392 , state=sleeping Number of rows inserted 6235771, updated 0, deleted 0, read 529683363 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 60723.79 reads/s Number of system rows inserted 51, updated 318, deleted 52, read 5411 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ As you can see, almost all the activity is in a single partition, which makes sense because the index is in a single table. If the adaptive hash index could use more of the AHI than a single partition, performance might be improved. Then with the AHI disabled, the server is serving up 10x rows /sec ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 0, seg size 2, 0 merges merged operations: insert 0, delete mark 0, delete 0 discarded operations: insert 0, delete mark 0, delete 0 Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) Hash table size 5312557, node heap has 0 buffer(s) 0.00 hash searches/s, 579205.86 non-hash searches/s I think the AHI is working "as designed" but it is not designed for these kind of queries. So I guess this is a FR for a better performing AHI when there is one busy index in a query.
[15 Mar 2022 0:23]
Justin Swanhart
update synopsis
[15 Mar 2022 0:24]
Justin Swanhart
note: query time w/out adaptive hash index: 600s query time w/ adaptive hash index: 2200s
[15 Mar 2022 0:31]
Justin Swanhart
To summarize: In a single query, typically the amount of the AHI that can be effectively used is basically just 1/N where N is the number of partitions of the AHI because tables are hashed into the AHI by table_id. Therefore the AHI becomes very hot for this query which is going from dim_date->lineorder. The LO_OrderDateKey is hot in the AHI, it can't fit so it does constant evictions which are expensive compared to non-hash lookups. If I drop the indexes on the lineorder table, the query runs in 600 seconds because lookups are all done into the dimension tables, which fit into the AHI quite nicely.
[15 Mar 2022 13:18]
MySQL Verification Team
Hi, It is obvious from the output that your server requires tuning , especially with AHI. Please, try this and then let us know the results.
[15 Mar 2022 16:32]
Justin Swanhart
I am using --innodb-dedicated-server=ON Can you suggest some tuning parameters for the AHI for this workload? I can try NOT partitioning it for example.
[15 Mar 2022 16:43]
Justin Swanhart
With one adaptive hash index partition (previous was 8): Hash table size 42499631, node heap has 4 buffer(s) with adpative_hash_index=ON -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue 1 read views open inside InnoDB Process ID=30413, Main thread ID=139931859699456 , state=sleeping Number of rows inserted 0, updated 0, deleted 0, read 1710338 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 46867.69 reads/s Number of system rows inserted 0, updated 317, deleted 0, read 4967 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s disabling the adaptive hash index shows a good perf increase after 90 seconds: -------------- ROW OPERATIONS -------------- 0 queries inside InnoDB, 0 queries in queue 1 read views open inside InnoDB Process ID=30413, Main thread ID=139931859699456 , state=sleeping Number of rows inserted 0, updated 0, deleted 0, read 23748698 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 275709.32 reads/s Number of system rows inserted 0, updated 317, deleted 0, read 4967 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s will post result with 64 partitions in few minutes
[15 Mar 2022 16:48]
Justin Swanhart
Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 1 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 4 buffer(s) Hash table size 1328213, node heap has 461 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 1266 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) Hash table size 1328213, node heap has 0 buffer(s) 19683.11 hash searches/s, 20524.53 non-hash searches/s 0 queries inside InnoDB, 0 queries in queue 1 read views open inside InnoDB Process ID=30562, Main thread ID=139775137679104 , state=sleeping Number of rows inserted 0, updated 0, deleted 0, read 1245723 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 40183.32 reads/s Number of system rows inserted 0, updated 317, deleted 0, read 4960 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 8.97 reads/s after 90 seconds after disabling AHI, performance is increased: 0.00 hash searches/s, 333207.75 non-hash searches/s I am unaware of any tuning options other than the number of partitions and as I explained, one partition will always be "hot" for this query and will do constant evictions from that partition! Due to the hashing of partition selection by table_id TWO partitions are all that will ever be used, no matter how many partitions I select. Note that only two partitions are used with 64 partitions and thus the available AHI space is even worse than the 8 or 1 partition cases.
[16 Mar 2022 13:41]
MySQL Verification Team
Hi Justin, What you report can be treated as a feature request. We have the idea about limiting the memory or the percentage of the buffer pool that can be used for AHI. Do you want us to verify your report as this feature request or do you have any other idea ?????
[16 Mar 2022 15:44]
Justin Swanhart
I think if possible when considering a row for hashing into the buffer pool, determine if the index is larger than memory, and don't hash it. That is my only idea at this point except changing the adaptive hash index so that it doesn't choose a partition based on index_id.
[16 Mar 2022 17:39]
MySQL Verification Team
Hi Mr. Swanhart, Your report is now verified as a feature request. Both of your ideas will be analysed. Verified.
[16 Mar 2022 20:58]
Marcin Babij
Hello Justin. Thank you for the report. You're right that the records are hashed by table ID (index_id more specifically), so they end up in one or two buckets. Your case where there is a lot of LRU eviction, causing a lot of pages going out of the AHI. This is in turn executed while the AHI part is X-latched, so any other operations on AHI, including searches, are blocked. You've shared steps to reproduce, we may try to try these, if time allows. It will be a very good benchmark for future improvements. Currently we are working on ideas from https://bugs.mysql.com/bug.php?id=100512 Bug #100512 adaptive hash index(AHI) building causing contention on btr_search_latches which in case like your, will allow searches to continue without AHI, if AHI part is X-latched, and even adding new pages to AHI will not block if AHI is X-latched, which will reduce number of pages in AHI and reduce average pressure on evictions. We have even some ideas how the hashmaps could be changed or moved to lock-free variants to further reduce need for rwlocking or better distribution between parts, but that's far future, especially given that the AHI is not a critical feature. Given it's nature and current implementation, I can only advice you to turn off AHI in case there are any problems caused by it, this can be done even dynamically during server runtime. And please leave this report open as "feature request" for future reference.
[16 Mar 2022 21:25]
Justin Swanhart
Thanks for the info. FWIW I applied the patch from the other bug and it doesn't help the situation here very much. It gave me an extra 30K rows/sec (~100K/sec total) with the AHI turned on, but with the AHI turned off the server was doing 600K rows/sec.