| Bug #99933 | In-memory hash join will only use two-thirds of join buffer | ||
|---|---|---|---|
| Submitted: | 19 Jun 2020 1:30 | Modified: | 11 Nov 2020 7:02 |
| Reporter: | Øystein Grøvlen | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 8.0 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[19 Jun 2020 12:16]
MySQL Verification Team
Hi Øystein Grøvlen, Thank you for your bug report. I have analysed the code that you write about , with a very high scrutiny. Both in 8.0.20 and in trunk. Your analysis is, simply, correct. Verified as reported.
[4 Sep 2020 19:06]
Jon Stephens
Fixed in MySQL 8.0.23 as part of WL#13459 "Optimize hash table in hash join". See same for docs info. Closed.
[21 Sep 2020 12:55]
MySQL Verification Team
Thank you, Jon.
[11 Nov 2020 7:57]
Jon Stephens
NB: WL#13459 has now been pushed to mysql-8.0 and an entry for it added to the MySQL 8.0 changelog under the heading "Optimizer Notes". See also "What Is New In MySQL 8.0".
[11 Nov 2020 8:38]
Jon Stephens
Previous comment should say "8.0.23 changelog".

Description: This issue was discovered by code inspection: After inserting a new row into the join buffer, HashJoinRowBuffer::StoreRow() does this check: if (m_mem_root.allocated_size() > m_max_mem_available) { return StoreRowResult::BUFFER_FULL; } (see bottom of sql/hash_join_buffer.cc) The problem is that MEM_ROOT::allocated_size() returns the amount memory allocated to the MEM_ROOT, not the amount of memory that is currently in use. This means that when a row insert triggers the final allocation to the MEM_ROOT, one will switch from in-memory to hybrid hash join. (See handling of StoreRowResult::BUFFER_FULL in HashJoinIterator::BuildHashTable). Since the allocation size is increased by 50% each time more memory is added to the MEM_ROOT, the last allocation will be approx. 1/3 of the size of the join buffer. How to repeat: It is difficult to observe this issue since it is not straight-forward to know how much memory will actually be needed for each row of the hash table. Suggested fix: Set MEM_ROOT:m_max_capacity to the maximum size of the join buffer. If one tries to allocate more memory than allowed, allocations will then either fail with error (if m_error_for_capacity_exceeded is set) or a null pointer will be returned. This scenario will then need to be handled all places where there is an allocation from join buffer MEM_ROOT.