Bug #99933 In-memory hash join will only use two-thirds of join buffer
Submitted: 19 Jun 2020 1:30 Modified: 11 Nov 2020 7:02
Reporter: Øystein Grøvlen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[19 Jun 2020 1:30] Øystein Grøvlen
Description:
This issue was discovered by code inspection:
After inserting a new row into the join buffer, HashJoinRowBuffer::StoreRow() does this check:

  if (m_mem_root.allocated_size() > m_max_mem_available) {
    return StoreRowResult::BUFFER_FULL;
  }

(see bottom of sql/hash_join_buffer.cc)

The problem is that MEM_ROOT::allocated_size() returns the amount memory allocated to the MEM_ROOT, not the amount of memory that is currently in use.
This means that when a row insert triggers the final allocation to the MEM_ROOT, one will switch from in-memory to hybrid hash join. (See handling of StoreRowResult::BUFFER_FULL in HashJoinIterator::BuildHashTable).

Since the allocation size is increased by 50% each time more memory is added to the MEM_ROOT, the last allocation will be approx. 1/3 of the size of the join buffer.  

How to repeat:
It is difficult to observe this issue since it is not straight-forward to know how much memory will actually be needed for each row of the hash table.

Suggested fix:
Set MEM_ROOT:m_max_capacity to the maximum size of the join buffer.  If one tries to allocate more memory than allowed, allocations will then either fail with error (if m_error_for_capacity_exceeded is set) or a null pointer will be returned.  This scenario will then need to be handled all places where there is an allocation from join buffer MEM_ROOT.
[19 Jun 2020 12:16] MySQL Verification Team
Hi Øystein Grøvlen,

Thank you for your bug report.

I have analysed the code that you write about , with a very high scrutiny. Both in 8.0.20 and in trunk.

Your analysis is, simply, correct.

Verified as reported.
[4 Sep 2020 19:06] Jon Stephens
Fixed in MySQL 8.0.23 as part of WL#13459 "Optimize hash table in hash join". See same for docs info.

Closed.
[21 Sep 2020 12:55] MySQL Verification Team
Thank you, Jon.
[11 Nov 2020 7:57] Jon Stephens
NB: WL#13459 has now been pushed to mysql-8.0 and an entry for it added to the MySQL 8.0 changelog under the heading "Optimizer Notes". See also "What Is New In MySQL 8.0".
[11 Nov 2020 8:38] Jon Stephens
Previous comment should say "8.0.23 changelog".