MySQL Bugs: #113647: identical tables are checked in HashJoin because of no STOP_AT

Bug #113647	identical tables are checked in HashJoin because of no STOP_AT_MATERIALIZATION
Submitted:	16 Jan 2024 8:55	Modified:	7 Mar 2024 20:37
Reporter:	tianfeng li (OCA)	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S5 (Performance)
Version:	8.0.35,8.2.0	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
In the function WalkTablesUnderAccessPath(), the policy WalkAccessPathPolicy::STOP_AT_MATERIALIZATION is used. However, there is no judgment or handling of this policy in the basic method WalkAccessPaths(). Since all specific logical controls are determined by the passed-in lambda function, if it returns true, the current path search will stop. Therefore, the lambda function needs to describe and explain the logic of STOP_AT_MATERIALIZATION.

However, both the WalkAccessPaths() function and the lambda function do not judge the STOP_AT_MATERIALIZATION policy, making this policy seem like a false concept. From the calling hierarchy, the parameter STOP_AT_MATERIALIZATION passed to the WalkAccessPaths() function should be identified and processed by this function.

This confusing logic will have unexpected effects, such as GetUsedTables searching tables under the materialized table. This directly affects the build input tables and probe input tables of the hash join, and the tables under the materialized table are added to the tables. Although the repeated tables are added, it will not affect the correctness due to the influence of the table bitmap. However, it is worth mentioning that there are several places in the hash join where the table list traversal is processed, and the table objects that are repeatedly added to the list will also execute some instructions repeatedly.

Here is a specific example:

```
create table probe_table(
  id int primary key,
  col int);
insert into probe_table values(1,1);
create table build_table(
    id int primary key,
    col int);
insert into build_table values(1,1);
explain format=tree select
   subq_0.c0 as c0
 from
   (
     select
       ref_1.col as c0
     from
       probe_table as ref_1
     where
       ref_1.id is NOT NULL
     group by
       1
   ) as subq_0
   left join build_table as ref_0 on (true) order by 1;
```

The plan is:

```
| -> Sort: subq_0.c0
    -> Stream results  (cost=0.53 rows=1)
        -> Left hash join (no condition)  (cost=0.53 rows=1)
            -> Table scan on subq_0  (cost=2.51..2.51 rows=1)
                -> Materialize  (cost=5.58..5.58 rows=1)
                    -> Table scan on <temporary>  (cost=2.51..2.51 rows=1)
                        -> Temporary table with deduplication  (cost=2.96..2.96 rows=1)
                            -> Table scan on ref_1  (cost=0.35 rows=1)
            -> Hash
                -> Index scan on ref_0 using PRIMARY  (cost=0.18 rows=1)
```

GDB:

```
(gdb) b HashJoinIterator::ReadRowFromProbeIterator
Breakpoint 1 at 0x16fec77: file sql/iterators/hash_join_iterator.cc
(gdb) 
Continuing.

Breakpoint 2, HashJoinIterator::ReadRowFromProbeIterator (this=0x7f7dfbc953a0)
	sql/iterators/hash_join_iterator.cc

(gdb) p m_probe_input_tables.m_tables.m_buff[0].table
$1 = (TABLE *) 0x7f88a95b9500
(gdb) p m_probe_input_tables.m_tables.m_buff[1].table
$2 = (TABLE *) 0x7f88a95b9500
```

In this scenario, two identical table pointers are added to m_probe_input_tables, causing handler->position(table->record[0]) to be executed twice in RequestRowId(). At this time, each record on the Probe side will be executed one more time, which may have some performance impact. In the implementation of HashJoin, the repeated TABLE pointers may introduce redundant call execution in the INIT, BUILD, and PROBE stages.

Therefore, there are two issues that need to be discussed:
1. In the implementation of HashJoin, repeated TABLE pointers may have performance impact.
2. All code logic that uses STOP_AT_MATERIALIZATION may need to be evaluated to determine whether it really needs to stop at MATERIALIZATION.

How to repeat:
As above.

Hello tianfeng li,

Thank you for the report and feedback.

regards,
Umesh

Documented fix as follows in the MySQL 8.4.0 changelog:

    In queries that materialized rows in a temporary table before
    performing hash join or streaming aggregation, data was
    sometimes copied twice from the temporary table to the join
    buffer or aggregation buffer. While this did not cause any wrong
    results, it led to inefficient use of buffer space with a
    possible negative impact on performance.

    This was due to WalkTablesUnderAccessPath() visiting tables in
    MATERIALIZE access paths twice: first when it saw the
    MATERIALIZE access path, and then again when it visited the
    table_path member of the MATERIALIZE access path.

    We fix this by not visiting the table when seeing the
    MATERIALIZE path, and doing so only when seeing the table_path
    below MATERIALIZE.

Closed.