Description:
In the current MySQL8.0 documentation, section 8.2.1.4, the paragraph "In MySQL 8.0.20 and later, it is no longer necessary for the join to contain at least one equi-join condition in order for a hash join to be used. This means that the types of queries which can be optimized using hash joins include those in the following list (with examples):".Hash join is used in the following cases. The EXPLAIN in the example of half-join and anti-join does not seem to be correct.
The original:
Semijoin:
mysql> EXPLAIN FORMAT=TREE SELECT * FROM t1
-> WHERE t1.c1 IN (SELECT t2.c2 FROM t2)\G
*************************** 1. row ***************************
EXPLAIN: -> Nested loop inner join
-> Filter: (t1.c1 is not null) (cost=0.85 rows=6)
-> Table scan on t1 (cost=0.85 rows=6)
-> Single-row index lookup on <subquery2> using <auto_distinct_key> (c2=t1.c1)
-> Materialize with deduplication
-> Filter: (t2.c2 is not null) (cost=0.85 rows=6)
-> Table scan on t2 (cost=0.85 rows=6)
Antijoin:
mysql> EXPLAIN FORMAT=TREE SELECT * FROM t2
-> WHERE NOT EXISTS (SELECT * FROM t1 WHERE t1.col1 = t2.col1)\G
*************************** 1. row ***************************
EXPLAIN: -> Nested loop antijoin
-> Table scan on t2 (cost=0.85 rows=6)
-> Single-row index lookup on <subquery2> using <auto_distinct_key> (c1=t2.c1)
-> Materialize with deduplication
-> Filter: (t1.c1 is not null) (cost=0.85 rows=6)
-> Table scan on t1 (cost=0.85 rows=6)
The example given here does not seem to work with Hash Join
The actual local test is:
CREATE TABLE t1 (c1 INT, c2 INT);
CREATE TABLE t2 (c1 INT, c2 INT);
CREATE TABLE t3 (c1 INT, c2 INT);
Semijoin:
mysql> explain format=tree select * from t1 where t1.c1 in (select t2.c2 from t2)\G
*************************** 1. row ***************************
EXPLAIN: -> Hash semijoin (t2.c2 = t1.c1) (cost=0.70 rows=1)
-> Table scan on t1 (cost=0.35 rows=1)
-> Hash
-> Table scan on t2 (cost=0.35 rows=1)
Antijoin:
mysql> explain format=tree select * from t2 where not exists (select * from t1 where t1.c1 =t2.c1 )\G
*************************** 1. row ***************************
EXPLAIN: -> Hash antijoin (t1.c1 = t2.c1) (cost=0.70 rows=1)
-> Table scan on t2 (cost=0.35 rows=1)
-> Hash
-> Table scan on t1 (cost=0.35 rows=1)
I wonder if this document test does not use a version later than 8.0.20, or is it my own fault?
How to repeat:
CREATE TABLE t1 (c1 INT, c2 INT);
CREATE TABLE t2 (c1 INT, c2 INT);
CREATE TABLE t3 (c1 INT, c2 INT);
Semijoin:
mysql> explain format=tree select * from t1 where t1.c1 in (select t2.c2 from t2)\G
*************************** 1. row ***************************
EXPLAIN: -> Hash semijoin (t2.c2 = t1.c1) (cost=0.70 rows=1)
-> Table scan on t1 (cost=0.35 rows=1)
-> Hash
-> Table scan on t2 (cost=0.35 rows=1)
Antijoin:
mysql> explain format=tree select * from t2 where not exists (select * from t1 where t1.c1 =t2.c1 )\G
*************************** 1. row ***************************
EXPLAIN: -> Hash antijoin (t1.c1 = t2.c1) (cost=0.70 rows=1)
-> Table scan on t2 (cost=0.35 rows=1)
-> Hash
-> Table scan on t1 (cost=0.35 rows=1)
Suggested fix:
Please test whether the problem is with the document