Bug #97297 EXPLAIN ANALYZE or format=tree output for cross join query shows hash join?
Submitted: 19 Oct 2019 13:12 Modified: 21 Oct 2019 8:15
Reporter: Sergei Petrunia Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.18 OS:Any
Assigned to: CPU Architecture:Any

[19 Oct 2019 13:12] Sergei Petrunia
Description:
I'm looking at a query that does a cross (cartesian) join. EXPLAIN ANALYZE and EXPLAIN FORMAT=TREE show that hash join is used, while tabular explain doesn't show that.

I'm not sure what can be hashed there as the query doesn't have an equality, either explicit or implied 

How to repeat:

create table ten(a int primary key);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table one_k(a int primary key);
insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;

mysql> explain select count(*) from one_k A, one_k B where B.a + A.a<1000;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                                           |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------------------------------------------------+
|  1 | SIMPLE      | A     | NULL       | index | NULL          | PRIMARY | 4       | NULL | 1000 |   100.00 | Using index                                                     |
|  1 | SIMPLE      | B     | NULL       | index | NULL          | PRIMARY | 4       | NULL | 1000 |   100.00 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------------------------------------------------+
2 rows in set, 1 warning (0.01 sec)

# Makes sense so far.

mysql> explain analyze select count(*) from one_k A , one_k B where B.a + A.a<1000\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: count(0)  (actual time=1049.247..1049.247 rows=1 loops=1)
    -> Filter: ((B.a + A.a) < 1000)  (cost=100101.51 rows=1000000) (actual time=8.486..790.860 rows=500500 loops=1)
        -> Inner hash join  (cost=100101.51 rows=1000000) (actual time=8.480..463.982 rows=1000000 loops=1)
            -> Index scan on B using PRIMARY  (cost=0.10 rows=1000) (actual time=0.278..15.036 rows=1000 loops=1)
            -> Hash
                -> Index scan on A using PRIMARY  (cost=100.75 rows=1000) (actual time=0.539..6.143 rows=1000 loops=1)

1 row in set (1.05 sec)

mysql> explain format=tree select count(*) from one_k A , one_k B where B.a + A.a<1000\G
*************************** 1. row ***************************
EXPLAIN: -> Aggregate: count(0)
    -> Filter: ((B.a + A.a) < 1000)  (cost=100101.51 rows=1000000)
        -> Inner hash join  (cost=100101.51 rows=1000000)
            -> Index scan on B using PRIMARY  (cost=0.10 rows=1000)
            -> Hash
                -> Index scan on A using PRIMARY  (cost=100.75 rows=1000)

Suggested fix:
Is this some kind of degenerate form of hash join that's not really a hash join?
[21 Oct 2019 5:06] MySQL Verification Team
Hello Sergei,

Thank you for the report and test case.
Verified as described.

regards,
Umesh
[21 Oct 2019 8:15] Steinar Gunderson
Posted by developer:
 
(Inner) BNL is indeed always rewritten to a hash join, whether degenerate or not. This is as expected, and not a bug. BNL will go away entirely when the pre-iterator executor does.

Tabular EXPLAIN cannot show iterators (like hash join); this is also expected. If you want to see the actual execution path in the iterator executor, you will need to use FORMAT=tree, which I suppose will become the default eventually.