Bug #97299 Improve the explain informations for Hash Joins
Submitted: 20 Oct 2019 16:08 Modified: 17 Jan 0:35
Reporter: Tibor Korocz Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[20 Oct 2019 16:08] Tibor Korocz
Description:
Hi, 

There is a new feature in MySQL 8.0.18 called Hash Joins, while I was testing it I noticed a few things which could be improved. 

1) Even if hash join is used the traditional explain says "Using join buffer (Block Nested Loop)". 
Not every one who is using MySQL reading the release notes and they wont know there is a hash join now and they should use "explain format=tree". 
I would recommend that if Hash Join used the traditional explain should say Hash Join as well.

2) If we disable hash join with a hint or optimizer the "explain format=tree" output gives a message which is not very useful: "<not executable by iterator executor>" I would recommend a better message here which actually tells hash join is disabled or something... 

How to repeat:
CREATE TABLE `t1` (
  `id` int(11) NOT NULL AUTO_INCREMENT ,
  `c1` int(11) NOT NULL DEFAULT '0',
  `c2` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `idx_c1` (`c1`)
) ENGINE=InnoDB;

CREATE TABLE `t2` (
  `id` int(11) NOT NULL AUTO_INCREMENT ,
  `c1` int(11) NOT NULL DEFAULT '0',
  `c2` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `idx_c1` (`c1`)
) ENGINE=InnoDB;

insert into t1 values (1,1,1),(2,2,2),(3,3,3),(4,4,4);
insert into t2 values (1,1,1),(2,2,2),(3,3,3),(4,4,4);

mysql> explain format=tree select count(*) from t1 join t2 on t1.c2 = t2.c2;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Aggregate: count(0)
    -> Inner hash join (t2.c2 = t1.c2)  (cost=2.50 rows=4)
        -> Table scan on t2  (cost=0.09 rows=4)
        -> Hash
            -> Table scan on t1  (cost=0.65 rows=4)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select count(*) from t1 join t2 on t1.c2 = t2.c2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
     filtered: 25.00
        Extra: Using where; Using join buffer (Block Nested Loop)
2 rows in set, 1 warning (0.00 sec)

Explain is lying here and says "Nested Loop"

==================

mysql> SET optimizer_switch='hash_join=off';
Query OK, 0 rows affected (0.00 sec)

mysql> explain format=tree select count(*) from t1 join t2 on t1.c2 = t2.c2;
+----------------------------------------+
| EXPLAIN                                |
+----------------------------------------+
| <not executable by iterator executor>
 |
+----------------------------------------+
1 row in set (0.00 sec)

mysql> SET optimizer_switch='hash_join=on';
Query OK, 0 rows affected (0.00 sec)

mysql> explain format=tree  select /*+ NO_HASH_JOIN (t1,t2) */ count(*) from t1 join t2 on t1.c2 = t2.c2;
+----------------------------------------+
| EXPLAIN                                |
+----------------------------------------+
| <not executable by iterator executor>
 |
+----------------------------------------+
1 row in set (0.00 sec)

The message is not very helpful. 

Suggested fix:
- Explain and "explain format=tree" should give us the same information , at the moment even if the query is using Hash Join normal explain is lying and says "Nested Loop"

- And please improve the message if Hash Join is disabled.
[21 Oct 2019 6:20] MySQL Verification Team
Hello Tibor Korocz,

Thank you for the feature request!

regards,
Umesh
[28 Oct 2019 14:50] Steinar Gunderson
Posted by developer:
 
Hi,

Traditional EXPLAIN cannot show iterators. The solution is to stop using traditional EXPLAIN (it will eventually go away).

As for if you force hash join off, you will get BNL, thus the pre-iterator executor. EXPLAIN FORMAT=tree is entirely right in that this query cannot be executed by the iterator executor. BNL will also go away entirely at some point, at which point this hint will be ignored.
[29 Oct 2019 9:31] Georgi Sotirov
See also https://bugs.mysql.com/bug.php?id=97280 ;-)
[17 Jan 0:35] Jon Stephens
Documented fix as follows in the MySQL 8.0.20 changelog:

    EXPLAIN now uses -hash join- instead of -Block Nested Loop-,
    since the latter no longer exists and is replaced by a hash
    join in nearly all cases.

Closed.