Bug #99999 EXPLAIN FORMAT=TREE does not show cost/rows for semijoin materialization
Submitted: 26 Jun 2020 7:58 Modified: 29 Jun 2020 10:46
Reporter: Øystein Grøvlen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.20 OS:Any
Assigned to: CPU Architecture:Any

[26 Jun 2020 7:58] Øystein Grøvlen
Description:
EXPLAIN FORMAT=TREE does not show cost/rows for "Nested loop inner join" nodes when one of the input is a materialized table:

mysql> explain format=tree select * from t1 where a in (select a from t11)\G
*************************** 1. row ***************************
EXPLAIN: -> Nested loop inner join
    -> Table scan on <subquery2>
        -> Materialize with deduplication
            -> Table scan on t11  (cost=1.05 rows=8)
    -> Single-row index lookup on t1 using PRIMARY (a=`<subquery2>`.a)  (cost=0.35 rows=1)

A corresponding inner join will show cost/rows:

mysql> explain format=tree select * from t1 join t11 where t1.a = t11.a\G
*************************** 1. row ***************************
EXPLAIN: -> Nested loop inner join  (cost=3.85 rows=8)
    -> Table scan on t11  (cost=1.05 rows=8)
    -> Single-row index lookup on t1 using PRIMARY (a=t11.a)  (cost=0.26 rows=1)

How to repeat:
Using a test case from subquery_sj.inc:

CREATE TABLE t1 (
  a int(11) NOT NULL,
  b int(11) NOT NULL,
  c datetime default NULL,
  PRIMARY KEY  (a),
  KEY idx_bc (b,c)
);

INSERT INTO t1 VALUES 
(406989,67,'2006-02-23 17:08:46'), (150078,67,'2005-10-26 11:17:45'),
(406993,67,'2006-02-27 11:20:57'), (245655,67,'2005-12-08 15:59:08'),
(406994,67,'2006-02-27 11:26:46'), (256,67,NULL),
(398341,67,'2006-02-20 04:48:44'), (254,67,NULL),(1120,67,NULL),
(406988,67,'2006-02-23 17:07:22'), (255,67,NULL),
(398340,67,'2006-02-20 04:38:53'),(406631,67,'2006-02-23 10:49:42'),
(245653,67,'2005-12-08 15:59:07'),(406992,67,'2006-02-24 16:47:18'),
(245654,67,'2005-12-08 15:59:08'),(406995,67,'2006-02-28 11:55:00'),
(127261,67,'2005-10-13 12:17:58'),(406991,67,'2006-02-24 16:42:32'),
(245652,67,'2005-12-08 15:58:27'),(398545,67,'2006-02-20 04:53:13'),
(154504,67,'2005-10-28 11:53:01'),(9199,67,NULL),(1,67,'2006-02-23 15:01:35'),
(223456,67,NULL),(4101,67,NULL),(1133,67,NULL),
(406990,67,'2006-02-23 18:01:45'),(148815,67,'2005-10-25 15:34:17'),
(148812,67,'2005-10-25 15:30:01'),(245651,67,'2005-12-08 15:58:27'),
(154503,67,'2005-10-28 11:52:38');

create table t11 select * from t1 where b = 67 AND (c IS NULL OR c > NOW()) order by 3 asc;

explain format=tree select * from t1 where a in (select a from t11);
explain format=tree select * from t1 join t11 where t1.a = t11.a;
[26 Jun 2020 12:19] MySQL Verification Team
Hi Mr. Oystein Grovlen,

Thank you for your bug report.

This is what I got from the EXPLAINs :

--------------------------------------

EXPLAIN
-> Nested loop inner join\n    -> Table scan on <subquery2>\n        -> Materialize with deduplication\n            -> Table scan on t11  (cost=1.05 rows=8)\n    -> Single-row index lookup on t1 using PRIMARY (a=`<subquery2>`.a)  (cost=0.35 rows=1)\n
EXPLAIN
-> Nested loop inner join  (cost=3.85 rows=8)\n    -> Table scan on t11  (cost=1.05 rows=8)\n    -> Single-row index lookup on t1 using PRIMARY (a=t11.a)  (cost=0.26 rows=1)\n

--------------------------------------

Hence, I repeated the behaviour that you reported.

Verified as reported.
[29 Jun 2020 10:46] Steinar Gunderson
Fixed as part of WL#14061:

EXPLAIN
-> Nested loop inner join  (cost=2.10 rows=8)
    -> Table scan on <subquery2>  (cost=1.60 rows=8)
        -> Materialize with deduplication  (cost=1.05 rows=8)
            -> Table scan on t11  (cost=1.05 rows=8)
    -> Single-row index lookup on t1 using PRIMARY (a=`<subquery2>`.a)
(cost=0.35 rows=1)
[29 Jun 2020 13:11] MySQL Verification Team
Thank you, Steinar.