Bug #94571 The document of semi-joins is inconsistent with the server
Submitted: 6 Mar 2019 3:16 Modified: 8 Mar 2019 9:29
Reporter: Samuel Liang Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: CPU Architecture:Any

[6 Mar 2019 3:16] Samuel Liang
Description:
https://dev.mysql.com/doc/refman/8.0/en/semi-joins.html
The above document says "The statement must not use the STRAIGHT_JOIN join type in the outer query.". But I found that it still can be optimized to semi-join. 

How to repeat:
mysql> select * from t1;
+------+------+
| a    | b    |
+------+------+
|    0 |    1 |
|    1 |    2 |
|    2 |    3 |
|    3 |    4 |
|    4 |    5 |
|    5 |    6 |
|    6 |    7 |
|    7 |    8 |
+------+------+
8 rows in set (0.00 sec)

mysql> select * from t2;
+------+------+
| a    | b    |
+------+------+
|    0 |    1 |
|    1 |    2 |
|    2 |    3 |
|    3 |    4 |
+------+------+
4 rows in set (0.00 sec)

mysql> select * from t3;
+------+------+
| a    | b    |
+------+------+
|    0 |    1 |
|    2 |    3 |
|    3 |    4 |
+------+------+
3 rows in set (0.00 sec)

mysql> explain select t1.*,t2.* from t2 STRAIGHT_JOIN t1 on t1.b=t2.b where t1.a in (select a from t3);
+----+--------------+-------------+------------+--------+---------------+------------+---------+----------+------+----------+----------------------------------------------------+
| id | select_type  | table       | partitions | type   | possible_keys | key        | key_len | ref      | rows | filtered | Extra                                              |
+----+--------------+-------------+------------+--------+---------------+------------+---------+----------+------+----------+----------------------------------------------------+
|  1 | SIMPLE       | t2          | NULL       | ALL    | NULL          | NULL       | NULL    | NULL     |    4 |   100.00 | NULL                                               |
|  1 | SIMPLE       | t1          | NULL       | ALL    | NULL          | NULL       | NULL    | NULL     |    8 |    12.50 | Using where; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_key>    | <auto_key> | 5       | ldy.t1.a |    1 |   100.00 | NULL                                               |
|  2 | MATERIALIZED | t3          | NULL       | ALL    | NULL          | NULL       | NULL    | NULL     |   10 |   100.00 | NULL                                               |
+----+--------------+-------------+------------+--------+---------------+------------+---------+----------+------+----------+----------------------------------------------------+
4 rows in set, 1 warning (0.00 sec)

mysql> show warnings;
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                          |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `ldy`.`t1`.`a` AS `a`,`ldy`.`t1`.`b` AS `b`,`ldy`.`t2`.`a` AS `a`,`ldy`.`t2`.`b` AS `b` from `ldy`.`t2` semi join (`ldy`.`t3`) straight_join `ldy`.`t1` where ((`ldy`.`t1`.`b` = `ldy`.`t2`.`b`) and (`<subquery2>`.`a` = `ldy`.`t1`.`a`)) |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
[6 Mar 2019 14:52] MySQL Verification Team
Hi,

First of all, I do not see that the nested query is converted into semi-join, since the nested query is materialised.

Second, transformation can not occur since the IN clause is not met by a single row, which is a condition for the transformation.

Last, as our Manual states , you should run EXTENDED explain and then take a look at the warnings.
[7 Mar 2019 3:41] Samuel Liang
Hi Sinisa

I have run EXTENDED explain as above indeed and the warnings said "`ldy`.`t2` semi join (`ldy`.`t3`) straight_join `ldy`.`t1`". Doesn't that mean semijoin?

Thanks
[7 Mar 2019 14:09] MySQL Verification Team
Hi Samuel,

Thank you for your feedback. That clears out the things.

As the warning clearly puts out, straight join is between tables t1 and t2, while the semi-join is between tables t2 and t3. There is no straight_join clause on that join. 

Hence, our documentation is correct. Not a bug.
[8 Mar 2019 9:29] Samuel Liang
Hi Sinisa

Maybe I don't fully understand the documentation. So could you give me an example that could explain "The statement must not use the STRAIGHT_JOIN join type in the outer query."?

Thanks
[8 Mar 2019 13:49] MySQL Verification Team
This is actually very simple.

Our documentation clearly points how to use straight_join so that all tables are affected.