Bug #108136 NO_SEMIJOIN hint doesn't work
Submitted: 15 Aug 2022 1:58 Modified: 17 Aug 2022 3:22
Reporter: Brian Yue (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.29 OS:Any (rhel-7.4)
Assigned to: CPU Architecture:Any (x86-64)

[15 Aug 2022 1:58] Brian Yue
Description:
Hello,
  There is a case I want to forbiden SEMIJOIN optimization for a SELECT statement, but I find that it doesn't work, I have tried both in MySQL8.0.25 and MySQL8.0.29.

How to repeat:

mysql> CREATE TABLE `t1` (
    ->     `id` int DEFAULT NULL,
    ->     `age` int DEFAULT NULL,
    ->     `c1` int DEFAULT NULL,
    ->     KEY `c1` (`c1`)
    ->   ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8_bin;
  `id` int NOT NULL,
  `c1` int DEFAULT NULL,
  `c2` int DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `c1` (`c1`,`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8_bin;
Query OK, 0 rows affected, 2 warnings (2.18 sec)

mysql>
mysql> CREATE TABLE `t2` (
    ->   `id` int NOT NULL,
    ->   `c1` int DEFAULT NULL,
    ->   `c2` int DEFAULT NULL,
    ->   PRIMARY KEY (`id`),
    ->   KEY `c1` (`c1`,`c2`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8_bin;
Query OK, 0 rows affected, 2 warnings (1.78 sec)

mysql> insert into t1 values (1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5);
Query OK, 5 rows affected (0.11 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into t2 values (1,1,1),(2,2,2),(3,3,3),(4,4,4),(5,5,5);
Query OK, 5 rows affected (0.05 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into t2 select id+5, c1+5, c2+5 from t2;
Query OK, 5 rows affected (0.06 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into t2 select id+10, c1+10, c2+10 from t2;
Query OK, 10 rows affected (0.01 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql>  insert into t2 select id+20, c1+20, c2+20 from t2;
Query OK, 20 rows affected (0.01 sec)
Records: 20  Duplicates: 0  Warnings: 0

mysql> set @@session.optimizer_trace = 'enabled=on';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select /*+ NO_SEMIJOIN() */ * from t1 where c1 in (select c1 from t2) \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: NULL
         type: ALL
possible_keys: c1
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5
     filtered: 100.00
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
   partitions: NULL
         type: ref
possible_keys: c1
          key: c1
      key_len: 5
          ref: yxx.t1.c1
         rows: 1
     filtered: 100.00
        Extra: Using index; FirstMatch(t1)
2 rows in set, 1 warning (0.00 sec)

mysql> show warnings;
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                          |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select /*+ NO_SEMIJOIN(@`select#1`) */ `yxx`.`t1`.`id` AS `id`,`yxx`.`t1`.`age` AS `age`,`yxx`.`t1`.`c1` AS `c1` from `yxx`.`t1` semi join (`yxx`.`t2`) where (`yxx`.`t2`.`c1` = `yxx`.`t1`.`c1`) |
+-------+------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

After optimization we can see that FirstMatch strategy of semijoin is used, and the sql statement in the output of show warnings uses `semi join` keyword.
[15 Aug 2022 7:51] Øystein Grøvlen
Not that the hint needs to be applied to the subquery:

mysql> explain select * from t1 where c1 in (select /*+ NO_SEMIJOIN() */ c1 from t2);
+----+--------------------+-------+------------+----------------+---------------+------+---------+------+------+----------+--------------------------+
| id | select_type        | table | partitions | type           | possible_keys | key  | key_len | ref  | rows | filtered | Extra                    |
+----+--------------------+-------+------------+----------------+---------------+------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY            | t1    | NULL       | index          | NULL          | c1   | 10      | NULL |    5 |   100.00 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | t2    | NULL       | index_subquery | c1            | c1   | 5       | func |    1 |   100.00 | Using index              |
+----+--------------------+-------+------------+----------------+---------------+------+---------+------+------+----------+--------------------------+

Or:

> explain select /*+ NO_SEMIJOIN(@subq) */ * from t1 where c1 in (select /*+ QB_NAME(subq) */c1 from t2);
+----+--------------------+-------+------------+----------------+---------------+------+---------+------+------+----------+--------------------------+
| id | select_type        | table | partitions | type           | possible_keys | key  | key_len | ref  | rows | filtered | Extra                    |
+----+--------------------+-------+------------+----------------+---------------+------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY            | t1    | NULL       | index          | NULL          | c1   | 10      | NULL |    5 |   100.00 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | t2    | NULL       | index_subquery | c1            | c1   | 5       | func |    1 |   100.00 | Using index              |
+----+--------------------+-------+------------+----------------+---------------+------+---------+------+------+----------+--------------------------+
[16 Aug 2022 12:38] MySQL Verification Team
Hi Mr. Yue,

Please, let us know whether this hint works when it is applied to the nested query.

Thanks in advance.
[17 Aug 2022 3:22] Brian Yue
Hello,
  This hint does work when it is applied to the nested query.
  Thanks a lot.

  Now this bug report is closed.
[17 Aug 2022 12:20] MySQL Verification Team
Thank you, Mr. Yue.