Bug #72417 execute semi join get incorrect result.
Submitted: 22 Apr 2014 7:04
Reporter: chao tang tang Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version: OS:Linux
Assigned to: CPU Architecture:Any
Tags: semi join optimize

[22 Apr 2014 7:04] chao tang tang
Description:
In ordinary we perform subquery sql , we think that the first executing subqueries, generate some intermediate data, and supply outer query.

mysql> select * from t1;
+------+-----------+
| id   | name      |
+------+-----------+
|    1 | shell     |
|    2 | kate      |
|    3 | cooc      |
|    4 | tiny      |
|    5 | prise     |
|    6 | uv button |
|    7 | 7 ad      |
+------+-----------+
7 rows in set (0.00 sec)

mysql> select * from t2;
+------+
| id   |
+------+
|    5 |
|    7 |
+------+
2 rows in set (0.00 sec)

mysql> select name from t1 where name in (select name from t2 group by name);
+-----------+
| name      |
+-----------+
| shell     |
| kate      |
| cooc      |
| tiny      |
| prise     |
| uv button |
| 7 ad      |
+-----------+
7 rows in set (0.01 sec)
NOTICE: No warning, no error.

Execute explain extended, return warnings.
mysql> explain extended select name from t1 where name in (select name from t2 group by name);
+----+--------------------+-------+------+---------------+------+---------+------+------+----------+---------------------------------+
| id | select_type        | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |
+----+--------------------+-------+------+---------------+------+---------+------+------+----------+---------------------------------+
|  1 | PRIMARY            | t1    | ALL  | NULL          | NULL | NULL    | NULL |    7 |   100.00 | Using where                     |
|  2 | DEPENDENT SUBQUERY | t2    | ALL  | NULL          | NULL | NULL    | NULL |    2 |   100.00 | Using temporary; Using filesort |
+----+--------------------+-------+------+---------------+------+---------+------+------+----------+---------------------------------+
2 rows in set, 2 warnings (0.00 sec)

mysql> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                     |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1276 | Field or reference 'test.t1.name' of SELECT #2 was resolved in SELECT #1                                                                                                                                                                                    |
| Note  | 1003 | select `test`.`t1`.`name` AS `name` from `test`.`t1` where <in_optimizer>(`test`.`t1`.`name`,<exists>(select `test`.`t1`.`name` from `test`.`t2` group by `test`.`t1`.`name` having (<cache>(`test`.`t1`.`name`) = <ref_null_helper>(`test`.`t1`.`name`)))) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

I test on 5.5 & 5.6.

Fix problem, change execute query to 
mysql> select t1.name from t1 where t1.name in (select t2.name from t2 group by t2.name);
ERROR 1054 (42S22): Unknown column 't2.name' in 'field list'
report error 1054

How to repeat:
1.  create test table t1, t2, insert some data.
mysql> create table t1(id int, name varchar(20));
Query OK, 0 rows affected (0.02 sec)

mysql> create table t2(id int);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t1 values(1, 'shell');
Query OK, 1 row affected (0.01 sec)

mysql> insert into t1 values(2, 'kate');
Query OK, 1 row affected (0.01 sec)

mysql> insert into t1 values(3, 'cooc');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1 values(4, 'tiny');
Query OK, 1 row affected (0.01 sec)

mysql> insert into t1 values(5, 'prise');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1 values(6, 'uv button');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1 values(7, '7 ad');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t2 values(5);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t2 values(7);
Query OK, 1 row affected (0.00 sec)

mysql> select name from t1 where name in (select name from t2 group by name);
+-----------+
| name      |
+-----------+
| shell     |
| kate      |
| cooc      |
| tiny      |
| prise     |
| uv button |
| 7 ad      |
+-----------+
7 rows in set (0.01 sec)

execute query: select name from t1 where name in (select name from t2 group by name);

Suggested fix:
report a error to client user.
[24 Apr 2014 11:00] Hartmut Holzgraefe
the "name" in your subquery resolves to t1.name

that is expected behavior IMHO as the subquery inherits the outer queries namespace
[24 Apr 2014 11:03] Hartmut Holzgraefe
PostgreSQL works the same way, so pretty sure not a bug