Bug #63868 Left outer join with user defined function on NULL column gives wrong results
Submitted: 28 Dec 2011 19:34 Modified: 29 Dec 2011 4:24
Reporter: Pim VAN DER WAL Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.60, 5.5.20 OS:Linux (Centos 5 x64)
Assigned to: CPU Architecture:Any
Tags: FUNCTION, LEFT OUTER JOIN, udf
Triage: Needs Triage: D2 (Serious)

[28 Dec 2011 19:34] Pim VAN DER WAL
Description:
A query with a left join and a user defined function in the where clause using a column from the outer table of the join produces the wrong results if the outer table does not have a matching row. In this case the value passed to the function should be NULL but it apparently has some indeterminate value. 

The example script below has a function that always returns 1 even if the parameter is NULL. The where clause should therefore be true for every row but in the main query no rows are returned indicating that a different value is returned. The alternative is that an error or a warning is produced but that doesn't happen either and the error also shows no message.

How to repeat:
CREATE TABLE tbl_1 (a INT) ENGINE=INNODB;
CREATE TABLE tbl_2 (b INT) ENGINE=INNODB;

INSERT INTO tbl_1 (a) VALUES (1);

DELIMITER $$
DROP FUNCTION IF EXISTS fn_1 $$
CREATE FUNCTION fn_1 (p_1 INT) RETURNS INT 
BEGIN 
RETURN 1; 
END $$
DELIMITER ;

SELECT * 
FROM tbl_1
    LEFT JOIN (SELECT b FROM tbl_2) AS tmp ON tbl_1.a = tmp.b
WHERE fn_1(tmp.b) = 1;

Does not produce any results whereas it should produce 1 row since fn_1 always produces 1 even if called with NULL. Strange enough the following where clause does produce the correct result which seems to indicate the return value is actually NULL:

WHERE FN_1(tmp.b) <=> 1;

However the following where clause does not produce any results again.

WHERE FN_1(tmp.b) IS NULL;
[29 Dec 2011 4:24] Valeriy Kravchuk
Thank you for the problem report. Verified with 5.5.20 also:

macbook-pro:5.5 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 196
Server version: 5.5.20-debug Source distribution

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE tbl_1 (a INT) ENGINE=INNODB;
Query OK, 0 rows affected (0.31 sec)

mysql> CREATE TABLE tbl_2 (b INT) ENGINE=INNODB;
Query OK, 0 rows affected (0.20 sec)

mysql> INSERT INTO tbl_1 (a) VALUES (1);
Query OK, 1 row affected (0.14 sec)

mysql> DELIMITER $$
mysql> DROP FUNCTION IF EXISTS fn_1 $$
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE FUNCTION fn_1 (p_1 INT) RETURNS INT 
    -> BEGIN 
    -> RETURN 1; 
    -> END $$
Query OK, 0 rows affected (0.15 sec)

mysql> DELIMITER ;
mysql> SELECT * 
    -> FROM tbl_1
    ->     LEFT JOIN (SELECT b FROM tbl_2) AS tmp ON tbl_1.a = tmp.b
    -> WHERE fn_1(tmp.b) = 1;
Empty set (0.06 sec)

mysql> SELECT *  FROM tbl_1     LEFT JOIN tbl_2 AS tmp ON tbl_1.a = tmp.b WHERE fn_1(tmp.b) = 1;
Empty set (0.04 sec)

mysql> SELECT *  FROM tbl_1     LEFT JOIN tbl_2 AS tmp ON tbl_1.a = tmp.b WHERE fn_1(tmp.b) <=> 1;
+------+------+
| a    | b    |
+------+------+
|    1 | NULL |
+------+------+
1 row in set (0.00 sec)

Here is what happens internally:

mysql> explain extended SELECT *  FROM tbl_1     LEFT JOIN tbl_2 AS tmp ON tbl_1.a = tmp.b WHERE fn_1(tmp.b) = 1;
+----+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
|  1 | SIMPLE      | tbl_1 | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where                    |
|  1 | SIMPLE      | tmp   | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
2 rows in set, 1 warning (0.10 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`tbl_1`.`a` AS `a`,`test`.`tmp`.`b` AS `b` from `test`.`tbl_1` join `test`.`tbl_2` `tmp` where ((`test`.`tmp`.`b` = `test`.`tbl_1`.`a`) and (`fn_1`(`test`.`tbl_1`.`a`) = 1))
1 row in set (0.00 sec)

So, query is (mistakenly) rewritten as JOIN instead of LEFT JOIN. Compare to this:

mysql> explain extended SELECT *  FROM tbl_1     LEFT JOIN tbl_2 AS tmp ON tbl_1.a = tmp.b WHERE fn_1(tmp.b) <=> 1;
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | tbl_1 | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 |             |
|  1 | SIMPLE      | tmp   | ALL  | NULL          | NULL | NULL    | NULL |    1 |   100.00 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`tbl_1`.`a` AS `a`,`test`.`tmp`.`b` AS `b` from `test`.`tbl_1` left join `test`.`tbl_2` `tmp` on((`test`.`tmp`.`b` = `test`.`tbl_1`.`a`)) where (`fn_1`(`test`.`tmp`.`b`) <=> 1)
1 row in set (0.00 sec)