Bug #54110 Extra rows returned on JOIN of two tables with join_cache_level 5 to 8
Submitted: 31 May 2010 15:49 Modified: 1 Jun 2010 11:07
Reporter: John Embretsen Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:bzr_next-mr-opt-backporting OS:Any
Assigned to: CPU Architecture:Any
Tags: join_cache_level, optimizer_switch

[31 May 2010 15:49] John Embretsen
Description:
When two servers with 

SET SESSION optimizer_join_cache_level = <5|6|7|8>;

execute a query in the form:

SELECT table2 .`col_varchar_1024_latin1_key`  
  FROM F table1  
    LEFT  JOIN N table2  ON table1 .`col_int_key`  = table2 .`col_int_key`

where one server uses code from mysql-next-mr-opt-backporting and the other uses code from mysql-6.0-codebase-bugfixing, the result set from the backporting server has a lot more rows than the result set from 6.0-codebase.

A complete diff will be attached as a separate file.

The issue is not present when optimizer_join_cache_level is set to 1, 2, 3 or 4.

How to repeat:
1. Start MySQL Server with InnoDB enabled.
2. Run the following statements in a client window:

SET SESSION optimizer_join_cache_level = 8;

DROP TABLE /*! IF EXISTS */ F;
DROP TABLE /*! IF EXISTS */ N;

CREATE TABLE `F` (
  `col_int` int(11) DEFAULT NULL,
  `col_varchar_1024_latin1_key` varchar(1024) DEFAULT NULL,
  `col_int_key` int(11) DEFAULT NULL,
  `pk` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pk`),
  KEY `test_idx` (`col_int_key`,`col_int`,`pk`) USING HASH
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=latin1;

INSERT INTO `F` VALUES (4,'u',NULL,1);
INSERT INTO `F` VALUES (1160183808,'string0',NULL,3);
INSERT INTO `F` VALUES (-502202368,'oh',6,5);
INSERT INTO `F` VALUES (NULL,'can',NULL,7);
INSERT INTO `F` VALUES (3,'q',NULL,9);

CREATE TABLE `N` (
  `col_int_key` int(11) DEFAULT NULL,
  `col_varchar_1024_latin1_key` varchar(1024) DEFAULT NULL,
  KEY `col_int_key` (`col_int_key`),
  KEY `col_varchar_1024_latin1_key` (`col_varchar_1024_latin1_key`(767))
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `N` VALUES (NULL,'m');
INSERT INTO `N` VALUES (NULL,'p');
INSERT INTO `N` VALUES (NULL,'now');
INSERT INTO `N` VALUES (NULL,'TSUNV');
INSERT INTO `N` VALUES (NULL,'string1');
INSERT INTO `N` VALUES (NULL,'how');
INSERT INTO `N` VALUES (NULL,'OYPEN');
INSERT INTO `N` VALUES (NULL,'q');
INSERT INTO `N` VALUES (NULL,'d');
INSERT INTO `N` VALUES (NULL,'tell');
INSERT INTO `N` VALUES (NULL,'string2');
INSERT INTO `N` VALUES (NULL,'y');
INSERT INTO `N` VALUES (NULL,'k');
INSERT INTO `N` VALUES (NULL,'your');
INSERT INTO `N` VALUES (NULL,'ok');
INSERT INTO `N` VALUES (NULL,'like');
INSERT INTO `N` VALUES (NULL,'UILXF');
INSERT INTO `N` VALUES (NULL,'my');
INSERT INTO `N` VALUES (NULL,'VPZDI');
INSERT INTO `N` VALUES (NULL,'t');
INSERT INTO `N` VALUES (6,'as');
INSERT INTO `N` VALUES (NULL,'string3');
INSERT INTO `N` VALUES (NULL,'when');
INSERT INTO `N` VALUES (NULL,'I');
INSERT INTO `N` VALUES (NULL,'string4');
INSERT INTO `N` VALUES (NULL,'string5');
INSERT INTO `N` VALUES (NULL,'at');
INSERT INTO `N` VALUES (NULL,'his');

SELECT table2 .`col_varchar_1024_latin1_key`  
  FROM F table1  
    LEFT  JOIN N table2  ON table1 .`col_int_key`  = table2 .`col_int_key`;

------------------

Results will differ between the server branches.
Full MTR test case will be attached.
[31 May 2010 15:53] John Embretsen
MTR test file which shows this bug

Attachment: bug54110.test (application/octet-stream, text), 2.59 KiB.

[31 May 2010 16:00] John Embretsen
Diff between servers when running attached test case.

Attachment: bug54110.diff (text/plain), 1.11 KiB.

[1 Jun 2010 9:12] John Embretsen
Here is what seems to be a closer to minimal repro for this issue. In this case the results change if the query is repeated:

DROP TABLE /*! IF EXISTS */ t1;
DROP TABLE /*! IF EXISTS */ t2;

CREATE TABLE t1 (
  `t1_col_varchar_1024` varchar(1024),
  `t1_col_int` int
) ENGINE=InnoDB;

INSERT INTO t1 VALUES ('a',NULL);
INSERT INTO t1 VALUES ('b',6);

CREATE TABLE t2 (
  `t2_col_int_key` int,
  `t2_col_varchar_1024` varchar(1024),
  KEY `col_int_key` (`t2_col_int_key`)
) ENGINE=InnoDB;

INSERT INTO t2 VALUES (NULL,'m');
INSERT INTO t2 VALUES (NULL,'n');
INSERT INTO t2 VALUES (6,'o');
INSERT INTO t2 VALUES (NULL,'p');

SELECT * FROM t1 LEFT  JOIN t2 ON t1 .`t1_col_int` = t2.`t2_col_int_key`;
SELECT * FROM t1 LEFT  JOIN t2 ON t1 .`t1_col_int` = t2.`t2_col_int_key`;

------------------------

When executed against the backporting branch (JCL=8), the results are (I added EXPLAIN before each query, notice the difference):

mysql> EXPLAIN SELECT * FROM t1 LEFT  JOIN t2 ON t1 .`t1_col_int` = t2.`t2_col_int_key`;
+----+-------------+-------+------+---------------+-------------+---------+--------------------+------+-------------------+
| id | select_type | table | type | possible_keys | key         | key_len | ref                | rows | Extra             |
+----+-------------+-------+------+---------------+-------------+---------+--------------------+------+-------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL        | NULL    | NULL               |    2 |                   |
|  1 | SIMPLE      | t2    | ref  | col_int_key   | col_int_key | 5       | test.t1.t1_col_int |    1 | Using join buffer |
+----+-------------+-------+------+---------------+-------------+---------+--------------------+------+-------------------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM t1 LEFT  JOIN t2 ON t1 .`t1_col_int` = t2.`t2_col_int_key`;
+---------------------+------------+----------------+---------------------+
| t1_col_varchar_1024 | t1_col_int | t2_col_int_key | t2_col_varchar_1024 |
+---------------------+------------+----------------+---------------------+
| a                   |       NULL |           NULL | m                   |
| a                   |       NULL |           NULL | n                   |
| b                   |          6 |              6 | o                   |
| a                   |       NULL |           NULL | p                   |
+---------------------+------------+----------------+---------------------+
4 rows in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM t1 LEFT  JOIN t2 ON t1 .`t1_col_int` = t2.`t2_col_int_key`;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    2 |                                |
|  1 | SIMPLE      | t2    | ALL  | col_int_key   | NULL | NULL    | NULL |    4 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM t1 LEFT  JOIN t2 ON t1 .`t1_col_int` = t2.`t2_col_int_key`;
+---------------------+------------+----------------+---------------------+
| t1_col_varchar_1024 | t1_col_int | t2_col_int_key | t2_col_varchar_1024 |
+---------------------+------------+----------------+---------------------+
| b                   |          6 |              6 | o                   |
| a                   |       NULL |           NULL | NULL                |
+---------------------+------------+----------------+---------------------+
2 rows in set (0.01 sec)

I leave it to someone else to confirm what is the correct result.
[1 Jun 2010 9:28] John Embretsen
This may be related to Bug#52636 (6.0 allowing JOINs on NULL values w/ optimizer_join_cache_level = 5-8), yet to be confirmed.
[1 Jun 2010 11:07] Guilhem Bichot
The correct results, verified with reasoning and with 5.1, are:
- for the original testcase (with F and N tables)
NULL
NULL
NULL
NULL
as
- for the "closer to minimal" testcase:
a NULL NULL NULL
b 6   6     o
- indeed mysql-next-mr-opt-backporting shows wrong results (too many rows).
- the patch of BUG#52636 solves this (gives correct results for all join_cache_levels from 0 to 8).
Marking as duplicate of BUG#52636.