Bug #53637 Wrong query result when semi-join & partitions are employed
Submitted: 14 May 2010 5:48 Modified: 2 Jun 2010 14:29
Reporter: Evgeny Potemkin Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:next-mr-opt-backporting OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: optimizer_switch, semijoin

[14 May 2010 5:48] Evgeny Potemkin
Description:
Test case for bug#24186 produces wrong result when semi-join is enabled.

SELECT t1.id AS MyISAM_part
FROM t1
WHERE t1.id IN (
SELECT distinct id
FROM t4
WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
ORDER BY t1.id;
MyISAM_part
16421
19092
-22589
+19092

EXPLAIN EXTENDED SELECT t1.id AS MyISAM_part
FROM t1
WHERE t1.id IN (
SELECT distinct id
FROM t4
WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
ORDER BY t1.id;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	PRIMARY	t4	index	PRIMARY,taken	PRIMARY	12	NULL	10	100.00	Using where; Using index; Using temporary; Using filesort; LooseScan
1	PRIMARY	t1	eq_ref	PRIMARY	PRIMARY	4	test.t4.id	1	100.00	Using index
Warnings:
Note	1003	select `test`.`t1`.`id` AS `MyISAM_part` from `test`.`t1` semi join (`test`.`t4`) where ((`test`.`t1`.`id` = `test`.`t4`.`id`) and (`test`.`t4`.`taken` between (@f_date) and <cache>(((@t_date) + interval 1 day)))) order by `test`.`t1`.`id`

How to repeat:
CREATE TABLE t2 (
  taken datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  id int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (id,taken),
  KEY taken (taken)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO t2 VALUES 
('2006-09-27 21:50:01',16421),
('2006-10-02 21:50:01',16421),
('2006-09-27 21:50:01',19092),
('2006-09-28 21:50:01',19092),
('2006-09-29 21:50:01',19092),
('2006-09-30 21:50:01',19092),
('2006-10-01 21:50:01',19092),
('2006-10-02 21:50:01',19092),
('2006-09-27 21:50:01',22589),
('2006-09-29 21:50:01',22589);

CREATE TABLE t1 (
  id int(8) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO t1 VALUES 
(16421),
(19092),
(22589);

CREATE TABLE t4 (
  taken datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  id int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (id,taken),
  KEY taken (taken)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 
PARTITION BY RANGE (to_days(taken)) 
(
PARTITION p01 VALUES LESS THAN (732920) , 
PARTITION p02 VALUES LESS THAN (732950) , 
PARTITION p03 VALUES LESS THAN MAXVALUE ) ;

INSERT INTO t4 select * from t2;

set @f_date='2006-09-28';
set @t_date='2006-10-02';

SELECT t1.id AS MyISAM_part
FROM t1
WHERE t1.id IN (
    SELECT distinct id
    FROM t4
    WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
ORDER BY t1.id;

EXPLAIN EXTENDED SELECT t1.id AS MyISAM_part
FROM t1
WHERE t1.id IN (
SELECT distinct id
FROM t4
WHERE taken BETWEEN @f_date AND date_add(@t_date, INTERVAL 1 DAY))
ORDER BY t1.id;
[14 May 2010 5:50] Evgeny Potemkin
6.0 by default uses first-match strategy for this query and produces correct result.
[2 Jun 2010 14:29] Øystein Grøvlen
This is a duplicate of Bug#52092.