Bug #30342 Equality propagation can make optimization much slower
Submitted: 9 Aug 2007 18:52 Modified: 7 Oct 2015 14:54
Reporter: Mark Callaghan Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.0.37, 5.6.17 OS:Any
Assigned to: CPU Architecture:Any
Tags: equality, Optimizer, performance, propagation
Triage: Needs Triage: D2 (Serious)

[9 Aug 2007 18:52] Mark Callaghan
Description:
For queries like:

select * from t1, t2 where t1.a = t2.b and t2.b in (<large-in-list>)

equality propagation can enable the use of the t2.b in (<large-in-list>) predicates to be used for an index access path on t1.a assuming there is an index on t1.a. This can result in many calls to handler::records_in_range for t1.a. When the handler does RPCs to a remote server (as NDB might and as custom storage engines do), then the optimizer can spend too many seconds determining a query plan because of predicates that were not in the query as written.

I think that the optimizer should have be more restrained when creating predicates that weren't in the query as written. Mikael Ronstrom suggested that the NDB engine might display this problem because.

How to repeat:
Create two tables with engine=NDB with an index on t1.a as described above.
[9 Aug 2007 18:54] Mark Callaghan
This originated with a customer bug, but I won't provide a testcase for that as it requires a storage engine for which handler::records_in_range is expensive, and I don't use NDB.
[25 Feb 2014 18:40] Sveta Smirnova
Thank you for the report.

Verified as described.

To repeat:

CREATE TABLE `t1` (
  `a` int(11) DEFAULT NULL,
  KEY `a` (`a`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 ;

CREATE TABLE `t2` (
  `b` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

insert into t1 values(1),(2),(3),(4),(5),(6);
insert into t1 values(1),(2),(3),(4),(5),(6);
insert into t1 values(1),(2),(3),(4),(5),(6);
insert into t1 values(1),(2),(3),(4),(5),(6);
insert into t1 values(1),(2),(3),(4),(5),(6);

insert into t2 values(1),(2),(3),(4),(5),(6);
insert into t2 values(11),(12),(13),(14),(15),(16);

Then connect to MySQL server using gdb and set breakpoint in ha_myisam::records_in_range. Then run query select * from t1, t2 where t1.a = t2.b and t2.b in (1,2,3,4,5,6,11,12,13,14,15,16); and count how many times records_in_range will be called.
[7 Oct 2015 14:54] Mark Callaghan
Hello, anybody home?