MySQL Bugs: #39266: Optimizer to use UNION for multiple table OR condition

Bug #39266	Optimizer to use UNION for multiple table OR condition
Submitted:	5 Sep 2008 4:43	Modified:	8 Sep 2008 8:20
Reporter:	Jonathon Coombes	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S4 (Feature request)
Version:		OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	optmizer, performance, UNION

Description:
The optimizer does not use indexes when there is an OR condition on multiple tables where non-indexed fields are accessed in all tables.

How to repeat:
Tested with 5.1.26 and 6.0.6 on Linux.

##### TEST SETUP #####

DROP TABLE IF EXISTS `testA`, `testB`, `testC`;

CREATE TABLE `testA` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`value` varchar(20) NOT NULL,
`stamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `value` (`value`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- insert 10 values
INSERT INTO `testA` (`value`) VALUES (rand()), (rand()), (rand()), (rand()), (rand()), (rand()), (rand()), (rand()), (rand()), (rand());
-- insert 100K more
INSERT INTO `testA` (`value`) SELECT rand() FROM `testA` a1 CROSS JOIN `testA` a2 CROSS JOIN `testA` a3 CROSS JOIN `testA` a4 CROSS JOIN `testA` a5;

-- make a copy with new random values
CREATE TABLE `testB` LIKE `testA`;
INSERT INTO `testB` (`value`) SELECT rand() FROM `testA`;

-- different structured table
CREATE TABLE `testC` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`value1` varchar(20) NOT NULL,
`value2` varchar(20) NOT NULL,
`stamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `value1` (`value1`),
KEY `value2` (`value2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- insert 10 values
INSERT INTO `testC` (`value1`, `value2`) VALUES (rand(), rand()), (rand(), rand()), (rand(), rand()), (rand(), rand()), (rand(), rand()), (rand(), rand()), (rand(), rand()), (rand(), rand()), (rand(), rand()), (rand(), rand());

-- insert 100K more
INSERT INTO `testC` (`value1`, `value2`) SELECT rand(), rand() FROM `testC` a1 CROSS JOIN `testC` a2 CROSS JOIN `testC` a3 CROSS JOIN `testC` a4 CROSS JOIN `testC` a5;

##### TEST #####

mysql> EXPLAIN SELECT * FROM `testA` JOIN `testB` ON `testA`.`id`=`testB`.`id` WHERE `testA`.`value` like '0.123%' or `testB`.`value` like '0.123%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: testB
type: ALL
possible_keys: PRIMARY,value
key: NULL
key_len: NULL
ref: NULL
rows: 100285
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: testA
type: eq_ref
possible_keys: PRIMARY,value
key: PRIMARY
key_len: 4
ref: test.testB.id
rows: 1
Extra: Using where

Suggested fix:
The fix is to use a UNION statement as such:

mysql> EXPLAIN SELECT * FROM `testA` WHERE `testA`.`value` like '0.123%' UNION SELECT * FROM `testB` WHERE `testB`.`value` like '0.123%'\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: testA
type: range
possible_keys: value
key: value
key_len: 22
ref: NULL
rows: 106
Extra: Using index condition; Using MRR
*************************** 2. row ***************************
id: 2
select_type: UNION
table: testB
type: range
possible_keys: value
key: value
key_len: 22
ref: NULL
rows: 95
Extra: Using index condition; Using MRR
*************************** 3. row ***************************
id: NULL
select_type: UNION RESULT
table: <union1,2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: 

However, the optimizer should be able to do this rather than the user rewriting the query all the time.

Thank you for a reasonable feature request.