Bug #7888 optimization request for query UNION query LIMIT statements...
Submitted: 13 Jan 2005 21:49 Modified: 22 Jan 2014 20:13
Reporter: Matthew Murphy Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:4.1x, 5.x, etc. OS:Linux (Linux)
Assigned to: CPU Architecture:Any
Triage: D5 (Feature request)

[13 Jan 2005 21:49] Matthew Murphy
Description:
Please consider adding the following optimization:

example:
 
SELECT * FROM mytable WHERE col LIKE '%emptyset%'
UNION
SELECT * FROM mytable WHERE col LIKE '%fastresult%'
LIMIT 1

This query takes very long to return, even though the second query takes less than a second.

In languages like C, a statement such as:

if(a || b || c){
   blah;
}

will move to the code block as soon as any of a b or c is true.  It would be nice if the mysql optimizer could execute both queries at once and return the requested row as soon as the first match returns a row.

Reversing the order of the queries in the UNION statement does not speed up the query.

How to repeat:
1) Find a table with a lot of data, and write two queries, one that takes a long time and another that executes quickly.

2) Connect the queries via a UNION, and add a LIMIT clause.

3) observe that mysql processes all rows in the first and second query before returning the one result requested, even though one result was available from the db much more quickly.

Suggested fix:
I haven't looked at the code, but perhaps when processing a UNION, if there is a LIMIT to follow  and no ORDER BY clause or GROUP BY clause that would require knowing about all of the intermediate results, simply allow either part of the intermediate result set to satisfy the LIMIT statement and allow the query to complete.

More generally, there may be a class of optimizations that could be made when the entire intermediate result set does not need to be known in order to satisfy the request made by a SELECT statement.
[12 Jun 2012 14:51] Valeriy Kravchuk
For UNION, in general case, we have to get data from both SELECTs and then remove duplicates, and only then we can apply LIMIT. We have some ideas on how to improve this, see http://forge.mysql.com/worklog/task.php?id=1762.

For UNION ALL (where "streaming" is possible) we have another WL, http://forge.mysql.com/worklog/task.php?id=1763.