Bug #58924 a LIMIT after a UNION doesn't propogate to all queries
Submitted: 14 Dec 2010 18:38 Modified: 15 Dec 2010 20:29
Reporter: Gerald Cavanaugh Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.44, 5.0, 5.1, 5.6.1 OS:Linux
Assigned to: CPU Architecture:Any
Tags: UNION LIMIT

[14 Dec 2010 18:38] Gerald Cavanaugh
Description:
This is a fairly specific optimization. In a nutshell, this:

SELECT * FROM Table1 LIMIT 10 UNION SELECT * FROM Table2 LIMIT 10;

Runs *much* faster than this:

SELECT * FROM Table1 UNION SELECT * FROM Table2 LIMIT 10;

Or even this:

(SELECT * FROM Table1) UNION (SELECT * FROM Table2) LIMIT 10;

Even though the result set of the above queries will be the same for any Table1 and Table2.

How to repeat:
1. Create 2 tables with Millions of records.
2. Try the select statements above, note the runtime of each.

Suggested fix:
I believe "LIMIT X" at the end of a series of UNION'd SELECT statements implies a "LIMIT X" on each individual statement (since no statement can return more than X rows).
[14 Dec 2010 20:27] Sveta Smirnova
Thank you for the report.

I can not fully repeat described behavior: last statement is longest one. Please send us output of SHOW CREATE TABLE for tables table1 and table2
[15 Dec 2010 15:54] Gerald Cavanaugh
Should work for anything, but here's an example that shows the performance difference (both tables filled with random data).

CREATE TABLE `Table1` (
  `col1` char(12) NOT NULL,
  `col2` int(10) unsigned DEFAULT NULL,
  `col3` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`col1`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE `Table2` (
  `col1` char(12) NOT NULL,
  `col2` int(10) unsigned DEFAULT NULL,
  `col3` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`col1`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Counts:

mysql> select count(*) as cnt from Table1;
+---------+
| cnt     |
+---------+
| 4000000 | 
+---------+

mysql> select count(*) as cnt from Table2;
+---------+
| cnt     |
+---------+
| 4000000 | 
+---------+

Fast:

mysql> SELECT * FROM Table1 LIMIT 10 UNION SELECT * FROM Table2 LIMIT 10;
+--------------+---------+---------+
| col1         | col2    | col3    |
+--------------+---------+---------+
| 100000000001 | 8978915 | 9887326 | 
| 100000000002 | 3080501 | 6311555 | 
| 100000000003 | 1277013 | 8453797 | 
| 100000000004 | 1133262 | 9823827 | 
| 100000000005 |  791268 |  949038 | 
| 100000000006 | 3740173 | 8552823 | 
| 100000000007 | 5005969 |  811450 | 
| 100000000008 | 1030309 | 1027110 | 
| 100000000009 | 5646036 | 3269681 | 
| 100000000010 | 2967100 | 7230711 | 
+--------------+---------+---------+
10 rows in set (0.50 sec)

Slow:

mysql> SELECT * FROM Table1 UNION SELECT * FROM Table2 LIMIT 10;
+--------------+---------+---------+
| col1         | col2    | col3    |
+--------------+---------+---------+
| 100000000001 | 8978915 | 9887326 | 
| 100000000002 | 3080501 | 6311555 | 
| 100000000003 | 1277013 | 8453797 | 
| 100000000004 | 1133262 | 9823827 | 
| 100000000005 |  791268 |  949038 | 
| 100000000006 | 3740173 | 8552823 | 
| 100000000007 | 5005969 |  811450 | 
| 100000000008 | 1030309 | 1027110 | 
| 100000000009 | 5646036 | 3269681 | 
| 100000000010 | 2967100 | 7230711 | 
+--------------+---------+---------+
10 rows in set (1 min 9.15 sec)

Also Slow:

mysql> (SELECT * FROM Table1) UNION (SELECT * FROM Table2) LIMIT 10;
+--------------+---------+---------+
| col1         | col2    | col3    |
+--------------+---------+---------+
| 100000000001 | 8978915 | 9887326 | 
| 100000000002 | 3080501 | 6311555 | 
| 100000000003 | 1277013 | 8453797 | 
| 100000000004 | 1133262 | 9823827 | 
| 100000000005 |  791268 |  949038 | 
| 100000000006 | 3740173 | 8552823 | 
| 100000000007 | 5005969 |  811450 | 
| 100000000008 | 1030309 | 1027110 | 
| 100000000009 | 5646036 | 3269681 | 
| 100000000010 | 2967100 | 7230711 | 
+--------------+---------+---------+
10 rows in set (2 min 39.57 sec)
[15 Dec 2010 20:29] Sveta Smirnova
Thank you for the feedback.

Results are same like in mine environment. Really mysqld should not apply limit to both queries, because it should remove duplicates from result, then apply limit, but some optimization can be done, thus bug is verified.

Test case for MTR:

create table t1(f1 int);
create table t2 (f1 int);
insert into t1 values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;

insert into t2 select * from t1;

flush status;
select now();
select * from t1 union select * from t2 limit 3;
select now();
show status like 'ha%';
flush status;
select now();
select * from t1 limit 3 union select * from t2 limit 3;
select now();
show status like 'ha%';
flush status;
select now();
(select * from t1) union (select * from t2) limit 3;
select now();
show status like 'ha%';
flush status;
[15 Dec 2010 20:32] Sveta Smirnova
Same problem with UNION ALL which surely should be optimized.