Bug #60281 Performance suggestion: optimize large and simple "LIMIT" queries
Submitted: 28 Feb 2011 14:01 Modified: 15 Jan 2013 14:37
Reporter: Moshe Lampert Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version: OS:Any
Assigned to: CPU Architecture:Any

[28 Feb 2011 14:01] Moshe Lampert
Description:
Suggestion: how to optimize queries with large LIMIT command.

How to repeat:
Simple queries like this:

explain select *  from news where news.n_visible > 0  order by n_date desc limit 30720,30

1	SIMPLE	news	range	VisibleDate	VisibleDate	1	null	41293	Using where; Using filesort

The server scan all rows from 1 to 41293 using the index.

I tested a workaround that works 99% faster:

select  n_id from news where news.n_visible > 0  order by n_date desc limit 30720,30

and
select * from news where n_id in (/* the list from the first query*/) order by n_date desc limit 30720,30

The first query is index-only query, as I have covering index on "n_date":

1	SIMPLE	news	index	VisibleDate	CatVisibleDate	17	null	31140	Using where; Using index

and the second it just small "range", as I wrote on my blog (in Hebrew);

http://blogs.microsoft.co.il/blogs/moshel/archive/2011/02/01/mysql-limit-slow-large-limit-...

Suggested fix:
Change the optimizer to look on the index first, and fetch only the used rows.  It is Very useful on Web applications, as search engines deep-looked on categories/tags/pages.
[28 Feb 2011 14:50] Valeriy Kravchuk
Please, send the output of 

SHOW CREATE TABLE news\G

for the table used.
[28 Feb 2011 15:32] Moshe Lampert
CREATE TABLE `news` (
  `n_id` int(10) NOT NULL AUTO_INCREMENT,
  `n_cat` int(3) NOT NULL DEFAULT '0',
  `n_title` varchar(140) DEFAULT NULL,
  `n_short` varchar(400) DEFAULT NULL,
  `n_date` bigint(20) NOT NULL DEFAULT '0',
  `n_image` int(11) DEFAULT '0',
  `n_user` int(11) NOT NULL DEFAULT '0',
  `n_visible` decimal(1,0) NOT NULL DEFAULT '0',
  `n_replies` int(11) NOT NULL DEFAULT '0',
  `n_author` varchar(100) NOT NULL,
  `n_LargeImage` bigint(20) NOT NULL DEFAULT '0',
  `n_album` int(10) DEFAULT '0',
  `n_upDate` bigint(20) NOT NULL DEFAULT '0',
  `n_galery` int(4) NOT NULL DEFAULT '0',
  `n_template` varchar(45) CHARACTER SET hebrew NOT NULL DEFAULT '',
  `n_imageDesc` varchar(45) NOT NULL DEFAULT '',
  `n_videox` varchar(255) DEFAULT NULL,
  `n_Content` mediumtext,
  `n_FirstUpdate` bigint(20) NOT NULL DEFAULT '0',
  `n_Special` int(11) NOT NULL DEFAULT '0',
  `n_ImageCredit` varchar(105) NOT NULL,
  `n_DispImage` int(1) NOT NULL DEFAULT '1',
  `n_related` varchar(75) NOT NULL DEFAULT '',
  `n_VideoImage` int(11) NOT NULL DEFAULT '0',
  `n_home` int(11) NOT NULL DEFAULT '0',
  `n_audio` int(1) NOT NULL DEFAULT '0',
  `n_video` int(11) NOT NULL DEFAULT '0',
  `n_largeImageCredit` varchar(105) NOT NULL,
  `n_rImage` int(11) NOT NULL DEFAULT '0',
  `n_title2` varchar(45) NOT NULL DEFAULT '',
  `n_nlsent` int(1) unsigned NOT NULL DEFAULT '0',
  `n_tags` varchar(255) DEFAULT NULL,
  `n_hasFlash` int(1) unsigned DEFAULT '0',
  `n_views` int(11) unsigned DEFAULT '0',
  PRIMARY KEY (`n_id`),
  KEY `DateCat` (`n_cat`,`n_date`),
  KEY `special` (`n_Special`,`n_date`) USING BTREE,
  KEY `cat` (`n_cat`),
  KEY `CatVisibleDate` (`n_date`,`n_cat`,`n_visible`,`n_id`) USING BTREE,
  KEY `Index_6` (`n_date`) USING BTREE,
  KEY `Visibledate` (`n_visible`) USING HASH
) ENGINE=MyISAM AUTO_INCREMENT=142581 DEFAULT CHARSET=utf8
[28 Feb 2011 16:51] Valeriy Kravchuk
Then it looks like yet another variation of bug #28404. Please, check.
[28 Feb 2011 18:28] Moshe Lampert
Maybe, but I think not, because the explain on the simple query -works, and the problem is with the devired table.
[28 Feb 2011 19:11] Valeriy Kravchuk
I do not see any derived table in the original query:

explain select *  from news where news.n_visible > 0  order by n_date desc limit
30720,30

Optimizer just uses not the best possible index for it (ignores index that can be used for ORDER BY), same as in bug #28404.

Then you try to provide a workaround with derived table, and then you see that for a subquery in your workaround, once again, improper index is used. But this is what you described, essentially, in bug #60280. You are forced (because of these 2 bugs) to use 2 queries instead of one to get results fast.

So, I think, there is nothing new in this report comparing to that other 2 bug reports mentioned above (other than just a useful workaround).
[28 Feb 2011 19:56] Moshe Lampert
Forgot...
The devired table is other bug I opened today.

I think not, because "LIMIT" implemetation scan the table and skip the first rows until limit start (from "show variables").
[15 Jan 2013 14:37] Matthew Lord
I'm marking this as a duplicate of:
  http://bugs.mysql.com/bug.php?id=28404

Please let me know if you disagree for any reason.