Bug #37359 filesort can be more efficient
Submitted: 12 Jun 2008 0:55 Modified: 7 Nov 2011 18:40
Reporter: Neel Nadgir Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:5.1, 5.4, 6.0.5 OS:Any
Assigned to: CPU Architecture:Any
Tags: filesort, innodb, sort_buffer_size

[12 Jun 2008 0:55] Neel Nadgir
Description:
filesort() initializes space for 
MIN(Estimate rows in table, max rows that can fit in sort buffer) rows 
before doing the sort. For queries that have a "where" clause, the 
number of rows estimated by estimate_rows_upper_bound can be much 
higher than rows that match the where clause.

The initialization is done via make_char_array() in filesort.cc and is
very expensive when the estimate is off.

For a industry standard workload, the following query
SELECT COL1 FROM BAR WHERE INDEXED_COL_ID = ? ORDER BY NON_INDEXED_FOO
uses filesort. Table BAR has 210 Million rows.

Tracing the code, we see that innodb returns 210 Million for 
estimate_rows_upper_bound(). The number of rows that match 
is 3. For a 2MB sort_buffer_size (default), we end up initializing 
75,000 rows in make_char_array(), whereas we will only use 3!

We can reduce the sort_buffer_size to 32k (lowest allowable value), but
even then we end up initializing 1200 entries.

By reducing the sort_buffer_size to 32k, we see a 18% improvement in 
benchmark throughput.

How to repeat:
The benchmark is quite hard to setup. Please email for details.

Suggested fix:
Possible suggestions are
1. Passdown where clause to estimate_rows_upper_bound() so that it can
   give a better estimate
2. Dont pre-initialize.. Initialize the sort area as records are read.
[12 Jun 2008 4:23] Valeriy Kravchuk
Thank you for a reasonable feature request. I am not sure that dynamic allocation is a good solution for a general case. 

Maybe we should just add a session variable/optimizer hint to control maximum number of entires pre-allocated for cases like yours.
[12 Jun 2008 15:34] Neel Nadgir
Allocation is not the problem; initializing it is the problem.
The line that takes up the most time is 
<code>while (fields--) *(pos++) = (char_pos+= length); </code> in 
make_char_array() function.

A link to a callstack profile is at 
http://blogs.sun.com/realneel/resource/filesort_trace.svg
(Width of box is proportional to time consumed, and functions stack
up as they are caled). The trace is with a debug build (compiler 
inlines make_char_array with optimization), so the numbers will 
be a higher.
[27 Jun 2008 8:22] Igor Babaev
The function estimate_rows_upper_bound() should return a proper estimate in innodb.
[27 Jun 2008 8:23] Igor Babaev
The function estimate_rows_upper_bound() should return a proper estimate in innodb.
[27 Jun 2008 8:39] Sergei Golubchik
Igor, it does. It is defined to return the number of rows in the table.
The better estimation is disabled in filesort - search for #ifdef CAN_TRUST_RANGE.
[24 Jul 2008 7:35] James Day
One workaround for this is SHOW GLOBAL STATUS and look at Sort_merge_passes. If the value is zero or very few per minute, reduce the sort_buffer_size value until you start to see a few Sort_merge_passes each minute. The value that you are at will still probably be larger than ideal but it'll be closer than not checking. Sort merge passes are often not expensive, so don't be reluctant to experiment with lower values of sort_buffer_size.

It's also quite common for production servers to benefit from larger values, so check that as well. Also note that you can set it larger or smaller for individual connections as required for the work that they are doing.

This isn't specific to this bug report, it's general server tuning guidance. It doesn't eliminate the problem, just may reduce the size of the area being initialised and the speed penalty.
[16 Sep 2008 13:02] MySQL Verification Team
Bug http://bugs.mysql.com/bug.php?id=39401 it was marked as duplicate of this one.
[7 Aug 2009 12:24] James Day
See this blog entry for more discussion: http://blogs.sun.com/realneel/entry/improving_filesort_performance_in_mysql
[11 May 2010 12:39] MySQL Verification Team
some more recent links about tuning sort_buffer_size:

http://www.xaprb.com/blog/2010/05/09/how-to-tune-mysqls-sort_buffer_size/
http://ronaldbradford.com/blog/more-on-understanding-sort_buffer_size-2010-05-10/
http://www.pythian.com/news/12441/sort_buffer_size-and-knowing-why/
[11 May 2010 22:18] James Day
I've asked our documentation team to change the description of sort_buffer_size to:

'Each thread that needs to do a sort allocates a buffer of this size. If you see many Sort_merge_passes per second in SHOW GLOBAL STATUS output you can consider increasing this to speed up ORDER BY or GROUP BY operations that can't be improved with query optimization or improved indexing. The whole buffer is allocated even if it is not all needed, so setting it larger than required globally will slow down most queries that sort. It's best to increase it as a session setting, only for the connections that need a larger size. On Linux there are thresholds of 256k and 2M where going larger may significantly slow down memory allocation so you should consider staying below one of those values. Experiment to find the best value for your workload. See Section B.5.4.4, “Where MySQL Stores Temporary Files”.'

James Day, MySQL Senior Support Engineer, Oracle
[7 Nov 2011 18:40] Paul DuBois
Noted in 5.6.4 changelog.

The estimate of space required for filesort operations could be too
high, resulting in inefficient initialization.