Bug #46211 SQL_BIG_RESULT, SQL_SMALL_RESULT, SQL_BUFFER_RESULT should be mutually exclusive
Submitted: 15 Jul 2009 18:16 Modified: 15 Jul 2009 21:46
Reporter: Paul Dubois Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Parser Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Triage: Triaged: D3 (Medium)

[15 Jul 2009 18:16] Paul Dubois
Description:
This report is a follow-up to Bug#34943, filed for the purpose
of obtaining information from a developer, and possibly causing
a change to the SELECT parser.

http://dev.mysql.com/doc/refman/6.0/en/select.html
describes the effect of three SELECT options that
all relate to the use of temporary tables:

* SQL_BIG_RESULT can be used with GROUP BY or DISTINCT to tell the
  optimizer that the result set has many rows. In this case, MySQL
  directly uses disk-based temporary tables if needed, and prefers
  sorting to using a temporary table with a key on the GROUP BY
  elements.

* SQL_BUFFER_RESULT forces the result to be put into a temporary
  table. This helps MySQL free the table locks early and helps in
  cases where it takes a long time to send the result set to the
  client. This option can be used only for top-level SELECT statements,
  not for subqueries or following UNION.

* SQL_SMALL_RESULT can be used with GROUP BY or DISTINCT to tell the
  optimizer that the result set is small. In this case, MySQL uses
  fast temporary tables to store the resulting table instead of using
  sorting. This should not normally be needed.

Question: All three of these options can be given in the same
statement. Shouldn't they be mutually exclusive? (Similar to
the way that SQL_CACHE and SQL_NO_CACHE have been made mutually
exclusive; see Bug#35020.)

If so, it is a bug that they are not mutually exclusive.

If not (i.e., they are not mutually exclusive), what is the
effect of specifying these options in combination? Undefined?
Or is there some specific meaning to combining them?

How to repeat:
N/A
[15 Jul 2009 21:46] Miguel Solorzano
Thank you for the bug report.
[10 Aug 2009 10:10] Martin Hansson
IMHO:
It would not make sense to mix SQL_BIG_RESULT with SQL_SMALL_RESULT. However, both of these should be allowed in combination with SQL_BUFFER_RESULT as they address different issues/bottlenecks.

SQL_[BIG|SMALL]_RESULT tells the optimizer how to compute groups or remove duplicates (GROUP BY and DISTINCT constructs, respectively), whereas SQL_BUFFER_RESULT seems to have the intention of alleviating communication bottlenecks between server and client, hence they should be useful even in combination. 

Example:

SELECT SQL_BUFFER_RESULT SQL_SMALL_RESULT MAX(s1.price), 
(SELECT hires_image FROM sales 
 WHERE date = s1.date and branch = s1.branch and price = max(s1.price))
FROM sales s1 
GROUP BY s1.date, s1.branch
HAVING MAX(price) > 100;

Computing the group might not require that much memory, but sending the result to a client over a narrow bandwidth connection might take long.
[10 Aug 2009 12:53] Paul Dubois
Martin, thanks for the info.

What is the effect if SQL_BIG_RESULT with SQL_SMALL_RESULT are given together? Does one take precedence? Is the effect "undefined"?
[19 May 2010 13:03] Martin Hansson
Hi Paul,

I did some investigation and I agree that intuitively it seems a result can't
be big and small at the same time. The names are perhaps not the best. When
you look into it is not really a contradiction.

- SQL_BIG_RESULT

In general a GROUP BY is performed in two steps, a temporary table and a
sorting step. There are a bunch of special cases such as reading from an
already-sorted index, computing groups on the fly, ORDER BY NULL and what not
but they are all exceptions.

This flag will turn off the temporary table step and instead do the bulk of
the work in the sorting step. This usually involves swapping out to disk (aka
filesort).

A funny side-effect that could be optimized away is that filesort is _always_
used, even in the special case SELECT pk_col FROM t1 GROUP BY pk_col.

- SQL_SMALL_RESULT

This tells the optimizer that _if_ we are using temporary tables, we should
try to fit it in memory first. It does not influence whether we use filesort
or not.

The conclusion is that the names are a bit misleading but once you know what
they actually mean, it is rather obivous that SQL_BIG_RESULT takes precedence
over SQL_SMALL_RESULT. Better names would probably be "Prefer Filesort over
Grouping Tables" and "Try to fit Grouping Tables in Memory".