Bug #72892 Please allow multiple ranges in the LIMIT clause
Submitted: 5 Jun 2014 16:27 Modified: 9 Jun 2014 9:39
Reporter: Roland Bouman Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: CPU Architecture:Any

[5 Jun 2014 16:27] Roland Bouman
Description:
This is about statistical sampling. A specific case is quantiles (like percentiles, quartiles etc). 

Suppose we want to 1st quartile, 2nd quartile and 3rd quartile. This is a special case of sampling where the sample values are 25% of the observations apart. 

We can do this by first calculating the COUNT, divide that by 4 (number of quantiles) and then do:

(SELECT amount FROM payment ORDER BY amount LIMIT 4012, 1)
UNION ALL
(SELECT amount FROM payment ORDER BY amount LIMIT 8024, 1)
UNION ALL
(SELECT amount FROM payment ORDER BY amount LIMIT 12036, 1)

Unfortunately, MySQL will have to perform the expansive ORDER multiple times.
I propose syntax like this:

SELECT amount 
FROM payment 
ORDER BY amount 
LIMIT (4012, 1), (8024, 1), (12036, 1)

Alternatively, this could be solved in the optimizer by analyzing the query and deducing that only one table scan (and ORDER BY) is required.

How to repeat:
See description

Suggested fix:
Please implement option to specify multiple ranges for LIMIT, or make the optimizer smarter so it doesn't perform the scan (and ORDER BY) multiple times
[5 Jun 2014 16:43] Roland Bouman
For many sampling purposes a STEP clause to the LIMIT clause (so as to repeatedly walk through a window of rows before selecting them) would work too, but the enumeration is more powerful since it is more specific.
[9 Jun 2014 9:39] MySQL Verification Team
Hello Roland,

Thank you for the feature request!

Thanks,
Umesh