Bug #97349 | Sort-merge optimization missing | ||
---|---|---|---|
Submitted: | 23 Oct 2019 17:41 | Modified: | 24 Oct 2019 13:03 |
Reporter: | Alessio Bogon | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S4 (Feature request) |
Version: | 8.0.18 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[23 Oct 2019 17:41]
Alessio Bogon
[24 Oct 2019 12:32]
MySQL Verification Team
Hello Mr. Bogon, Thank you for your feature request. I am one of the co-authors of the filesort code in MySQL, that was written back in the second half of 1990s. I assure you that our algorithm is a variant of the sort-merge. It was based on the couple of scientific articles, that were written back then. It was also adjusted to enable usage of the permanent storage for the large result sets. Hence, not a bug.
[24 Oct 2019 12:49]
Alessio Bogon
Thank you for your answer. I see that I made I typo for Query A. The query I meant is the following ("asc" instead of "desc"): explain select id from test where a in (1, 5) order by id asc; I am under the impression that "filesort" will still create a temporary table, insert the data collected, and then resort it, am I correct? My point is that it could be completely avoided. My real world problem is that a query like this: select id from test where a in (1, 5) order by id asc limit 1000 takes a very long time to execute (around 10 seconds), while it could be completed in milliseconds if the dataset was not being exported to a separate location and then reordered. For more insights, my actual table contains 40000000 records, and the cardinality of test_a_index is 263679.
[24 Oct 2019 13:03]
Alessio Bogon
Another insight that may be helpful: the number of records of that query without limit is around 3600000, which are spread all across the table select count(*) from test where a in (1, 5) Result (circa): 3600000
[24 Oct 2019 13:23]
MySQL Verification Team
Hi, By default, the file that is created is of 0 bytes length. If you wish that entire sort-merge is done in the memory, there are configuration variables for that purpose, which are described in our Reference Manual. Do note that these are local buffers, so when you set that variable, multiply it by 100, in case that 100 connections do sort-merge in the same time. Check that you have that much RAM available. This feature request is closed.