Bug #42808 | Execution Plan Cache | ||
---|---|---|---|
Submitted: | 12 Feb 2009 22:17 | Modified: | 13 Feb 2009 5:25 |
Reporter: | Mikiya Okuno | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 4.1, 5.0, 5.1 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[12 Feb 2009 22:17]
Mikiya Okuno
[13 Feb 2009 5:25]
Valeriy Kravchuk
Thank you for the reasonable feature request. I hope it will be implemented some day, as an option. Note also that you may use prepared statements to get a "local" cache for the session.
[9 Mar 2009 12:29]
Sergey Petrunya
Request taken. It's more complicated than it seems though as "normalization" destroys certain opportunities for the optimizer. If we get SELECT * FROM tbl WHERE key1=const1 AND key2=const2 then the optimizer will need the values of const1 and const2 to make an informed decision about whether it should pick key1, key2, or neither of them. Normalization will cause us to blindly pick only one of the query plans. One easy way to tell which constants need normalization and which don't is to assume that prepared statement parameters are normalized and everything else isn't. This will also allow not to re-run the optimizer for prepared statement re-executions (and that's what other DBMS do and what a good number of people may expect).
[9 Mar 2009 12:32]
Sergey Petrunya
Mikiya, [Check if the problem can be solved by means other than QEP cache] Do you have a sample of queries which cause the optimizer to be slow? (If yes could you please provide it) Is it definitely known that the problem is in the optimizer being slow? What is the storage engine?
[10 Mar 2009 6:18]
MySQL Verification Team
Hi Sergey, Thank you for your comment. I have an idea to handle such a situation which normalization will cause us to blindly pick only one of the query plans. You know, mysqldumpslow CLI normalizes queries like below: SELECT * FROM tbl WHERE key1=N AND key2=S N stands for number literals, and S stands for string literals. If these literals are mapped to several categories, we can normalize queries without dropping category information like below: SELECT * FROM tbl WHERE key1=N1 AND key2=S1 If many categories exist, digit after N or S may vary widely. How many patterns does the optimizer identify literals? Does it depend? I agree with prepared statement scenario, as it is why I filed this feature request :) I do not have any specific query which cause the optimizer be slow. In general, optimizer may be slow when many tables are joined, complex subqueries are used and so forth. On the other hand, normalization and cache mechanism may take certain time, so cache should be done only if the optimizer takes long. While it is difficult to determine a definite formula to determine if it is slow, but it is relatively easy to have a parameter to give a hint to the optimizer, like optimizer_search_depth. The new parameter name could be optimizer_cache_depth or similar. And it controls if the optimizer makes use of QEP cache or not. There will be a relation like below: 0 < optimizer_cache_depth < optimizer_search_depth If such that parameter, optimizer_cache_depth, exists, we can have larger optimizer_search_depth because queries which has search depth larger than optimizer_cache_depth will be cached. Kind regards!
[10 Mar 2009 15:11]
Eric Bergen
This seems like a solution looking for a problem. It's probably not worth spending time on until there is some empirical evidence of the optimizer being slow and showing that a cache can actually speed it up. I'm a bit worried that having a query plan cache will introduce unnecessary locking which may make single query executions faster but may also make mysql slower when trying to scale on multiple cores because of lock contention.