Bug #100485 | There is no way to provide ECP row estimates for scan cost modeling | ||
---|---|---|---|
Submitted: | 10 Aug 2020 14:37 | Modified: | 11 Aug 2020 21:20 |
Reporter: | Justin Swanhart | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S4 (Feature request) |
Version: | 8.0.21 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[10 Aug 2020 14:37]
Justin Swanhart
[10 Aug 2020 15:18]
Justin Swanhart
note that t1 and t2 might not have any keys, and the evaluation of the size of the relations is completely dependent on ECP...
[10 Aug 2020 15:31]
Justin Swanhart
maybe call cond_push() without any join expressions, then ha_info could set stat->records appropriately given the cond_push() pushdown info, and then after the query plan is created call engine_push which would not have any of the expressions pushed down in cond_push() which would just be join conditions?
[10 Aug 2020 15:32]
Justin Swanhart
(usually) just be join conditions...
[10 Aug 2020 15:45]
Justin Swanhart
But that causes some weirdness too, since constant and equality propagation are determined by join order. Let me know your thoughts, as this is something I would maybe have to handle by parsing the query in a preparse rewrite plugin, using EXPLAIN ANALYZE to gather cost information that I pass in with explain_extra, then passing that info into the storage engine with a variable, etc, and finally executing the original query. That is a real mess..
[10 Aug 2020 20:24]
Justin Swanhart
Oops.. Example query should be: where t1.c2 = 5 AND t1.c3 = 6 but it could be something complex like: (a>1 and b=2) or (c=3 and d< 12) or e=13 or f is null My storage engine can evaluate those using bitmap indexes and filter the table efficiently. Regardless, even in NDB which would have to scan the table, it could provide an estimate for the scan by using histograms for example. Maybe ANALYZE TABLE should collect histogram information for all columns so that hash join can choose a proper order using statistics and so that ICP can properly estimate the cost of pushed down conditions (I filed another bug about ICP too).
[11 Aug 2020 12:18]
MySQL Verification Team
Justin, I presume that your SE can't do histograms. Also, do you handle ECP well in your SE ????
[11 Aug 2020 12:52]
MySQL Verification Team
Hi Justin, After careful deliberation, it was concluded that this is a very good feature request. Verified as a feature request.
[11 Aug 2020 21:20]
Justin Swanhart
changed synopsis