Bug #83166 | Query execution involving spatial and non-spatial data very slow | ||
---|---|---|---|
Submitted: | 27 Sep 2016 10:09 | Modified: | 11 Nov 2016 16:43 |
Reporter: | Ioannis Priggouris | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: GIS | Severity: | S5 (Performance) |
Version: | 5.7.15 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | join, slow performance, spatial |
[27 Sep 2016 10:09]
Ioannis Priggouris
[24 Oct 2016 13:06]
MySQL Verification Team
Thank you for the bug report. Duplicate/related to bug: https://bugs.mysql.com/bug.php?id=79406.
[29 Oct 2016 20:03]
Ioannis Priggouris
I cannot understand how the specific issue I reported is related to the one you suggested it is duplicate to!!!! It is evident that the bug I report has nothing to do with the performance of spatial queries or spatial functions. It is clearly a problem related to the way mysql's query analyzer performs!!! This is obvious since the same query when separated to two individual queries communicating through a temp table as I explained, performs very well.
[29 Oct 2016 20:05]
Ioannis Priggouris
I re-opened the bug for all the above
[31 Oct 2016 14:06]
MySQL Verification Team
First of all, can you explain on what do you mean by query analyzer ??? What product is it exactly ??? Second, can you demonstrate your problem without using any GIS data and function, but non-spatial data. Third, when you upload your data, please do it in .tar.gz or .zip package only.
[4 Nov 2016 5:37]
Ioannis Priggouris
Saying query analyzer, I suggest that there is a problem in the way the query is analyzed internally by mysql before being executed. Now I do know that every database has such an analyzer which analyzes complex queries in order to provide the most performance effective execution plan for it. It is clear that this is not happening here. I do not know if the problem has to do with mixing GIS data and standard sql data, but it is evident that mysql does not provide the best execution plan for the query, although I have written it in a way that dictates so (i.e., having joins providing small numbers of results first etc). On the second approach where the query is separated in two discrete ones, communicating data through a temp table, what I do, in fact, is to force the execution order of joins the way it should be. However, this should have been handled efficiently my mysql itself, on the single approach also.
[4 Nov 2016 14:31]
MySQL Verification Team
Hi, First of all, we do not have any query analyzer, but query optimizer. Second, can you demonstrate your problem without using any GIS data and function, but non-spatial data. If you can't demonstrate a problem without GIS data and functions, then this bug is a duplicate as it was already notified. Third, when you upload your data, please do it in .tar.gz or .zip package only.
[4 Nov 2016 20:54]
Ioannis Priggouris
I suggest we do not stick to words and terminology. I am not familiar with the exact naming you use for mysql internal components but I think that we both understand what I am suggesting. Now regarding your second statement, I cannot understand how you relate my bug with the suggested one! The one you are suggesting as duplicate to mine refers to different performances between mysql 5.6 and 5.7 regarding spatial queries. I do not reported anything like that. In fact my queries have exactly the same behavior and time responses in both mysql 5.6 (no spatial indices) and 5.7 (with spatial indices). I just downloaded the latest version of 5.6 and run it. So I do not accept your classification as a duplicate, unless you proove me that this is not the case. For me, it is clearly a query optimizer problem, because the second approach with the two separate queries still uses spatial functions but returns in milliseconds. So I do not see any poor performance of the spatial functions, either in 5.6 or 5.7 versions. Finally, since you insisted persistently, I uploaded a zip format of my test data, publicly available this time, so that everybody could run it and see the issue.
[4 Nov 2016 21:05]
Ioannis Priggouris
Note that I uploaded through the sftp site this time, since zipping the same data created a larger file above 3 Mb. The data inside the file, though, are exactly the same with my first upload (the rar file).
[8 Nov 2016 14:29]
MySQL Verification Team
Hi, First and first of all, we do have to stick to words and terminology. Only by using terminology that is specially developed for relational database systems, can we communicate and express problems and situations. That is why we have put our manual on-line. Entire terminology and entire functioning of our server is explained therein. Second, I guess that you have uploaded your ZIP file in the "incoming" directory. What is the exact file's name. Third, I do not see your current setup. Can you send us the exact configuration as defined in the MySQL configuration file. If you have not configured MySQL for your purposes, I would suggest that you do that first.
[8 Nov 2016 23:22]
Ioannis Priggouris
mysql configuration file
Attachment: my_test.ini (application/octet-stream, text), 13.79 KiB.
[8 Nov 2016 23:35]
Ioannis Priggouris
For the uploaded zip file I used the exact naming convention elaborated on the files tab. So the filename is mysql-bug-data-83166.zip Now I have also uploaded the mysql configuration my_test.ini, that I used for my tests. Response Time with this configuration is about 14 secs for the single query and instant response (0.000 secs) for the temp table (2 queries) approach
[9 Nov 2016 15:46]
MySQL Verification Team
We shall try to replicate your issue in due time.
[10 Nov 2016 19:46]
MySQL Verification Team
Hi, I have tested your entire test case and, indeed, the variant with temporary table executes thousands time faster then variant with dependent nested queries. This is a very well known issue since MySQL version 5.0. Please, read a manual on the subject of the dependent nested queries (some call it sub-queries). The dependent query has to be re-executed for each row in which it participates, which in your case is an inner join. Our optimizer team has a worklog entry for optimizing out these nested queries, but its schedule is yet unknown. We advise our users to do exactly what you did. Use temporary tables.
[11 Nov 2016 16:43]
Paul DuBois
GIven Sinisa's remark that this is expected behavior: Closed with no further action.