Bug #79406 Extremely poor selection performance compared to MySQL 5.6.27 on InnoDB & MyISAM
Submitted: 25 Nov 2015 9:49 Modified: 3 Nov 2017 15:51
Reporter: Sebastian Hönel Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: GIS Severity:S2 (Serious)
Version:5.7.9 GA, 5.7.11 OS:Any
Assigned to: CPU Architecture:Any
Tags: performance, regression, Spatial indexing

[25 Nov 2015 9:49] Sebastian Hönel
Description:
In MySQL server 5.7.9 GA the selection-performance on spatial data is very poor. A query on a table that uses the spatial functions ST_CONTAINS or ST_WITHIN to check each record against a POINT takes 154 times longer than on different versions of MySQL 5.6 (tested: 5.6.27 and 5.6.24). The column's type which the check was ran against was GEOMETRY and it has a SPATIAL KEY. MySQL 5.6 returns the correct results nearly instantly whereas 5.7.9 takes 8-15 seconds.

The data used for this test came from OpenStreetMap. It contains the boundary information for ~300,000 places on different administrative levels.

This issue is present on both MyISAM and InnoDB.

This issue was first described here: http://dba.stackexchange.com/questions/121841/mysql-5-7-innodb-is-extremely-slow-on-spatia...

How to repeat:
Create the following table:

CREATE TABLE `area` (
  `area_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `boundary` geometry NOT NULL
  PRIMARY KEY (`area_id`)
  SPATIAL KEY `boundary` (`boundary`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

---

Import this data (821mb, ~311k rows, zipped): https://drive.google.com/file/d/0B5_uvSaA5pYmZk1LQ2JkMnFzcm8/view?usp=sharing

---

Run the following query:

set @p = st_GeomFromText('POINT(14.822664 56.858916)');
select area_id from area where st_within(@p, boundary);

---

the correct result for this query is:

| area_id |
| ------- |
|     209 |
|     299 |
|    2417 |
|   26621 |
[25 Nov 2015 11:53] Sebastian Hönel
Although the result of the query ran on either system is the same, the execution times are not. Results on my local machine with Windows 10 x64, Intel Core i7 3720QM and 16GB Ram are:

| MySQL 5.6.24-log (x86/Win32) | MySQL 5.7.9-log (x86_64/Win64) |
| ---------------------------- | ------------------------------ |
|                   0.0553 sec |                     8.5342 sec |

Which is a decrease of speed of factor 154.
[25 Nov 2015 12:52] MySQL Verification Team
Thank you for the bug report. Could you please provide the output of both 5.6 and 5.7 servers of: show create table area;. Thanks.
[25 Nov 2015 12:54] MySQL Verification Team
Sorry disregard my prior feedback request. I noticed you are comparing MyISAM on 5.6 and InnoDB on 5.7.
[25 Nov 2015 12:58] Sebastian Hönel
I used the exact same table-dump on both MySQL versions and tried MyISAM on 5.6 and 5.7 as well as InnoDB on 5.7 (because there is no spatial indexing support on InnoDB in versions prior to 5.7). The poor results on 5.7 happened regardless of the DB-engine.
[25 Nov 2015 13:52] MySQL Verification Team
I couldn't repeat the issue you are reporting testing on same I7 laptop btw on my side 5.7 was faster than 5.6:

Query OK, 0 rows affected (0.00 sec)

mysql> set @p = st_GeomFromText('POINT(14.822664 56.858916)');
Query OK, 0 rows affected (0.03 sec)

mysql> select area_id from area where st_within(@p, boundary);
+---------+
| area_id |
+---------+
|     209 |
|     299 |
|    2417 |
|   26621 |
+---------+
4 rows in set (30.40 sec)

mysql> show variables like "%version%";
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.6.27                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.6.27-log                   |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Win64                        |
+-------------------------+------------------------------+
7 rows in set (0.00 sec)

mysql> exit
Bye

c:\dbs>57

c:\dbs>c:\dbs\5.7\bin\mysql -uroot -p --port=3570 --prompt="mysql 5.7 > "
Enter password: ******
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 5.7.10 Source distribution PULL: 2015-NOV-07

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.7 > use geo
Database changed
mysql 5.7 > set @p = st_GeomFromText('POINT(14.822664 56.858916)');
Query OK, 0 rows affected (0.00 sec)

mysql 5.7 > select area_id from area where st_within(@p, boundary);
+---------+
| area_id |
+---------+
|   26621 |
|     299 |
|    2417 |
|     209 |
+---------+
4 rows in set (9.98 sec)

mysql 5.7 > show variables like "%version%";
+-------------------------+---------------------------------------+
| Variable_name           | Value                                 |
+-------------------------+---------------------------------------+
| innodb_version          | 5.7.10                                |
| protocol_version        | 10                                    |
| slave_type_conversions  |                                       |
| version                 | 5.7.10                                |
| version_comment         | Source distribution PULL: 2015-NOV-07 |
| version_compile_machine | x86_64                                |
| version_compile_os      | Win64                                 |
+-------------------------+---------------------------------------+
7 rows in set (0.00 sec)

mysql 5.7 >
[25 Nov 2015 14:14] Sebastian Hönel
Your results are very different from mine. I repeated it and attached the results to this comment. However, since I get results instantly with 5.6.27 (few msecs) I think that every response over 1 second is not acceptable (btw. I have an SSD). So your results (30.40 sec for 5.6.27 and 9.98 sec for 5.7.10) are also really slow. Another user on StackOverflow (see link in original bug description) confirmed the slow response times as well.

---- Test #1 (5.7.9 with the area-table as InnoDB):

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.7.9                        |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.7.9-log                    |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | AMD64                        |
| version_compile_os      | Win32                        |
+-------------------------+------------------------------+
7 rows in set (0.00 sec)

mysql> set @p = st_GeomFromText('POINT(14.822664 56.858916)');
Query OK, 0 rows affected (0.00 sec)

mysql> use geo_inno;
Database changed

mysql> select area_id from area where st_within(@p, boundary);
+---------+
| area_id |
+---------+
|     209 |
|     299 |
|   26621 |
|    2417 |
+---------+
4 rows in set (14.09 sec)

---- Test #2 (5.7.9 with the area-table as MyISAM):

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.7.9                        |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.7.9-log                    |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | AMD64                        |
| version_compile_os      | Win32                        |
+-------------------------+------------------------------+
7 rows in set (0.00 sec)

mysql> use geo;
Database changed

mysql> set @p = st_GeomFromText('POINT(14.822664 56.858916)');
Query OK, 0 rows affected (0.00 sec)

mysql> select area_id from area where st_within(@p, boundary);
+---------+
| area_id |
+---------+
|   26621 |
|     299 |
|    2417 |
|     209 |
+---------+
4 rows in set (13.84 sec)

---- Test #3 (5.6.27 with the area-table as MyISAM):

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.6.27                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.6.27-log                   |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Win64                        |
+-------------------------+------------------------------+
7 rows in set (0.00 sec)

mysql> use geo;
Database changed

mysql> set @p = st_GeomFromText('POINT(14.822664 56.858916)');
Query OK, 0 rows affected (0.00 sec)

mysql> select area_id from area where st_within(@p, boundary);
+---------+
| area_id |
+---------+
|   26621 |
|     299 |
|    2417 |
|     209 |
+---------+
4 rows in set (0.07 sec)

----

I tried the last test twice with restarting the DB in between to be sure that caching was not responsible for the excellent result-time.
[30 Nov 2015 10:52] MySQL Verification Team
Thank you for the feedback.
Observed performance drop in 5.7 compared to 5.6.

Thanks,
Umesh
[17 Dec 2015 16:13] Davide Radice
Almost 10" on Mysql 5.7.10 (Homebrew) on Mac OS X 10.11.2:

mysql> set @p = st_GeomFromText('POINT(14.822664 56.858916)');
Query OK, 0 rows affected (0,03 sec)

mysql> select area_id from area where st_within(@p, boundary);
+---------+
| area_id |
+---------+
|     209 |
|     299 |
|   26621 |
|    2417 |
+---------+
4 rows in set (9,94 sec)

mysql> SHOW VARIABLES LIKE 'version%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| version                 | 5.7.10   |
| version_comment         | Homebrew |
| version_compile_machine | x86_64   |
| version_compile_os      | osx10.11 |
+-------------------------+----------+
4 rows in set (0,00 sec)
[24 Oct 2016 13:07] MySQL Verification Team
http://bugs.mysql.com/bug.php?id=83166 marked as duplicate of this one.
[3 Nov 2017 15:51] Paul DuBois
Posted by developer:
 
Fixed in 8.0.2.

Argument suitability checking was improved for these spatial
functions that test geometry relationships: ST_Contains(),
ST_Crosses(), ST_Disjoint(), ST_Equals(), ST_Intersects(),
ST_Overlaps(), ST_Touches(), ST_Within(), MBRContains(),
MBRCoveredBy(), MBRCovers(), MBRDisjoint(), MBREquals(),
MBRIntersects(), MBROverlaps(), MBRTouches(), MBRWithin().

As a consequence of this work, performance of spatial relation
functions was improved for cases where both arguments contain
geometries of one dimensionality; that is, where the
GeometryCollection arguments (or one GeometryCollection and another
type of geometry) passed to a spatial relation function can be
reduced to a MultiPoint, MultiLineString, or MultiPolygon. In those
cases, the more complex general GeometryCollection can be avoided.