Bug #96311 ST_Intersects() is very slow on MySQL 8.0
Submitted: 24 Jul 2019 9:02 Modified: 19 Feb 2020 17:38
Reporter: Yoshiaki Yamasaki Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: GIS Severity:S5 (Performance)
Version:8.0.17 OS:Any
Assigned to: CPU Architecture:Any
Tags: gis

[24 Jul 2019 9:02] Yoshiaki Yamasaki
Description:
ST_Intersects() is very slow on MySQL 8.0. (Using Spatial index is slower than not using Spatial index on MySQL 8.0)

[Example]
- ST_Intersects() with Spatial index    : 5.11 sec
- ST_Intersects() without Spatial index : 3.84 sec

----------------------------------------
mysql> SHOW VARIABLES LIKE 'version';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| version       | 8.0.17 |
+---------------+--------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );
+----+-------------+--------------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
| id | select_type | table        | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | h27ka23_utf8 | NULL       | range | SHAPE         | SHAPE | 34      | NULL |    1 |   100.00 | Using where |
+----+-------------+--------------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );
+---------+-----------+----------+
| OGR_FID | x_code    | y_code   |
+---------+-----------+----------+
|    1162 | 136.86591 | 35.17593 |
<<snip>>
+---------+-----------+----------+
23 rows in set (5.11 sec)

mysql> EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );
+----+-------------+--------------+------------+------+---------------+------+---------+------+-------+----------+-------------+
| id | select_type | table        | partitions | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
+----+-------------+--------------+------------+------+---------------+------+---------+------+-------+----------+-------------+
|  1 | SIMPLE      | h27ka23_utf8 | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 12834 |   100.00 | Using where |
+----+-------------+--------------+------------+------+---------------+------+---------+------+-------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );
+---------+-----------+----------+
| OGR_FID | x_code    | y_code   |
+---------+-----------+----------+
|    1038 | 136.86393 | 35.17555 |
<<snip>>
+---------+-----------+----------+
23 rows in set (3.84 sec)

----------------------------------------
[Note]
- I think "ST_Intersects() = ST_Within() + ST_Overlaps()". Therefore, I checked ST_Within() and ST_Overlaps() also.
  - ST_Within() is fast. It is no problem.
  - ST_Overlaps() is slow. It is a problem.

[Example]
- ST_Within() with Spatial index    : 0.00 sec
- ST_Within() without Spatial index : 3.02 sec

- ST_Overlaps() with Spatial index    : 4.70 sec
- ST_Overlaps() without Spatial index : 3.20 sec

----------------------------------------

How to repeat:
1. Import dump file.  * I'll upload later.

2. Execute following SQLs.
----------------------------------------
[ST_Intersects]

EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

----------------------------------------
[ST_Within]

EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Within(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Within(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Within(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Within(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

----------------------------------------
[ST_Overlaps]

EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Overlaps(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Overlaps(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

EXPLAIN SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Overlaps(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 IGNORE INDEX(SHAPE) WHERE ST_Overlaps(SHAPE,ST_GeomFromText('POLYGON((35.175 136.860, 35.170 136.860, 35.170 136.865, 35.175 136.865, 35.175 136.860))', 4612) );

----------------------------------------

Suggested fix:
Improve performance of ST_Intersects() and ST_Overlaps().
[24 Jul 2019 9:25] Yoshiaki Yamasaki
Regarding dump file, I specify source according to the terms and conditions of the original file. In addition, I edited the file.

[Source]
General counter of government statistics (e-Stat) (https://www.e-stat.go.jp/)

[Note]
I edited source file(*). The following are my edit procedure.
 1. Converting charactor set from cp932 to UT-F8 using org2ogr
 2. Importing Shape file by org2ogr
 3. Adding "SRID 4612" to SHAPE column using following SQL.
    - mysql> ALTER TABLE h27ka23_utf8 DROP INDEX SHAPE;
    - mysql> ALTER TABLE h27ka23_utf8 MODIFY SHAPE GEOMETRY NOT NULL SRID 4612;
    - mysql>   ALTER TABLE h27ka23_utf8 ADD SPATIAL INDEX (SHAPE);

 * https://www.e-stat.go.jp/gis/statmap-search/data?dlserveyId=A002005212015&code=23&coordSys...
[24 Jul 2019 9:42] Yoshiaki Yamasaki
dump file (1/4)

Attachment: dump.aa (application/octet-stream, text), 2.86 MiB.

[24 Jul 2019 9:43] Yoshiaki Yamasaki
dump file (2/4)

Attachment: dump.ab (application/octet-stream, text), 2.86 MiB.

[24 Jul 2019 9:43] Yoshiaki Yamasaki
dump file (3/4)

Attachment: dump.ac (application/octet-stream, text), 2.86 MiB.

[24 Jul 2019 9:44] Yoshiaki Yamasaki
dump file (4/4)

Attachment: dump.ad (application/octet-stream, text), 1.03 MiB.

[24 Jul 2019 9:46] Yoshiaki Yamasaki
[How to combine and uncompress dump files]

cat dump.* > geotest2.sql.bz2
bunzip2 geotest2.sql.bz2
[24 Jul 2019 10:35] Yoshiaki Yamasaki
I tested same scenario on MySQL 5.7.27. It is fast as follows. I'll upload test results.

[Example on MySQL 5.7] 
- ST_Intersects() with Spatial index    : 0.00 sec
- ST_Intersects() without Spatial index : 0.81 sec

- ST_Within() with Spatial index    : 0.00 sec
- ST_Within() without Spatial index : 0.79 sec

- ST_Overlaps() with Spatial index    : 0.00 sec
- ST_Overlaps() without Spatial index : 0.88 sec

[Note]
I swapped latitude and longitude to match axis-order on MySQL 5.7.

ex)
SELECT OGR_FID, x_code, y_code FROM geotest2.h27ka23_utf8 WHERE ST_Intersects(SHAPE,ST_GeomFromText('POLYGON((136.860 35.175, 136.860 35.170, 136.865 35.170, 136.865 35.175, 136.860 35.175))', 4612) );
[24 Jul 2019 10:38] Yoshiaki Yamasaki
Test result on MySQL 8.0.17

Attachment: test_result_8.0.txt (text/plain), 12.00 KiB.

[24 Jul 2019 10:38] Yoshiaki Yamasaki
Test result on MySQL 5.7.27

Attachment: test_result_5.7.txt (text/plain), 12.09 KiB.

[24 Jul 2019 10:45] Yoshiaki Yamasaki
I found similar bug report. Bug#94655 is classified as S4(Feature request), but I think this is inappropriate. Since they can't upgrade to MySQL 8.0 due to this bug.

  Bug#94655 Some GIS function do not use spatial index anymore
  https://bugs.mysql.com/bug.php?id=94655
    => "That prevents us to upgrade to MySQL 8.0 because of this performance drop..."
[24 Jul 2019 13:47] Yoshiaki Yamasaki
If you check this problem using Shape file, please try following procedure.

1. Download Shape file from this URL.
https://www.e-stat.go.jp/gis/statmap-search/data?dlserveyId=A002005212015&code=23&coordSys...

2. Change character set from CP932 to UTF-8 using ogr2ogr

ex)
# ogr2ogr -f "ESRI Shapefile" -lco ENCODING=UTF-8 -oo ENCODING=CP932 h27ka23_utf8.shp h27ka23.shp

3. Create geotest database for importing Shape file in MySQL

ex)
mysql> create database geotest;

4. Import Shape file using ogr2ogr

ex)
# ogr2ogr.exe -f "MySQL" MySQL:"geotest,host=127.0.0.1,user=root,password=root,port=3306" h27ka23_utf8.shp

5. Add SRID to SHAPE column

ex)
mysql> use geotest;
mysql> ALTER TABLE h27ka23_utf8 DROP INDEX SHAPE;
mysql> ALTER TABLE h27ka23_utf8 MODIFY SHAPE GEOMETRY NOT NULL SRID 4612;
mysql> ALTER TABLE h27ka23_utf8 ADD SPATIAL INDEX (SHAPE);

* If we want to use Spatial Index, SRID needed.
https://dev.mysql.com/doc/refman/8.0/en/spatial-index-optimization.html
  => "For comparisons to work properly, each column in a SPATIAL index must be SRID-restricted. That is, the column definition must include an explicit SRID attribute, and all column values must have the same SRID."
[25 Jul 2019 5:13] Yoshiaki Yamasaki
There are 2 notes when using GDAL as follows.

1. You must use mysql_native_password authentication plugin, because ogr2ogr can't use caching_sha2_password plugin.

[Reference]
ogr2ogr with MySQL 8 (MySQL connect failed with caching_sha2_password plugin) #910
https://github.com/OSGeo/gdal/issues/910

libmysql-client version is old for MySQL 8.0 #132
https://github.com/gisinternals/buildsystem/issues/132

2. You must use GDAL 2.3.2 and later to specify correct SRID. I recommend using GDAL 3.0.1, because I used it. (Although details could not be confirmed, GDAL 2.4.2 may not have correctly specified SRID.)

[Reference]
- ogr2ogr with MySQL 8 (Wrong SRID) #905
https://github.com/OSGeo/gdal/issues/905
[25 Jul 2019 6:22] MySQL Verification Team
Thank you, Yoshiaki-San.
Verified as described with 8.0.17 build.

regards,
Umesh
[25 Jul 2019 6:27] MySQL Verification Team
Test results - 8.0.17

Attachment: 96311_8.0.17.results (application/octet-stream, text), 19.66 KiB.

[25 Jul 2019 9:58] MySQL Verification Team
Test results - 5.7.27

Attachment: 96311_5.7.27.results (application/octet-stream, text), 18.03 KiB.

[25 Jul 2019 12:55] MySQL Verification Team
There is a duplicate of this bug. Precisely:

https://bugs.mysql.com/bug.php?id=96269
[29 Jul 2019 13:59] Pavel Dvořák
Hi, any news on this. This bug makes my database unusable. Thanks
[19 Feb 2020 17:38] Arun Kuruvila
Fixed as part of Bug #94655