Bug #92967 Slow query times with geospetial index
Submitted: 26 Oct 2018 14:05 Modified: 8 Jan 2019 14:45
Reporter: Georgi Georgiev Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: GIS Severity:S5 (Performance)
Version:8.0.13 OS:Any
Assigned to: CPU Architecture:Any
Tags: slow performance, spatial

[26 Oct 2018 14:05] Georgi Georgiev
Description:
Running a simple query on table conaining spatial information (points with SRID 4326) and with configured spatial index takes to much time. I tried it on 8.0.12 installed on Debian machine and 8.0.13 Debian and windows and the behaviour is the same.

Tried the same approach with table with multipolygon and the times are disapointigly slow.

How to repeat:
1. Create table using:
CREATE TABLE `geo` (
	`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
	`lat` DECIMAL(15,6) NULL DEFAULT NULL,
	`lon` DECIMAL(15,6) NULL DEFAULT NULL,
	`pt` POINT NOT NULL SRID 4326,
	PRIMARY KEY (`id`),
	SPATIAL INDEX `spatial_index` (`pt`)
)
COLLATE='utf8mb4_general_ci'
ENGINE=InnoDB
ROW_FORMAT=DYNAMIC
AUTO_INCREMENT=0
;

2. Import data from attached geo_bug_data.7z file (1,166,634 records)
3. Execute: 

SELECT id,`lat`,`lon` FROM `geo`.geo
WHERE ST_WITHIN(`pt`,ST_GeomFromText('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326));

I am getting 85,832 rows returned in ~5.6s

Executing EXPLAIN on the same query:
"1"	"SIMPLE"	"geo"	\N	"range"	"spatial_index"	"spatial_index"	"34"	\N	"43"	"100.00"	"Using where"

Optimizer_trace:
{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `geo`.`id` AS `id`,`geo`.`lat` AS `lat`,`geo`.`lon` AS `lon` from `geo` where st_within(`geo`.`pt`,st_geomfromtext('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326))"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "st_within(`geo`.`pt`,st_geomfromtext('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326))",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "st_within(`geo`.`pt`,st_geomfromtext('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326))"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "st_within(`geo`.`pt`,st_geomfromtext('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326))"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "st_within(`geo`.`pt`,st_geomfromtext('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326))"
                }
              ]
            }
          },
          {
            "substitute_generated_columns": {
            }
          },
          {
            "table_dependencies": [
              {
                "table": "`geo`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ]
              }
            ]
          },
          {
            "ref_optimizer_key_uses": [
            ]
          },
          {
            "rows_estimation": [
              {
                "table": "`geo`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 1162136,
                    "cost": 120989
                  },
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "spatial_index",
                      "usable": true,
                      "key_parts": [
                        "pt",
                        "id"
                      ]
                    },
                    {
                      "index": "Index 3",
                      "usable": false,
                      "cause": "not_applicable"
                    }
                  ],
                  "setup_range_conditions": [
                  ],
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  },
                  "skip_scan_range": {
                    "potential_skip_scan_indexes": [
                      {
                        "index": "spatial_index",
                        "usable": false,
                        "cause": "query_references_nonkey_column"
                      }
                    ]
                  },
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "spatial_index",
                        "ranges": [
                          "pt unprintable_geometry_value"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 43,
                        "cost": 48.303,
                        "chosen": true
                      }
                    ],
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    }
                  },
                  "chosen_range_access_summary": {
                    "range_access_plan": {
                      "type": "range_scan",
                      "index": "spatial_index",
                      "rows": 43,
                      "ranges": [
                        "pt unprintable_geometry_value"
                      ]
                    },
                    "rows_for_plan": 43,
                    "cost_for_plan": 48.303,
                    "chosen": true
                  }
                }
              }
            ]
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`geo`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 43,
                      "access_type": "range",
                      "range_details": {
                        "used_index": "spatial_index"
                      },
                      "resulting_rows": 43,
                      "cost": 52.603,
                      "chosen": true
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 43,
                "cost_for_plan": 52.603,
                "chosen": true
              }
            ]
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "st_within(`geo`.`pt`,st_geomfromtext('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326))",
              "attached_conditions_computation": [
              ],
              "attached_conditions_summary": [
                {
                  "table": "`geo`",
                  "attached": "st_within(`geo`.`pt`,st_geomfromtext('polygon((35.1324 -2.9767, 35.1021 -2.9767, 35.1021 -2.9177, 35.1324 -2.9177, 35.1324 -2.9767))',4326))"
                }
              ]
            }
          },
          {
            "refine_plan": [
              {
                "table": "`geo`"
              }
            ]
          }
        ]
      }
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}
[26 Oct 2018 16:36] MySQL Verification Team
Thank for the bug report. Please attach here the file you mentioned? geo_bug_data.7z . Thanks.
[27 Oct 2018 6:10] Georgi Georgiev
Hi Miguel,

the file was bigger than 3MB so I uploaded it via sftp. The name is mysql-bug-data-92967

Sorry for not not mention it in my original message.

Georgi
[1 Nov 2018 16:25] MySQL Verification Team
Hi,

I truly do not see the problem.

GIS queries are very complex queries, their indices require lot's of memory with jumping to and fro.

I get the same output from the query as you are getting:

.......................................................
| 224190 | 35.127064 | -2.952244 |
| 224189 | 35.127064 | -2.952244 |
| 200681 | 35.127064 | -2.952244 |
| 658625 | 35.127083 | -2.952250 |
| 271123 | 35.127083 | -2.952250 |
| 224188 | 35.127083 | -2.952250 |
| 200680 | 35.127083 | -2.952250 |
| 200679 | 35.127083 | -2.952250 |
+--------+-----------+-----------+

The output for the time command for this query is quite good for me:

real	0m16.644s
user	0m0.011s
sys	0m0.007s

I am using 8 Gb for the buffer pool.

The number of rows is the same as the one that you get:

+----------+
| count(*) |
+----------+
|    85832 |
+----------+

Why do you think that 16 seconds is too long for a very complex GIS query on almost two million rows ???

Do you get much better results with some older versions and / or with some different execution route ???? I inspected the trace and it seems to me like the most optimal trace for this query.

In short, I do not see what is the performance problem ???
[2 Dec 2018 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[8 Jan 2019 5:08] seong yong kang
I have same issue with similar query.

SELECT * FROM latlng WHERE st_within(coords, ST_Envelope(linestring(point(126.98019083521763, 37.484337719502655), point(127.01848022911464, 37.505619717376504)))) LIMIT 1;

this query takes 0.01ms in mysql 5.7 but takes 0.50ms in mysql 8.0.13.

latlng table has 4644662 records and using spatial index.
[8 Jan 2019 13:52] MySQL Verification Team
When the difference is measured is milliseconds, that can be many causes. One table can be more fragmented than the other, data are further away from the spindle, etc, etc ......

Simply, a difference is too small.
[8 Jan 2019 14:45] MySQL Verification Team
Performance that I have repeated is excellent for the very large number of rows and relatively large result set.
[8 Jan 2019 15:00] seong yong kang
Although I mention 10ms above, actually less than 1ms. I can't service with 500ms delay in product. I guess it does not seem to use spatial index unlike explain results.
[8 Jan 2019 15:02] MySQL Verification Team
If EXPLAIN shows that index is used, then it is definitely used !!!