Bug #107738 Spatial indexes not being used in JOIN
Submitted: 2 Jul 2022 17:29 Modified: 19 Sep 2022 22:07
Reporter: John Carew Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:8.0.28 OS:Any
Assigned to: CPU Architecture:Any

[2 Jul 2022 17:29] John Carew
Description:
When querying using a join between two tables, that have a point column(houses) in one and a geometry column(schools) in the other, mysql will not use the spatial index unless you use a primary key to match with table houses.

How to repeat:
CREATE TABLE houses (
  ID bigint UNSIGNED NOT NULL COMMENT 'Unique identifier.',
  Latitude float NOT NULL COMMENT 'The latitude of the house in degrees.',
  Longitude float NOT NULL COMMENT 'The longitude of the house in degrees.',
  ELEMENTARY_ID varchar(200) NOT NULL DEFAULT '',

  geo point GENERATED ALWAYS AS (ST_SRID(point(`Longitude`, `Latitude`),4326)) STORED NOT NULL SRID 4326,

  PRIMARY KEY (ID),
  SPATIAL INDEX spidx_house(geo),
  INDEX idx_Elementary(ELEMENTARY_ID) COMMENT 'Used for school matching.'
)
ENGINE = INNODB,
CHARACTER SET utf8mb4,
COLLATE utf8mb4_0900_ai_ci;

CREATE TABLE schools (
  GEOMETRY geometry NOT NULL SRID 4326 COMMENT 'The multipolygon geometry.',
  GEO_ID varchar(18) DEFAULT NULL COMMENT 'Linking key',
  GEO_NAME varchar(100) DEFAULT NULL COMMENT 'Area Name',

  created datetime NOT NULL DEFAULT (UTC_TIMESTAMP()),
  modified datetime NOT NULL DEFAULT (UTC_TIMESTAMP()),

  SPATIAL INDEX spidx_schools(GEOMETRY)
)
ENGINE = INNODB,
CHARACTER SET utf8mb4,
COLLATE utf8mb4_0900_ai_ci;

Query that uses spatial index spidx_schools.
EXPLAIN
SELECT * 
FROM houses h
    JOIN schools s ON ST_WITHIN(h.geo, s.GEOMETRY)
WHERE h.ID = 1234;

Query that uses no spatial index.
EXPLAIN
SELECT * 
FROM houses h
    JOIN schools s ON ST_WITHIN(h.geo, s.GEOMETRY)
WHERE h.ELEMENTARY_ID = '';
[4 Jul 2022 12:23] MySQL Verification Team
Hi Mr. Carew,

Thank you for your bug report.

What you are describing could be expected behaviour.

It is quite possible that second condition in the WHERE clause returns a very small number of rows, so that spatial index does not need to be used.

Simply, let us know the number of rows in your table "houses" for both conditions. Also, send us the output from both EXPLAIN statement, since we do not have a repeatable test case from you.
[5 Aug 2022 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[10 Sep 2022 19:16] John Carew
Our table holds 147 million records. Each query would usually contain over 50,000 records if no query limit was added via WHERE or LIMIT clause. So, no, it is not based on possibility of second where clause causing the rows to be very little. If that where the case, then there would be no need me to investigate the query's poor performance and notice that it is not using the spatial index. ;-)
[19 Sep 2022 12:12] MySQL Verification Team
Hi,

There are two possibilities why indices are not used.

First one is that one of your tables was created in 5.7 and after you upgraded to 8.0, you forgot to drop / create all spatial indices.

Second possibility is that column h.ELEMENTARY_ID is empty in very small number of cases.

In either case, this is a performance problem that you should solve yourself. Run EXPLAIN and EXPLAIN ANALYSE on the queries and try to find what is your problem.

This is definitely not a bug and this forum is there only for true bugs with fully repeatable test cases.
[19 Sep 2022 22:07] John Carew
There's also a third reason, a bug in your software...

This IS a bug, unless you think that ~5,000 rows having the same ELEMENTARY_ID is "very small number of cases". ELEMENTARY_ID was equal to blank in the example only, it is using a real value in our actual queries...

In both cases, your assumpion is wrong... We have only been on MySQL 8, starting with 8.0.19. As well as the spatial indices have been rebuilt several times. And the ELEMENTARY_ID field always has around 5,000 rows associated with them, so not a "very small number of cases".

I opened this case because when running a query like the example, it takes so long we have to kill the query. You are also missing the issue, the issue is not with the houses query. The issue is the join is not using the spatial index to find all the schools that the house belongs to.
[20 Sep 2022 12:16] MySQL Verification Team
Hi,

You have not replied to us regarding what EXPLAIN EXTENDED and EXPLAIN ANALYZE returned to you ....

Let us know if that has helped you solve your problem.