Bug #111851 Spatial index not used for join when field is used in MBRIntersects call
Submitted: 22 Jul 2023 22:59 Modified: 26 Jul 2023 6:54
Reporter: Liam Meck Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: GIS Severity:S2 (Serious)
Version:8.0.32, 8.0.34, 5.7.43 OS:Any (Linux & Mac)
Assigned to: CPU Architecture:Any (x86 & ARM)

[22 Jul 2023 22:59] Liam Meck
Description:
The reproduction is easiest to understand!

In some cases, when using `force index` for a spatial index, it will not use the index. The explain plan does show it in `possible_keys`. Specifically, this seems to happen when a field is used to build the arguments to `MBRIntersects` (or other spatial functions). In the following example, `l.id` (see repro).

`MBRIntersects(activeRange, f_tsToLineString(l.id, '2023-01-01', '2023-02-01'))`

If `l.id` is replaced with a constant, or the query includes something like `where l.id = $N`, it will use the index.

I tried to distill the repro as much as I could, but it seems to need a particular set of circumstances. I played around with other spatial functions, and without the custom function, and saw similar behavior.

How to repeat:
create table `location` (
  `id` int not null auto_increment,
  primary key (`id`)
);

-- Table representing time ranges of activity for `location`.
create table `locationActive` (
  `id` int not null auto_increment,
  `location_id` int not null,
  `start` datetime,
  `end` datetime,
  `activeRange` linestring not null /*!80003 srid 0 */,
  primary key (`id`),
  spatial key `sx_activeRange` (`activeRange`),
  constraint foreign key (`location_id`) references `location` (`id`)
);

-- Convert to coordinates like: (Location ID, Min time seconds), (Location ID, Max time seconds)
create function f_tsToLineString($Id int, $MinTS datetime, $MaxTS datetime)
returns LineString deterministic
begin
    return ST_LineStringFromText(concat('linestring(', $Id, ' ', to_seconds($MinTS), ', ', $Id, ' ', to_seconds($MaxTS), ')'));
end;

insert into location values (1), (2);
insert into locationActive values
    (1, 1, '2020-01-01', '2020-02-01', f_tsToLineString(1, '2020-01-01', '2020-02-01')),
    (2, 2, '2020-01-01', '2020-02-01', f_tsToLineString(2, '2020-01-01', '2020-02-01'));

-- Here, the index `sx_activeRange` will not be used, even though it's forced.
explain
select * from
    location l
    join locationActive lc force index (sx_activeRange) on
        mbrintersects(activeRange, f_tsToLineString(l.id, '2023-01-01', '2023-02-01'))
-- where l.id = 1
;
/*
+--+-----------+-----+----------+-----+--------------+-------+-------+----+----+--------+----------------------------------------------+
|id|select_type|table|partitions|type |possible_keys |key    |key_len|ref |rows|filtered|Extra                                         |
+--+-----------+-----+----------+-----+--------------+-------+-------+----+----+--------+----------------------------------------------+
|1 |SIMPLE     |l    |null      |index|null          |PRIMARY|4      |null|2   |100     |Using index                                   |
|1 |SIMPLE     |lc   |null      |ALL  |sx_activeRange|null   |null   |null|2   |100     |Range checked for each record (index map: 0x2)|
+--+-----------+-----+----------+-----+--------------+-------+-------+----+----+--------+----------------------------------------------+
 */

-- The index is used if we include a filter on `l.id`.
explain
select * from
    location l
    join locationActive lc force index (sx_activeRange) on
        mbrintersects(activeRange, f_tsToLineString(l.id, '2023-01-01', '2023-02-01'))
where l.id = 1
;
/*
+--+-----------+-----+----------+-----+--------------+--------------+-------+-----+----+--------+-----------+
|id|select_type|table|partitions|type |possible_keys |key           |key_len|ref  |rows|filtered|Extra      |
+--+-----------+-----+----------+-----+--------------+--------------+-------+-----+----+--------+-----------+
|1 |SIMPLE     |l    |null      |const|PRIMARY       |PRIMARY       |4      |const|1   |100     |Using index|
|1 |SIMPLE     |lc   |null      |range|sx_activeRange|sx_activeRange|34     |null |2   |100     |Using where|
+--+-----------+-----+----------+-----+--------------+--------------+-------+-----+----+--------+-----------+
 */
[25 Jul 2023 19:04] Liam Meck
Bumping severity because there's no other workaround that I'm aware of for this type of query (i.e. range on multiple datetime fields).
[26 Jul 2023 6:54] MySQL Verification Team
Hello Liam Meck,

Thank you for the report and test case.

regards,
Umesh