Bug #27917 Spatial (rtree) index not used with "OR" clause.
Submitted: 18 Apr 2007 7:48 Modified: 13 May 2015 8:08
Reporter: Wil Williams Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.0.40-BK, 5.0.32-Debian_3 OS:Linux
Assigned to: CPU Architecture:Any
Tags: Geometry, gis, MBRIntersects, spatial index

[18 Apr 2007 7:48] Wil Williams
Description:
In previous versions of mysql there was a major bug such that a query of the form "select * from table where MBRIntersects(..A..) or MBRIntersects(..B..)" would return fewer results than the initial part of the query (ie "select * from table where MBRIntersects(..A..)"). This has how been fixed, but now the query runs dog slowly. Basically explain indicates why... the spatial index is not used.

Examples:

1. Fine

explain SELECT _id FROM hydro WHERE MBRIntersects(SHAPE,GeomFromText('POLYGON((115.362236 -33.82212,115.36312 -33.82212,115.36312 -33.787,115.362236 -33.787,115.362236 -33.82212))', 4326 ))

possible_keys: SHAPE
          key: SHAPE

2. Fine

explain SELECT _id FROM hydro WHERE MBRIntersects(SHAPE,GeomFromText('POLYGON((115.362236 -33.82212,115.36312 -33.82212,115.36312 -33.787,115.362236 -33.787,115.362236 -33.82212))', 4326 )) or _id=10

possible_keys: PRIMARY,SHAPE
          key: SHAPE,PRIMARY

3. Unhappy

explain SELECT _id FROM hydro WHERE ((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 )))) or MBRIntersects(SHAPE,GeomFromText('POLYGON((115.362236 -33.82212,115.36312 -33.82212,115.36312 -33.787,115.362236 -33.787,115.362236 -33.82212))', 4326 ))

possible_keys: SHAPE
          key: NULL

There is no real solution to this. When I query my table (a couple of Gigs) it takes about 0.1 second to perform with one part of the where clause (reduced to 0.00 on repeat queries), but over a minute with the two parts.

I can obviously run two queries and join the results myself, but this sort of defeats the purpose of using a database (especially since the real queries contain far more parts, joins, etc).

+----+-------------+-------+-------+---------------+-------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key   | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+-------+---------+------+------+-------------+
|  1 | SIMPLE      | hydro | range | SHAPE         | SHAPE | 32      | NULL |    1 | Using where | 
+----+-------------+-------+-------+---------------+-------+---------+------+------+-------------+

How to repeat:
Can be repeated by running this type of query on large tables. For testing, create a table

CREATE TABLE test (
  _id int(11) NOT NULL auto_increment,
  shape geometry NOT NULL,
  PRIMARY KEY  (_id),
  SPATIAL KEY shape (SHAPE(32))
);

insert into test (shape) values (geomfromtext('LINESTRING(115 -33, 120 -34)'));

explain SELECT _id FROM test WHERE ((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 ))));

Results:

possible_keys: shape
          key: NULL

In the fix it would use a key (ie non NULL)

Suggested fix:
Improve the optimiser to support multiple key usage with OR clauses, as is done between btree and rtree keys and in other cases.
[18 Apr 2007 8:08] Wil Williams
Sorry, I rushed the repeat part. You need to insert a significant amount of data.
If you just repeat the insert statement 100 times (or less) it will work:

Then the query is

explain SELECT _id FROM test WHERE ((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 )))) or ((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 ))));

which will not use a key (you need the two parts).

----------
Also: A new work around. I can precalculate the envelope of the query parts and form a query of the form

"select * from table where MBRIntersects(BIG_RECT) and (old query part)"

This works. However it is much less efficient and relies on the rectangles being close. It is however a work around.

---
PS: I've found a new bug when not placing enough brackets in an AND/OR query:

Works:

explain SELECT _id FROM hydro WHERE ((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 )))) and (((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 )))) or ((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 )))))

Crashes connection:
explain SELECT _id FROM hydro WHERE (MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 ))) AND ((MBRIntersects(SHAPE,GeomFromText('POLYGON((115.30215 -33.822117,115.36253 -33.822117,115.36253 -33.805443,115.30215 -33.805443,115.30215 -33.822117))', 4326 )))) or MBRIntersects(SHAPE,GeomFromText('POLYGON((115.362236 -33.82212,115.36312 -33.82212,115.36312 -33.787,115.362236 -33.787,115.362236 -33.82212))', 4326 ))

ERROR 2013 (HY000): Lost connection to MySQL server during query
[18 Apr 2007 11:00] Valeriy Kravchuk
Thank you for a problem report. The fact that index is not used in case of OR clause is verified with latest 5.0.40-BK on Linux. 

As for a crash, please, report it separately with complete test case.
[23 Apr 2007 13:15] Georgi Kodinov
Currently MySQL doesn't support merging multiple ranges in "range" access method for a spatial index (see Section 7.2.5.1 of the Reference manual for details: http://dev.mysql.com/doc/refman/5.0/en/range-access-single-part.html) . This is why you get index access when you have only one spatial predicate and not if you have two (or more).
The server can be fixed to recognize that both predicates are identical and ignore one of them (as in your steps to reproduce), but this will IMHO constitute a very limited improvement. If your bug is about this specific case, please re-open it and state that you want this specific case fixed.
To work around this limitation you can use UNION with identical statements and put each spatial predicate on a different branch of the UNION.
[13 May 2015 8:08] Norvald Ryeng
This is a duplicate of bug #45776.