Bug #45776 "OR" condition on SPATIAL index not optimized
Submitted: 26 Jun 2009 7:17 Modified: 26 Jun 2009 7:18
Reporter: Hartmut Holzgraefe Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:mysql-5.1.35 OS:Linux
Assigned to: CPU Architecture:Any

[26 Jun 2009 7:17] Hartmut Holzgraefe
Description:
Doing a MBRcontains() query on a indexed GEOMETRY column uses the SPATIAL index just fine, two MBRcontains() conditions combined with OR lead to a full table scan though.

On a test table with a copy of the GEOMETRY data in a 2nd column with an index of its own a query which does a MBRcontains(col1, ...) OR MBRcontains(col2, ...) index_merge kicks in just fine, but this only works when merging from *different* indexes.

An OR condition on a single SPATIAL index needs to be rewritten as a UNION DISTINCT to perform well instead of doing a full table scan.

How to repeat:
On a large test table 

  CREATE TABLE `t2` (
    `id` int(11) NOT NULL auto_increment,
    `geo` geometry NOT NULL,
    `geo2` geometry NOT NULL,
    PRIMARY KEY  (`id`),
    SPATIAL KEY `geo` (`geo`),
    SPATIAL KEY `geo2` (`geo2`)
  ) ENGINE=MyISAM;

with almost 5 million random rectangle POLYGONs in `geo`
and `geo2` being a copy of `geo` queries using a single
MBRcontains() expression use a SPATIAL index just fine:

  mysql> explain select count(*) from t2   
                  where mbrcontains(geo, geomfromtext('point(1000 1000)'))\G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: t2
           type: range
  possible_keys: geo
            key: geo
        key_len: 34
            ref: NULL
           rows: 1
          Extra: Using where
  1 row in set (0.00 sec)

  mysql> explain select count(*) from t2 
                  where mbrcontains(geo, geomfromtext('point(2000 2000)'))\G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: t2
           type: range
  possible_keys: geo
            key: geo
        key_len: 34
            ref: NULL
           rows: 1
          Extra: Using where  
  1 row in set (0.00 sec)

Combining both conditions in a single query with OR leads to a full table scan though:

  mysql> explain select count(*) from t2 
                  where mbrcontains(geo, geomfromtext('point(1000 1000)')) 
                     or mbrcontains(geo, geomfromtext('point(2000 2000)'))\G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: t2
           type: ALL
  possible_keys: geo
            key: NULL
        key_len: NULL
            ref: NULL
           rows: 4753401
          Extra: Using where
  1 row in set (0.00 sec)

When performing the 2nd MBRcontains() on the copy column `geo2` instead
index_merge saves the day:

  mysql> explain select count(*) from t2 
                  where mbrcontains(geo, geomfromtext('point(1000 1000)')) 
                     or mbrcontains(geo2, geomfromtext('point(2000 2000)'))\G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: t2
           type: index_merge
  possible_keys: geo,geo2
            key: geo,geo2
        key_len: 34,34
            ref: NULL
           rows: 2
          Extra: Using sort_union(geo,geo2); Using where
  1 row in set (0.01 sec)

Another alternative is using UNION DISTINCT:

  mysql> explain select geo from t2 
                  where mbrcontains(geo, geomfromtext('point(1000 1000)'))  
           union distinct 
                 select geo from t2 
                  where mbrcontains(geo, geomfromtext('point(2000 2000)'))\G
  *************************** 1. row ***************************
             id: 1
    select_type: PRIMARY
          table: t2
           type: range
  possible_keys: geo
            key: geo
        key_len: 34
            ref: NULL
           rows: 1
          Extra: Using where
  *************************** 2. row ***************************
             id: 2
    select_type: UNION
          table: t2
           type: range
  possible_keys: geo
            key: geo
        key_len: 34
            ref: NULL
           rows: 1
          Extra: Using where
  *************************** 3. row ***************************
             id: NULL
    select_type: UNION RESULT
          table: <union1,2>
           type: ALL
  possible_keys: NULL
            key: NULL
        key_len: NULL
            ref: NULL
           rows: NULL
          Extra: 
  3 rows in set (0.03 sec)