Bug #21104 | R-Tree Index Incorrectly Optimized When Table Has One Row | ||
---|---|---|---|
Submitted: | 18 Jul 2006 2:05 | Modified: | 8 Sep 2006 8:25 |
Reporter: | Dean Swift | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 5.0.25-BK, 5.0.18 | OS: | Linux (Linux) |
Assigned to: | Georgi Kodinov | CPU Architecture: | Any |
Tags: | gis, opengis, Optimizer, r-tree, r-trees, rtree, rtrees, where |
[18 Jul 2006 2:05]
Dean Swift
[18 Jul 2006 16:13]
Valeriy Kravchuk
Thank you for a bug report. Verified just as described with 5.0.25-BK on Linux. Test case: CREATE TABLE `date2` ( `id` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, `epoch_begin` DATETIME, `epoch_end` DATETIME, `epoch` GEOMETRY NOT NULL); CREATE SPATIAL INDEX `epoch` ON `date2` (`epoch`); EXPLAIN SELECT `id`, `epoch_begin`, `epoch_end`, AsText(`epoch`) AS 'epoch' FROM `date2` WHERE MBROverlaps(GeomFromText(CONCAT('Polygon((0 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ', UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'),'))')), `epoch`)\G INSERT INTO `date2` (`id`,`epoch_begin`,`epoch_end`,`epoch`) VALUES (NULL,'2005-01-01 00:00:00','2007-01-01 00:00:00', GeomFromText(CONCAT('Polygon((0 ',UNIX_TIMESTAMP('2005-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2005-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2007-01-01 00:00:00'), ',0 ', UNIX_TIMESTAMP('2007-01-01 00:00:00'), ',0 ', UNIX_TIMESTAMP('2005-01-01 00:00:00'),'))'))); EXPLAIN SELECT `id`, `epoch_begin`, `epoch_end`, AsText(`epoch`) AS 'epoch' FROM `date2` WHERE MBROverlaps(GeomFromText(CONCAT('Polygon((0 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ', UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'),'))')), `epoch`)\G INSERT INTO `date2` (`id`,`epoch_begin`,`epoch_end`,`epoch`) VALUES (NULL,'2005-01-01 00:00:00','2007-01-01 00:00:00', GeomFromText(CONCAT('Polygon((0 ',UNIX_TIMESTAMP('2005-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2005-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2007-01-01 00:00:00'), ',0 ', UNIX_TIMESTAMP('2007-01-01 00:00:00'), ',0 ', UNIX_TIMESTAMP('2005-01-01 00:00:00'),'))'))); EXPLAIN SELECT `id`, `epoch_begin`, `epoch_end`, AsText(`epoch`) AS 'epoch' FROM `date2` WHERE MBROverlaps(GeomFromText(CONCAT('Polygon((0 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ', UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ', UNIX_TIMESTAMP('2006-01-01 00:00:00'),'))')), `epoch`)\G
[4 Sep 2006 13:00]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/11354 ChangeSet@1.2255, 2006-09-04 16:00:24+03:00, gkodinov@macbook.gmz +3 -0 Bug #21104: R-Tree Index Incorrectly Optimized When Table Has One Row Consider the points that lay on the Minimum Bounding Rectangle as inner points to that rectange, e.g.: (2 1) is inner to (1 1, 1 3, 3 3, 3 1, 1 1)
[4 Sep 2006 13:18]
Georgi Kodinov
There is a bug shown by this error report, but it has nothing to do with the R-tree or the EXPLAIN. What EXPLAIN does if it knows that there is one row in the table is that it executes the condition using the values in that single row at compile time and optimizes accordingly. Same happens if there are no rows : then the column values when calculating the conditions will be null. So "Impossible WHERE" is just a shortcut. It looks like it goes away when the second row is inserted, but it is still there (resulting in no rows) - it's just not explicit anymore. Now let's consider what is being calculated to get an "Impossible WHERE" when the table has only one row. Basically this is the expression : select MBROverlaps( GeomFromText( CONCAT( 'Polygon((0',UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ',UNIX_TIMESTAMP('2006-01-01 00:00:00'), ',1 ',UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ',UNIX_TIMESTAMP('2006-01-01 03:00:00'), ',0 ',UNIX_TIMESTAMP('2006-01-01 00:00:00'), '))' ) ), GeomFromText( CONCAT( 'Polygon((0 ',UNIX_TIMESTAMP('2005-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2005-01-01 00:00:00'), ',1 ', UNIX_TIMESTAMP('2007-01-01 00:00:00'), ',0 ', UNIX_TIMESTAMP('2007-01-01 00:00:00'), ',0 ', UNIX_TIMESTAMP('2005-01-01 00:00:00'), '))' ) ) ); Let's simplify it even more by substituting : - UNIX_TIMESTAMP('2005-01-01 00:00:00') with 1 - UNIX_TIMESTAMP('2006-01-01 00:00:00') with 2 - UNIX_TIMESTAMP('2006-01-01 03:00:00') with 3 - UNIX_TIMESTAMP('2007-01-01 00:00:00') with 4 and then taking away the CONCAT() (as we now have string constants only). Here's the simplified statement : select MBROverlaps( GeomFromText('Polygon((0 2, 1 2, 1 3, 0 3, 0 2))'), GeomFromText('Polygon((0 1, 1 1, 1 4, 0 4, 0 1))') ); Now the error is evident : the problem is that the first rectangle (the smaller one) is not considered as overlapping with the second (the bigger one). This is because of a bug in MBROverlaps () (actually in the MBR::inner_point()) : a point from the minimum bounding rectangle is not considered as inner to that rectangle.
[8 Sep 2006 8:25]
Alexey Botchkov
I don't see anything wrong here. Rectangles in this example (both original and 'simplified' couples) are NOT overlapping. Accordingly to the OpenGIS standard, if one object is completely inside another, Overlaps returns FALSE. More precisely, intersection of two objects should not be equal to any of them. Did you want to use MBRIntersects() instead of Overlaps?