Bug #33257 range access on partitioned falcon table actually scans entire indexes
Submitted: 15 Dec 2007 8:38 Modified: 14 Mar 2008 10:43
Reporter: Sergey Petrunya
Status: Closed
Category:Server: Partition Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: Bugs System Target Version:
Triage: D3 (Medium)

[15 Dec 2007 8:38] Sergey Petrunya
Description:
When "range" access method is used on partitioned Falcon table it will actually scan the
entire index.

How to repeat:
##
## Create the dataset:
##
create table t10 (a int);
insert into t10 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); 
create table t11 (a int , b int, key(a)) engine=falcon partition by key(b) partitions 1;

insert into t11 select A.a + 10 *B.a X, 0 from t10 A, t10 B order by X desc; 

##
## Now get a query that will use range access 
##
explain select * from t11 where a < 3;
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra
      |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t11   | range | a             | a    | 5       | NULL |   33 | Using
where | 
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (3.65 sec)

## Put a breakpoint at "bitmap->set (number);" call at IndexRootPage::scanIndex():

(gdb) commands 1
  Type commands for when breakpoint 9 is hit, one per line.
  End with a line saying just "end".
p number
cont 
end

## Run the query:
mysql>
  select * from t11 where a < 3;

## And see it to walk through the entire index (here we have 100 records):

  Breakpoint 9, IndexRootPage::scanIndex (dbb=0xb7097cfc, indexId=1, rootPage=16,
lowKey=0x8c97040, highKey=0x0, searchFlags=1, transId=0, bitmap=0xb709d370) at
IndexRootPage.cpp:465
  $72 = 99
  
  Breakpoint 9, IndexRootPage::scanIndex (dbb=0xb7097cfc, indexId=1, rootPage=16,
lowKey=0x8c97040, highKey=0x0, searchFlags=1, transId=0, bitmap=0xb709d370) at
IndexRootPage.cpp:465
  $73 = 98
  
  Breakpoint 9, IndexRootPage::scanIndex (dbb=0xb7097cfc, indexId=1, rootPage=16,
lowKey=0x8c97040, highKey=0x0, searchFlags=1, transId=0, bitmap=0xb709d370) at
IndexRootPage.cpp:465
  $74 = 97
...
  Breakpoint 9, IndexRootPage::scanIndex (dbb=0xb7097cfc, indexId=1, rootPage=16,
lowKey=0x8c97040, highKey=0x0, searchFlags=1, transId=0, bitmap=0xb709d370) at
IndexRootPage.cpp:465
  $171 = 0
  

Suggested fix:
The problem is that ha_partition translates handler->read_range_first/next() calls into
index_read/ index_next[_same()] calls. AFAIU one cannot do that if the storage engine
doesnt guarantee index-ordered retrieval - you wouldn't know if you can stop scanning the
index before you've reached its end.

ha_partition should pass  read_range_first/next calls to underlying engines.
[22 Dec 2007 9:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40373

ChangeSet@1.2766, 2007-12-22 10:59:24+03:00, sergefp@mysql.com +4 -0
  BUG#33257 "range access on partitioned falcon table actually scans entire indexes"
  - Make ha_partition to pass read_range_XXX() calls to partition handlers, without
converting 
    them to index_read/index_next-family calls. 
  
    This is needed for Falcon (which is terribly inefficient when it doesn't get the
other 
     endpoint) and possibly other storage engines with similar characteristics.
[12 Jan 2008 0:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40948

ChangeSet@1.2766, 2008-01-12 02:03:55+03:00, sergefp@mysql.com +4 -0
  BUG#33257 "range access on partitioned falcon table actually scans entire indexes"
  - Make ha_partition to pass read_range_XXX() calls to partition handlers, without
converting 
    them to index_read/index_next-family calls. 
  
    This is needed for Falcon (which is terribly inefficient when it doesn't get the
other 
     endpoint) and possibly other storage engines with similar characteristics.
  
  This patch also fixes BUG#30573 (bk trigger: mark as fix for BUG#33257)
[24 Jan 2008 0:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/41180

ChangeSet@1.2790, 2008-01-24 02:08:03+03:00, sergefp@mysql.com +1 -0
  BUG#33257 "range access on partitioned falcon table actually scans entire indexes"
  - pushbuild fix: when calling $partition_handler->read_range_first() don't 
    pass the start key if we really didn't have it, pass NULL, as goes per call
    convention.
[13 Mar 2008 20:29] Bugs System
Pushed into 6.0.5-alpha
[14 Mar 2008 10:43] Jon Stephens
Documented in the 6.0.5 changelog as follows:

        When the range access method was used on a
        partitioned Falcon table, the entire index was
        scanned. For partitioned tables using other storage engines, a related
        issue caused an ordered range scan to return some rows twice.
[18 Sep 2008 17:06] Mattias Jonsson
Bug#33555 and Bug#30573 are duplicates of this, resulting in this patch will be backported
to 5.1 (handled in bug#33555)
[18 Sep 2008 17:30] Sergey Petrunya
Mattias, this bug and its fix are about Falcon tables. Falcon is not present in 5.1. What
exactly are you going to backport?