Bug #42136 Falcon: odd behavior for range query on index prefix when MRR is disabled
Submitted: 15 Jan 2009 16:53 Modified: 23 Apr 2009 1:20
Reporter: Sergey Petrunya Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: Vladislav Vaintroub CPU Architecture:Any
Tags: F_INDEX

[15 Jan 2009 16:53] Sergey Petrunya
Description:
Falcon exhibits odd behavior when running a range on an index prefix when MRR is disabled.

How to repeat:
1. Get the source data (1.3M) here:  http://s.petrunia.net/scratch/bug41890-dataset.sql.gz

2. Disable MRR. This is here because I've discovered this while analyzing BUG#41890. That bug reports an MRR problem, this bug is supposed to demonstrate that something is wrong without MRR, too.
 
   SET optimizer_use_mrr='disable'  

3. Put breakpoint in sql_select.cc:sub_select()

4. Run the query  (it's rather long so I'll attach it as a file)
  
   SELECT bench1.* FROM bench1 WHERE id IN (1,2,3,4,5,6, ... 1000) 

4. When you hut a breakpoint in sub_select(),  put another breakpoint at
   storage/falcon/Index.cpp:477 (line number from current 6.0-main), at the end of 
  Index::scanIndex(), on the

>>	if (transaction)   // here
		transaction->scanIndexCount++;
   
lines at the very bottom. Let the breakpoint print bitmap->count.

Then you're going to observe something like that:

(gdb) break "/home/spetrunia/dev/mysql-6.0-opt-look/storage/falcon/Index.cpp:477"
  Breakpoint 9 at 0x869f2c7: file Index.cpp, line 477.
(gdb) c
  Continuing.
  
  Breakpoint 9, Index::scanIndex (this=0xb6873780, lowKey=0xb6bbfc54, highKey=0xb6bc26a4, searchFlags=1, transaction=0xb685dce8, bitmap=0xb6858308) at Index.cpp:477
(gdb) p bitmap->count
  $151 = 1

## Good so far

(gdb) c
  Continuing.
  
  Breakpoint 9, Index::scanIndex (this=0xb6873780, lowKey=0xb6bbfc54, highKey=0xb6bc26a4, searchFlags=1, transaction=0xb685dce8, bitmap=0xb6858308) at Index.cpp:477
(gdb) p bitmap->count
  $152 = 131070

Now what's that? The bitmap has 131K elements. Considering that
- The query returns 1K rows 
- query's WHERE condition is satisfied for every record returned by the range scan, that is, the range scan really scans 1K rows. 

this is too much?

Suggested fix:
Is this a partial index scan problem?
[15 Jan 2009 17:02] Sergey Petrunya
The query

Attachment: bug42136-query.sql (, text), 3.88 KiB.

[15 Jan 2009 21:21] Kevin Lewis
Vlad, Can you verify if this is a problem in Falcon, or what is going on here?
[16 Jan 2009 13:54] Vladislav Vaintroub
I can verify it. Looks like only bitmap->count==13170 only on second breakpoint hit
[21 Jan 2009 20:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/63758

2965 Vladislav Vaintroub	2009-01-21
      Bug#42136: Falcon: odd behavior for range query on index prefix when MRR is
      disabled
      
      Problem: Falcon does not search efficiently for prefix queries (multi-
      segment indexes). 
      Index scanning  produces a lot of false positives, that are filtered 
      out in the post-processing step, which negatively affects performance.
      
      Analysis:
      In the bug description, keys are numeric and and Falcon is asked to retrieve
      all keys where the first segment corresponds to 2. Falcon would construct a 
      prefix 0x02c0 (0x02 is segment byte. 0xc0 is encoded 2) and use this prefix 
      for scanning.
      
      The problem here is that prefix is too short - keys build from integers 
      in the [2, 131071] range will all start with the same prefix 0x02c0.
      
      Solution:
      Since we are using multi-segment keys for search, we can construct a better 
      prefix. Taking into account rules for encoding of multi-segment key, we can 
      padd our key with 0x0 to the length that is multiple of 6, and add segment 
      byte for the next segment. So instead of searching for "0x02c0" prefix we will
      be be searching for "0x02c00000000001" which yields a much better result.
[21 Jan 2009 20:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/63759

2965 Vladislav Vaintroub	2009-01-21
      Bug#42136: Falcon: odd behavior for range query on index prefix when MRR is disabled 
      
      Problem:
      Falcon does not search efficiently for prefix queries (multi-segment indexes). Index scanning  produces a lot of false positives, that are filtered out in the post-processing step, which negatively affects performance.
      
      Analysis:
      In the bug description, keys are numeric and and Falcon is asked to retrieve all keys where the first segment corresponds to 2. Falcon would construct a prefix 0x02c0 (0x02 is segment byte. 0xc0 is encoded 2) and use this prefix for scanning. 
      But that  prefix is too short - keys build from integers in the [2, 131071] range will all start with the same prefix 0x02c0.
      
      Solution:
      Since we are using multi-segment keys for search, we can construct a better prefix. Taking into account rules for encoding of multi-segment key, we can padd our key with 0x0 to the length that is multiple of 6, and add segment  byte for the next segment. So instead of searching for "0x02c0" prefix we will be be searching for "0x02c00000000001" which yields a much better result.
[22 Jan 2009 17:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/63822

2965 Vladislav Vaintroub	2009-01-22
      Bug#42136: Falcon: odd behavior for range query on index prefix when MRR is disabled 
      
      Problem:
      Falcon does not search efficiently for prefix queries (multi-segment indexes). Index scanning  produces a lot of false positives, that are filtered out in the post-processing step, which negatively affects performance.
      
      Analysis:
      In the bug description, keys are numeric and and Falcon is asked to retrieve all keys where the first segment corresponds to 2. Falcon would construct a prefix 0x02c0 (0x02 is segment byte. 0xc0 is encoded 2) and use this prefix for scanning. 
      But that  prefix is too short - keys build from integers in the [2, 131071] range will all start with the same prefix 0x02c0.
      
      Solution:
      Since we are using multi-segment keys for search, we can construct a better prefix. Taking into account rules for encoding of multi-segment key, we can padd our key with 0x0 to the length that is multiple of 6, and add segment  byte for the next segment. So instead of searching for "0x02c0" prefix we will be be searching for "0x02c00000000001" which yields a much better result.
[22 Jan 2009 18:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/63829

2965 Vladislav Vaintroub	2009-01-22
      Bug#42136: Falcon: odd behavior for range query on index prefix when MRR is disabled 
      
      Problem:
      Falcon does not search efficiently for prefix queries (multi-segment indexes). Index scanning  produces a lot of false positives, that are filtered out in the post-processing step, which negatively affects performance.
      
      Analysis:
      In the bug description, keys are numeric and and Falcon is asked to retrieve all keys where the first segment corresponds to 2. Falcon would construct a prefix 0x02c0 (0x02 is segment byte. 0xc0 is encoded 2) and use this prefix for scanning. 
      But that  prefix is too short - keys build from integers in the [2, 131071] range will all start with the same prefix 0x02c0.
      
      Solution:
      Since we are using multi-segment keys for search, we can construct a better prefix. Taking into account rules for encoding of multi-segment key, we can padd our key with 0x0 to the length that is multiple of 6, and add segment  byte for the next segment. So instead of searching for "0x02c0" prefix we will be be searching for "0x02c00000000001" which yields a much better result.
[13 Feb 2009 7:24] Bugs System
Pushed into 6.0.10-alpha (revid:alik@sun.com-20090211182317-uagkyj01fk30p1f8) (version source revid:christopher.powers@sun.com-20090122185219-mwcqul7nbx6krmwr) (merge vers: 6.0.10-alpha) (pib:6)
[23 Apr 2009 1:13] Paul DuBois
This fix also fixes Bug#41890.
[23 Apr 2009 1:20] Paul DuBois
Noted in 6.0.10 changelog.

For Falcon tables, range queries using an index prefix were slow when
Multi-Range Read index scans were disabled.