Bug #44795 Falcon: performance regression for range queries
Submitted: 11 May 2009 15:57 Modified: 26 May 2010 17:53
Reporter: Alexey Stroganov Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0.11 OS:Any
Assigned to: CPU Architecture:Any

[11 May 2009 15:57] Alexey Stroganov
Description:
While analyzing results of mysql-bench test for 6.0.11 I observed notable drop in performance (10-40%) for numerous select queries. Further analysis showed that most of the affected queries are range ones. The very similar bug was fixed in 6.0.10(See initial report for 6.0.9 - BUG#41980 and isolated issue by Sergey Petrunia BUG#42136) and it looks that issue re-appears again. 

some results with test-case from bug#41890:

perl bug_select_mrr_41980.pl --db-socket=/tmp/609.sock --subselect_in
MySQL 6.0.9 alpha, ENGINE: Falcon
Inner table size: 10
optimizer_use_mrr=force

QUERY:SELECT * FROM bench1 WHERE id IN (SELECT id FROM bench2)
Time for subselect_in (100:1000): 9.46644 wallclock secs 

perl bug_select_mrr_41980.pl --db-socket=/tmp/6011.sock --subselect_in
MySQL 6.0.11 alpha, ENGINE: Falcon
Inner table size: 10
optimizer_use_mrr=force

QUERY:SELECT * FROM bench1 WHERE id IN (SELECT id FROM bench2)
Time for subselect_in (100:1000): 12.2152 wallclock secs 

How to repeat:
1. Start server 
2. run test script from BUG#41890(bug_select_mrr_41980.pl) as following to create a table and populate it with data 

perl bug_select_mrr_41980.pl --sb-socket=<socket> --create

3. run test case with any set of queries: 
subselect_in, select_join_in, select, select_in, subselect_exists

perl bug_select_mrr_41980.pl --sb-socket=<socket> --subselect_in
[11 May 2009 16:26] Vladislav Vaintroub
The reason for the issue is basically the same as in BUG#42136 (checked with the debugger). Unfortunately it came with the introduction of the new multisegment index format.

New multisegment format creates keys as following
<part1>0x00<part2>0x00...<partN>

When searching for partially defined key with a single part only, Falcon would create the key in <part1> form and do a partial search.

If part1 is short and has common prefix,e.g 0x02c0 (see comment in fix for  Bug#42136), there are multiple false positives found.

To fix, one would need to change partial search logic a little bit. If one searches for partial key in multisegment index, one needs to search for 
- Either exact match for <part1>.This case corresponds to index entry where all other after first one are NULLs. 
- Prefix match for <part1>0x00. This case corresponds to index entry where there is at least one component after first that is not NULL.

Practically it could be solved with by e.g either doing 2 exact searches for <part1> and <part1>0x00 and merging resulting bitmaps

Alternatively searching for range
low = <part1>, high=<part1>0x00ffffff......ff (length of the high key is max allowed index length) *without* partial flag would do the trick.

Alternatively, there might be another strategy like defining PartialMultisegment key that finds all the entries as described above and no false positives.
[11 May 2009 16:31] Vladislav Vaintroub
Correction for previos comment:

"Practically it could be solved with by e.g either doing 2 exact searches for <part1> and <part1>0x00 and merging resulting bitmaps".

This should be read as
Practically it could be solved with by e.g either doing 2 searches:
a) exact match for <part1> and b) prefix match for <part1>0x00 
and merging resulting bitmaps.
[12 May 2009 10:26] Vladislav Vaintroub
I think I got even better idea for constructing low/high boundaries for exact match in multisegment keys where only some index components are defined:

- create lowKey as usual
- create highKey as <lowKey>0x01.
- search in (lowKey, highKey) range _not_ using the Partial flag

This will find exactly the required entries - exact lowKey if present plus <lowKey>0x00<possibleTrailingPart>.
[12 May 2009 10:26] Vladislav Vaintroub
I think I got even better idea for constructing low/high boundaries for exact match in multisegment keys where only some index components are defined:

- create lowKey as usual
- create highKey as <lowKey>0x01.
- search in (lowKey, highKey) range _not_ using the Partial flag

This will find exactly the required entries - exact lowKey if present plus <lowKey>0x00<possibleTrailingPart>.