Bug #46144 Falcon indexes may become corrupt
Submitted: 13 Jul 2009 10:32 Modified: 26 May 2010 17:53
Reporter: Lars-Erik Bjørk Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:falcon-team OS:Any
Assigned to: Lars-Erik Bjørk CPU Architecture:Any

[13 Jul 2009 10:32] Lars-Erik Bjørk
Description:
As part of stabilizing the LIMIT feature (bug#42915 "Falcon "umbrella"
bug for LIMIT issues in PB2 RQG tests"), I have been working on the
following two bugs:

bug#44783 "A SELECT using ORDER BY and LIMIT sometimes returns too many rows" 
bug#45086 "A SELECT using ORDER BY and LIMIT sometimes misses one row"

This bug is opened to better track the real reason for these two
bugs, as the problem is not related to the LIMIT feature.

After fixing many errors in how we traverse the index structures
during a LIMIT query, we still see discrepancies in the number of rows
returned from a query. In every occurrence of these two bugs that I
have analyzed, the error is however not in how we traverse the indexes
during a LIMIT query, but in the actual index structures. There are
out of order duplicate nodes, and sometimes there is a node
missing. 

The duplicate nodes are only visible when using LIMIT, whereas the
missing node may also be seen during a regular query. The fact that
the index is corrupt is however not related to the LIMIT feature.

I have added debug printouts in IndexWalker::getValidateRecord, which
in this case is called for every leaf node in the index. The output is
piped through a Perl script to make it easier to extract the useful
information.

Here is an example of a run of bug#44783, the results changes for
every run of the test:

SELECT * FROM `E` WHERE `int_key` < 192 returns different result when
executed with predicate 'ORDER BY `int_key` LIMIT 1073741824' (1000
vs. 1129 rows)

129 rows too many are returned by the query. The Perl output looks
like this.

-------------------

dup:  Number of records recognized as duplicates
nrec: Number of nodes with no corresponding record
ncan: Number of nodes with no corresponding candidate record
nmis: Number of nodes who's key do not match the record value
nret: Number of nodes who pass as valid nodes and refer to a valid record

Page:137,  dup:0,  nrec:0,   ncan0,   nmis::1,       nret:0         
Page:146,  dup:0,  nrec:0,   ncan0,   nmis::1025,    nret:0         
Page:234,  dup:0,  nrec:0,   ncan0,   nmis::974,     nret:0         
Page:140,  dup:0,  nrec:0,   ncan0,   nmis::642,     nret:0         
Page:136,  dup:0,  nrec:0,   ncan0,   nmis::1001,    nret:0         
Page:156,  dup:0,  nrec:0,   ncan0,   nmis::519,     nret:0         
Page:199,  dup:0,  nrec:0,   ncan0,   nmis::511,     nret:0         
Page:173,  dup:0,  nrec:0,   ncan0,   nmis::512,     nret:0         
Page:175,  dup:0,  nrec:0,   ncan0,   nmis::803,     nret:0         
Page:138,  dup:0,  nrec:0,   ncan0,   nmis::501,     nret:0         
Page:212,  dup:0,  nrec:0,   ncan0,   nmis::155,     nret:0         
Page:147,  dup:0,  nrec:0,   ncan0,   nmis::112,     nret:0         
Page:215,  dup:0,  nrec:0,   ncan0,   nmis::1001,    nret:0         
Page:216,  dup:0,  nrec:0,   ncan0,   nmis::519,     nret:0         
Page:241,  dup:0,  nrec:0,   ncan0,   nmis::892,     nret:129   (valid)
Page:213,  dup:0,  nrec:0,   ncan0,   nmis::0,       nret:0     (empty)
Page:150,  dup:0,  nrec:0,   ncan0,   nmis::0,       nret:706   (valid)
Page:202,  dup:0,  nrec:0,   ncan0,   nmis::221,     nret:294   (valid)
Page:157,  dup:1,  nrec:0,   ncan0,   nmis::1000,    nret:0         
Page:164,  dup:1,  nrec:0,   ncan0,   nmis::778,     nret:0         
Page:186,  dup:0,  nrec:0,   ncan0,   nmis::513,     nret:0         
Page:246,  dup:1,  nrec:0,   ncan0,   nmis::499,     nret:0         
Page:174,  dup:0,  nrec:0,   ncan0,   nmis::510,     nret:0         
Page:177,  dup:1,  nrec:0,   ncan0,   nmis::502,     nret:0         
Page:178,  dup:0,  nrec:0,   ncan0,   nmis::509,     nret:0         
Page:220,  dup:1,  nrec:0,   ncan0,   nmis::509,     nret:0         
Page:221,  dup:0,  nrec:0,   ncan0,   nmis::502,     nret:0         
Page:264,  dup:1,  nrec:0,   ncan0,   nmis::509,     nret:0         
Page:265,  dup:1,  nrec:0,   ncan0,   nmis::955,     nret:0         
Page:247,  dup:1,  nrec:0,   ncan0,   nmis::987,     nret:0         
Page:149,  dup:0,  nrec:0,   ncan0,   nmis::0,       nret:0     (empty)
Page:142,  dup:0,  nrec:0,   ncan0,   nmis::1,       nret:0         
Page:162,  dup:0,  nrec:0,   ncan0,   nmis::1,       nret:0         
Page:143,  dup:0,  nrec:0,   ncan0,   nmis::519,     nret:0         
Page:163,  dup:1,  nrec:0,   ncan0,   nmis::668,     nret:0         
Page:151,  dup:1,  nrec:0,   ncan0,   nmis::999,     nret:0         
Page:158,  dup:0,  nrec:0,   ncan0,   nmis::513,     nret:0         
Page:263,  dup:1,  nrec:0,   ncan0,   nmis::1005,    nret:0         
Page:171,  dup:1,  nrec:0,   ncan0,   nmis::546,     nret:0         
Page:139,  dup:0,  nrec:0,   ncan0,   nmis::0,       nret:0     (empty)
Page:165,  dup:0,  nrec:0,   ncan0,   nmis::513,     nret:0         
Page:182,  dup:1,  nrec:0,   ncan0,   nmis::616,     nret:0         
Page:166,  dup:0,  nrec:0,   ncan0,   nmis::503,     nret:0         
Page:160,  dup:1,  nrec:0,   ncan0,   nmis::518,     nret:0         
Page:161,  dup:0,  nrec:0,   ncan0,   nmis::503,     nret:0         
Page:249,  dup:1,  nrec:0,   ncan0,   nmis::516,     nret:0         
Page:181,  dup:1,  nrec:0,   ncan0,   nmis::1001,    nret:0         
Page:250,  dup:1,  nrec:0,   ncan0,   nmis::931,     nret:0         
Page:141,  dup:0,  nrec:0,   ncan0,   nmis::0,       nret:0     (empty)
Page:193,  dup:0,  nrec:0,   ncan0,   nmis::513,     nret:0         
Page:170,  dup:1,  nrec:0,   ncan0,   nmis::499,     nret:0         
Page:172,  dup:1,  nrec:0,   ncan0,   nmis::987,     nret:0         
Page:194,  dup:0,  nrec:0,   ncan0,   nmis::519,     nret:0         
Page:154,  dup:1,  nrec:0,   ncan0,   nmis::583,     nret:0         
Page:180,  dup:0,  nrec:0,   ncan0,   nmis::519,     nret:0         
Page:155,  dup:1,  nrec:0,   ncan0,   nmis::509,     nret:0         
Page:159,  dup:0,  nrec:0,   ncan0,   nmis::512,     nret:0         
Page:167,  dup:2,  nrec:0,   ncan0,   nmis::942,     nret:0         
Page:153,  dup:0,  nrec:0,   ncan0,   nmis::1029,    nret:0         
Page:169,  dup:1,  nrec:0,   ncan0,   nmis::988,     nret:0         
Page:168,  dup:1,  nrec:0,   ncan0,   nmis::497,     nret:0         

Total number of pages: 61
Number of empty pages: 4
Number of pages with valid entries: 3

-------------------

In this test, the indexed field has the same value for all records.
Pages 241, 150 and 202 contain "valid" nodes. Page 241 contains 129
nodes, and it is tempting to assume that these are the 129 duplicates.
In the run showed here, it is only an empty page that separates the
duplicate nodes from the valid nodes. But often there are several
pages, containing nodes with completely different key values.

Page 241 contains nodes for record numbers 0 plus [78-205], whereas
pages 150 and 202 combined contains nodes for record numbers [0-999].
Because the indexed field 'int_key' has the same value for all the
records, namely '9', the nodes should be sorted on record
number. This is not the case, not even for pages 150 and 202. Page
150 contains the nodes [0-514] and [809-999], whereas page 202
contains the missing nodes in the middle range, [515-808]. For some
reason the nodes appear out-of-order.

For bug#45086 we can see that a single node is missing, in the middle
of a page. To keep the length of this text down, I will not include an
example of this here.

Possible theories so far:
-------------------------

* Indexes are sorted incorrectly when merging deferred indexes into
  the main index
* A page is split twice by two different threads in a conflicting way
* A deferred index is corrupt
* Garbage collection does not work as intended

So far I have not come any closer to why this is happening

How to repeat:
As specified in the bug reports for

bug#44783 "A SELECT using ORDER BY and LIMIT sometimes returns too many rows" 
bug#45086 "A SELECT using ORDER BY and LIMIT sometimes misses one row"

these two scenarios are both reproduced by running a modified version
of falcon_nolimit_int.yy that looks like this:

query:
        select | dml | dml | dml | dml | dml ;

dml:
        update ;

select:
        SELECT * FROM _table where;

where:
        |
        WHERE _field sign value ;

sign:
        < ;

insert:
        INSERT INTO _table ( _field , _field ) VALUES ( value , value ) ;

update:
        UPDATE _table AS X SET _field = value where ;

value:
        _digit | _tinyint_unsigned ;

# Use only indexed fields:

_field:
        `int_key` ;

_table
        `E` ;

The test can be run with the following command from the gentest directory, given you have branched the mysql-test-extra-6.0 tree:

perl runall.pl \
--basedir=<path_to_your_basedir> \
--mysqld=--loose-innodb-lock-wait-timeout=1 \
--mysqld=--table-lock-wait-timeout=1 \
--mysqld=--loose-falcon-lock-wait-timeout=1 \
--mysqld=--loose-falcon-debug-mask=2 \
--mysqld=--skip-safemalloc \
--grammar=conf/falcon_nolimit_int.yy \
--threads=1 
--validator=Limit \ 
--reporters=Deadlock,ErrorLog,Backtrace,WinPackage \
--duration=1200 \
--vardir=/tmp/vardir \
--mysqld=--log-output=file \
--queries=100000 \
--engine=falcon