Bug #38793 NDBAPI : EQ never sent for Index scan bounds
Submitted: 14 Aug 2008 11:45 Modified: 17 Mar 2009 20:12
Reporter: Frazer Clement Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: NDB API Severity:S5 (Performance)
Version:mysql-5.1-telco-6.2 OS:Any
Assigned to: Frazer Clement CPU Architecture:Any

[14 Aug 2008 11:45] Frazer Clement
Description:
The NdbRecord setBound() API takes an IndexBound structure describing a range comprising lower and upper bounds.
Each bound is described by a row pointer, number of significant columns and an inclusive bit.

An equality (EQ) bound is specified in this structure by :
 - Providing lower and upper bounds pointing to rows with the same values for the significant columns (or pointing to the same row)
 - Providing the same number of significant columns for lower and upper bounds
 - Setting the inclusive bit for lower and upper bounds.

When building the KEYINFO for the SCANTABREQ, the two bounds are transformed into KEYINFO independently, and the EQ boundtype is never sent.

This means that for EQ bounds, twice as much KEYINFO is sent as is necessary.

How to repeat:
Recompile MySQLD with API_TRACE defined
Set API_SIGNAL_LOG to STDOUT

create table scanable (a int primary key, b int) engine=ndb;
alter table scanable add index (b);

insert into scanable values (1,1), (2,1), (3,1), (4,2), (5,2), (6,2), (7,3), (8,4);

mysql> select * from scanable where b=2;
+---+------+
| a | b    |
+---+------+
| 6 |    2 | 
| 4 |    2 | 
| 5 |    2 | 
+---+------+
3 rows in set (0.01 sec)

mysql> 

Trace for SCANTABREQ shows LE and GE bound being sent, both with same KEYINFO : 

---- Send ----- Signal ----------------
r.bn: 245 "DBTC", r.proc: 3, gsn: 32 "SCAN_TABREQ" prio: 1
s.bn: 32773 "API", s.proc: 4, s.sigId: 0 length: 11 trace: 1 #sec: 3 fragInf: 0
 apiConnectPtr: H'00000021 requestInfo: H'00408a02:
  Parallellism: 2 Batch: 64 LockMode: 0 Keyinfo: 0 Holdlock: 0 RangeScan: 1 Descending: 0 TupScan: 0
 ReadCommitted: 1 DistributionKeyFlag: 0 NoDisk: 1 attrLen: 0, keyLen: 0 tableId: 26, tableSchemaVer: 1
 transId(1, 2): (H'0000002d, H'00500400) storedProcId: H'0000ffff
 batch_byte_size: 32768, first_batch_size: 64
 H'00000021 H'00000000 H'00408a02 H'0000001a H'00000001 H'0000ffff H'0000002d
 H'00500400 H'00000020 H'00008000 H'00000040
SECTION 0 type=generic size=2
 H'0000001c H'00000020
SECTION 1 type=generic size=11
 H'00000001 H'00000005 H'00000000 H'00000000 H'00000000 H'fff00002 H'00041017
 H'00010004 H'00000002 H'00000012 H'03830013
SECTION 2 type=generic size=6
 H'00060000 H'00000004 H'00000002 H'00000002 H'00000004 H'00000002

(SECTION 2)

Ideally SECTION 2 should have a single EQ bound, looking something like : 

SECTION 2 type=generic size=3
 H'00030004 H'00000004 H'00000002

Messaging savings would be more significant for longer keys.

Suggested fix:
Modify NdbScanOperation::setBound() to check for the EQ case, and only send a single EQ bound when it is detected.

I suspect that MySQLD generally would pass an EQ bound where the same row* is passed for lower and upper bounds.  This should allow a quick check for EQ in most cases (i.e. check that (row ptrs are equal) && (key counts are equal) && (both are inclusive).

If the row ptrs are not equal, the keys can be bytewise compared, as is done for the partition pruning.
[10 Mar 2009 17:06] Frazer Clement
Potential patch for bug

Attachment: idx_scan_eq_2.patch (text/x-patch), 3.02 KiB.

[10 Mar 2009 17:15] Frazer Clement
Proposed patch :
 - Compares low and high key bound ptrs, counts and inclusivity bits for equality.
 - If they are equal, sends all column bounds with BoundEQ type, only once.

Written against 6.3, could be applied to 6.2+

Benefits :
 - Low overhead to test equality
 - No API changes required
 - Covers 'main' use case of EQ on a set of key columns.

Limitations :
 - Column values themselves are not compared for equality
   - If ptrs are unequal, but values are the same, separate lower and upper values will be sent
   - If key counts are unequal, but values are the same, separate lower and upper values will be sent.

Potential extensions :
 - If there's a compelling use case to send EQ bounds for cases not covered above, some new API can be added to supply :
   - An 'eq_prefix_length' indicating how many leading columns are equal
     : Seems like next-most-useful use case for using EQ?
   - An 'eq_columns_mask' indicating which columns are equal
     : More general purpose than above, harder to use.
[10 Mar 2009 17:19] Jonas Oreland
i think patch is ok,
i would move "bool isEqBound" to where it gets it's value,
i.e const bool isEqBound

i limitations and extensions can be left for later...
but it sounds like a good idea documenting them, 
somewhere, maybe in header or code
[17 Mar 2009 10:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/69402

2880 Frazer Clement	2009-03-17
      Bug#38793 NDBAPI : EQ never sent for Index scan bounds
      
      NdbRecord ordered index scans currently express an EQ range as separate
      lower and upper bounds.  This results in 2 copies of EQ column values
      being sent to the kernel which is inefficient.
      
      This patch detects when an equal range is specified by comparing the
      passed pointers, key lengths and inclusive bits and sending only one
      copy of the equal key columns.  This halves the amount of KeyInfo 
      sent for an EQ range.
      modified:
        storage/ndb/src/ndbapi/NdbScanOperation.cpp
[17 Mar 2009 11:09] Bugs System
Pushed into 5.1.32-ndb-7.0.4 (revid:frazer@mysql.com-20090317110610-j98iyp964fo60z9i) (version source revid:frazer@mysql.com-20090317110610-j98iyp964fo60z9i) (merge vers: 5.1.32-ndb-7.0.4) (pib:6)
[17 Mar 2009 11:09] Bugs System
Pushed into 5.1.32-ndb-6.3.24 (revid:frazer@mysql.com-20090317105107-mazbi3jc552xcqfe) (version source revid:frazer@mysql.com-20090317105107-mazbi3jc552xcqfe) (merge vers: 5.1.32-ndb-6.3.24) (pib:6)
[17 Mar 2009 11:11] Bugs System
Pushed into 5.1.32-ndb-6.2.18 (revid:frazer@mysql.com-20090317104838-3vcb6kalv8j0ua3d) (version source revid:frazer@mysql.com-20090317104838-3vcb6kalv8j0ua3d) (merge vers: 5.1.32-ndb-6.2.18) (pib:6)
[17 Mar 2009 20:12] Jon Stephens
Documented bugfix in the NDB-6.2.18, 6.3.24, and 7.0.4 changelogs as follows:

        Ordered index scans using NdbRecord formerly expressed an EQ
        range as separate lower and upper bounds, resulting in 2 copies
        of EQ column values being sent to the NDB kernel.

        Now, when a range is specified by NdbScanOperation::setBound(),
        the passed pointers, key lengths, and inclusive bits are
        compared, and only one copy of the equal key columns is sent to
        the kernel. This makes makes such operation more efficient, as
        half the amount of KeyInfo is now sent for an EQ range as before.