Bug #35393 MySQL Cluster condition pushdown sends full length for VAR* types
Submitted: 18 Mar 2008 12:45 Modified: 2 Jul 2008 18:53
Reporter: Frazer Clement Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S5 (Performance)
Version:5.1 OS:Any
Assigned to: Frazer Clement CPU Architecture:Any

[18 Mar 2008 12:45] Frazer Clement
Description:
MySQL Cluster condition pushdown uses NdbScanFilter to describe conditions that rows must meet in terms of comparisons with column value constants.

Where the column value constants are of variable length type (VARCHAR, VARBINARY), the NDB handler passes a value which is padded to the maximum fieldlength.  Where the actual length of the VAR column is less than the maximum, this results in useless data being sent to the NDBD nodes.

This has the effect of wasted CPU in memcpy, wasting network bandwidth and, given that the interpreted program has a finite length, after which no condition pushdown is applied, it can result in condition pushdown being prematurely disabled.

How to repeat:
1 ) Create a table with a non-indexed large VARCHAR column (say 5000 bytes).
2 ) SELECT from the table with an IN clause listing a set of short values which this column's value must be in.
    - Check with EXPLAIN that condition pushdown is being used.
    - Maximum program size is 200000 bytes.  This could be enough for an IN list of around 40 max-length-padded (5000 byte) column constants.  Try 80 short constants (i.e. IN ('00', '01', '02', '03', '04', .... '77', '78', '79)).
    - Where the maximum column length is sent, this will be too large for a pushed-down condition, and pushdown will not be used (verify with --debug trace)
    - Where the actual data length is sent, this should comfortably fit.
    - Depending on the underlying data size, the difference between applying condition pushdown and not may be evident in the query execution time.

Suggested fix:
Potential workaround : 
  Modify ha_ndbcluster_cond.cc to provide an appropriate length value (not maximum field length) for VAR* types.  Length should include VAR* length bytes.

Better fix  :
  Modify NdbInterpretedCode to only send significant data for VAR* types despite length value passed in.
[22 May 2008 16:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/46962

ChangeSet@1.2599, 2008-05-22 17:22:34+01:00, frazer@forth.ndb.mysql.com +4 -0
  Bug # 35393 Full size of VAR* types is sent in interpreted programs
  
  Only send significant bytes of VAR* values.
[22 May 2008 16:41] Bugs System
Pushed into 5.1.23-ndb-6.4.0
[22 May 2008 16:42] Bugs System
Pushed into 5.1.24-ndb-6.3.15
[22 May 2008 16:45] Bugs System
Pushed into 5.1.24-ndb-6.2.16
[2 Jul 2008 18:53] Jon Stephens
Documented in the NDB 6.2.16 and 6.3.15 changelogs as follows:

        Where column values to be compared in a query were of the
        VARCHAR or VARBINARY types, the NDB handler passed a value
        padded to the full size of the column, which caused unnecessary
        data to be sent to the data nodes. This also had the effect of
        wasting CPU and network bandwidth, and causing condition
        pushdown to be disabled where it could otherwise be applied.
[12 Dec 2008 23:30] Bugs System
Pushed into 6.0.6-alpha  (revid:sp1r-frazer@forth.ndb.mysql.com-20080522162234-23493) (version source revid:jonas@mysql.com-20080808094047-4e1yiarqa2t3opg3) (pib:5)