Bug #35393 MySQL Cluster condition pushdown sends full length for VAR* types
Submitted: 18 Mar 2008 13:45 Modified: 2 Jul 2008 20:53
Reporter: Frazer Clement
Status: Closed
Category:Server: Cluster Severity:S5 (Performance)
Version:5.1 OS:Any
Assigned to: Frazer Clement Target Version:

[18 Mar 2008 13:45] Frazer Clement
Description:
MySQL Cluster condition pushdown uses NdbScanFilter to describe conditions that rows must
meet in terms of comparisons with column value constants.

Where the column value constants are of variable length type (VARCHAR, VARBINARY), the
NDB handler passes a value which is padded to the maximum fieldlength.  Where the actual
length of the VAR column is less than the maximum, this results in useless data being
sent to the NDBD nodes.

This has the effect of wasted CPU in memcpy, wasting network bandwidth and, given that
the interpreted program has a finite length, after which no condition pushdown is
applied, it can result in condition pushdown being prematurely disabled.

How to repeat:
1 ) Create a table with a non-indexed large VARCHAR column (say 5000 bytes).
2 ) SELECT from the table with an IN clause listing a set of short values which this
column's value must be in.
    - Check with EXPLAIN that condition pushdown is being used.
    - Maximum program size is 200000 bytes.  This could be enough for an IN list of
around 40 max-length-padded (5000 byte) column constants.  Try 80 short constants (i.e.
IN ('00', '01', '02', '03', '04', .... '77', '78', '79)).
    - Where the maximum column length is sent, this will be too large for a pushed-down
condition, and pushdown will not be used (verify with --debug trace)
    - Where the actual data length is sent, this should comfortably fit.
    - Depending on the underlying data size, the difference between applying condition
pushdown and not may be evident in the query execution time.

Suggested fix:
Potential workaround : 
  Modify ha_ndbcluster_cond.cc to provide an appropriate length value (not maximum field
length) for VAR* types.  Length should include VAR* length bytes.

Better fix  :
  Modify NdbInterpretedCode to only send significant data for VAR* types despite length
value passed in.
[22 May 2008 18:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/46962

ChangeSet@1.2599, 2008-05-22 17:22:34+01:00, frazer@forth.ndb.mysql.com +4 -0
  Bug # 35393 Full size of VAR* types is sent in interpreted programs
  
  Only send significant bytes of VAR* values.
[22 May 2008 18:41] Bugs System
Pushed into 5.1.23-ndb-6.4.0
[22 May 2008 18:42] Bugs System
Pushed into 5.1.24-ndb-6.3.15
[22 May 2008 18:45] Bugs System
Pushed into 5.1.24-ndb-6.2.16
[2 Jul 2008 20:53] Jon Stephens
Documented in the NDB 6.2.16 and 6.3.15 changelogs as follows:

        Where column values to be compared in a query were of the
        VARCHAR or VARBINARY types, the NDB handler passed a value
        padded to the full size of the column, which caused unnecessary
        data to be sent to the data nodes. This also had the effect of
        wasting CPU and network bandwidth, and causing condition
        pushdown to be disabled where it could otherwise be applied.
[13 Dec 2008 0:30] Bugs System
Pushed into 6.0.6-alpha  (revid:sp1r-frazer@forth.ndb.mysql.com-20080522162234-23493)
(version source revid:jonas@mysql.com-20080808094047-4e1yiarqa2t3opg3) (pib:5)