MySQL Bugs: #35835: Cluster: BKA Join Buffer returns extra result rows.

Bug #35835	Cluster: BKA Join Buffer returns extra result rows.
Submitted:	4 Apr 2008 16:51	Modified:	20 Nov 2010 23:33
Reporter:	Jonathan Miller	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S2 (Serious)
Version:	mysql-6.0-bka-preview	OS:	Linux
Assigned to:	Igor Babaev	CPU Architecture:	Any

Description:
Hi,

This is a weird one. I have tried to break it down the best I can, yet modifing the test file at all makes this go away.

With the attached test file if ran with BKA on cluster you get an extra row of data.

* if you run same test using myisam, extra row is not there
* if you modify the test file (i.e. remove some tests before offending query) and use cluster w/ bka, extra row is not there
* if you run same test w/ cluster but turn of BKA, extra row is not there.
* if you run same test w/ cluster in 5.0.60, extra row is not there.
* if you run same test w/ cluster in 5.1.25, extra row is not there.

 1      1       3       2       1       1       NULL    NULL
 2      2       1       2       2       2       0       2
 2      2       3       2       2       2       0       2
-2      2       NULL    NULL    NULL    NULL    NULL    NULL
 3      3       NULL    NULL    NULL    NULL    NULL    NULL

Additionally, with BKA turned off under cluster a different query is missing many rows that are returned with both myisam and Cluster w/BKA

 SELECT a, b, c FROM t1 LEFT JOIN (t2, t3) ON b < 3 and b = c;
 a      b       c
-0      NULL    NULL
-1      NULL    NULL
-10     NULL    NULL
-11     NULL    NULL
-12     NULL    NULL
-13     NULL    NULL
-14     NULL    NULL
-15     NULL    NULL
-16     NULL    NULL
-17     NULL    NULL
-18     NULL    NULL
-19     NULL    NULL
-2      NULL    NULL
-3      NULL    NULL
-4      NULL    NULL
-5      NULL    NULL
-6      NULL    NULL
-7      NULL    NULL
-8      NULL    NULL
-9      NULL    NULL

 

How to repeat:
Use the attached test file.

Set engine_type to ndbcluster and run:

./mysql-test-run.pl --do-test=ndb_new --with-ndbcluster

or

Set engine_type to myisam and run:

./mysql-test-run.pl --do-test=ndb_new

Test Case

Attachment: ndb_new.test (application/octet-stream, text), 31.60 KiB.

Cluster Results BKA Clone w/ BKA == on

Attachment: ndb_new.result (application/octet-stream, text), 30.09 KiB.

Reject file from BKA clone using cluster w/ BKA == off

Attachment: ndb_new.reject (application/octet-stream, text), 29.83 KiB.

Cluster 5.0 Reject File

Attachment: ndb_new.5.0.reject (application/octet-stream, text), 33.60 KiB.

Cluster 5.1 Reject File

Attachment: ndb_new.5.1.reject (application/octet-stream, text), 33.62 KiB.

Hi, I spent time retesting this today on the bka-preview clone (thanks to Igor for sending me the information on the join_cache_level).

The following query should return 2 rows for t5a and t5b (1,1), 2 rows for t5a and t5b (2,2) and one row for t5a and t5b (3,3), yet under BKA using cluster as describled when the bug was opened, it returns an additional row for t5a and t5b (2,2) with nulls for the rest of the columns

Data:
INSERT INTO t5 VALUES (1,1,0), (2,2,0), (3,3,0);
INSERT INTO t6 VALUES (1,2,0), (3,2,0), (6,1,0);
INSERT INTO t7 VALUES (1,1,0), (2,2,0);
INSERT INTO t8 VALUES (0,2,0), (1,2,0);

Query
SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
FROM t5 LEFT JOIN((t6, t7)LEFT JOIN t8 ON t7.b=t8.b AND t6.b < 10)
ON t6.b >= 2 AND t5.b=t7.b  AND (t8.a < 1 OR t8.c IS NULL);

Query (join_cache_level=6)
a       b       a       b       a       b       a       b
1       1       1       2       1       1       NULL    NULL
1       1       3       2       1       1       NULL    NULL
2       2       1       2       2       2       0       2
2       2       3       2       2       2       0       2
->2       2       NULL    NULL    NULL    NULL    NULL    NULL
3       3       NULL    NULL    NULL    NULL    NULL    NULL

myisam results:
+-----+------+------+------+------+------+------+------+
| t5a | t5b  | t6a  | t6b  | t7a  | t7b  | t8a  | t8b  |
+-----+------+------+------+------+------+------+------+
|   1 |    1 |    1 |    2 |    1 |    1 | NULL | NULL |
|   1 |    1 |    3 |    2 |    1 |    1 | NULL | NULL |
|   2 |    2 |    1 |    2 |    2 |    2 |    0 |    2 |
|   2 |    2 |    3 |    2 |    2 |    2 |    0 |    2 |
|   3 |    3 | NULL | NULL | NULL | NULL | NULL | NULL |
+-----+------+------+------+------+------+------+------+

So it returns two rows for 1 and 2 due to matching values, and following the left join rule it returns the last row out of t5 eventhough there are no matching values.

The result of a left outer join for tables A and B always contains all records of the "left" table (A), even if the join-condition does not find any matching record in the "right" table (B). This means that if the ON clause matches 0 (zero) records in B, the join will still return a row in the result—but with NULL in each column from B. This means that a left outer join returns all the values from the left table, plus matched values from the right table (or NULL in case of no matching join predicate).

+SET join_cache_level=5;
+SHOW VARIABLES LIKE "%join_cache%";
+Variable_name  Value
+join_cache_level       5
 DROP TABLE IF EXISTS t0,t1,t2,t3,t4,t5,t6,t7,t8,t9;
 ******************
 ** Join Test #1 **
@@ -299,7 +303,6 @@
 1      1       3       2       1       1       NULL    NULL
 2      2       1       2       2       2       0       2
 2      2       3       2       2       2       0       2
-2      2       NULL    NULL    NULL    NULL    NULL    NULL
 3      3       NULL    NULL    NULL    NULL    NULL    NULL

+SET join_cache_level=4;
+SHOW VARIABLES LIKE "%join_cache%";
+Variable_name  Value
+join_cache_level       4
 DROP TABLE IF EXISTS t0,t1,t2,t3,t4,t5,t6,t7,t8,t9;
 ******************
 ** Join Test #1 **
@@ -299,7 +303,6 @@
 1      1       3       2       1       1       NULL    NULL
 2      2       1       2       2       2       0       2
 2      2       3       2       2       2       0       2
-2      2       NULL    NULL    NULL    NULL    NULL    NULL
 3      3       NULL    NULL    NULL    NULL    NULL    NULL

Updated results of BKA cluster with JCL=6

Attachment: ndb_new-jcl6.result (application/octet-stream, text), 30.18 KiB.

Reject file from BKA using cluster JLC = 5

Attachment: ndb_new-jcl5.reject (application/octet-stream, text), 33.76 KiB.

Reject file from BKA using cluster JLC = 4

Attachment: ndb_new-jcl4.reject (application/octet-stream, text), 33.50 KiB.

Reject file from BKA using cluster JLC = 3

Attachment: ndb_new-jcl3.reject (application/octet-stream, text), 33.74 KiB.

Reject file from BKA using cluster JLC = 2

Attachment: ndb_new-jcl2.reject (application/octet-stream, text), 33.74 KiB.

Reject file from BKA using cluster JLC = 1

Attachment: ndb_new-jcl1.reject (application/octet-stream, text), 33.74 KiB.

Reject file from BKA using cluster JLC = 0

Attachment: ndb_new-jcl0.reject (application/octet-stream, text), 33.74 KiB.

Reduced test case showing extra row returned

Attachment: ndb_new.test (application/octet-stream, text), 3.63 KiB.

I attached a reduced test case.

The test case does some dummy create table + insert + drop table before the select that returns wrong results. Removing any of these makes the bug go away.

Correct result file for reduced test case

Attachment: ndb_new.result (application/octet-stream, text), 2.29 KiB.

Ok, I found the problem.

The join is executed with four scans, in the order t5, t7, t6, t8.

The scan on t8 is a multi-range-read (MRR) scan. It takes four ranges:

  0: b=1
  1: b=2
  2: b=1
  3: b=2

So note that the ranges are overlapping.

The MRR scan on t8 is marked as _not_ sorted (HA_MRR_SORTED is not set). Because 
of this, NDB does not preserve the order of ranges when returning the rows, and
rows are returned like this (in this particular example):

  row 0: range=1 (a,b,c)=(1,2,0)
  row 1: range=3 (a,b,c)=(1,2,0)
  row 2: range=1 (a,b,c)=(0,2,0)
  row 3: range=3 (a,b,c)=(0,2,0)

But the BKA join algorithm in the optimizer seems to expect rows to be returned
in strict range order, ie. all rows from range=1 returned before first row from
range=3. So since NDB does not follow this, we get wrong results.

I hacked the code to force sorted order, and the problem disappears. It makes
sense anyway that the outer join algorithm needs rows in range order to
correctly determine if the extra NULL row for "no match" should be added.

So I think this needs to be fixed in the optimizer / BKA join implementation so
that it sets the HA_MRR_SORTED flag for the MRR scan when rows are required in
strict range order.

A more advanced solution might be to introduce a flag for relaxed sorting,
where the storage engine is required to sort the ranges, but the rows within
a single range need not be sorted. This would allow engines to be more efficient
than a full sort (ie. NDB would only need to do its merge sort on the
range_number field, not on index fields). But it would require more work to
implement, so is really a feature request.

This bug does not seem to be specific for Cluster, nor related to the sort order of the table. By slightly modifying the SQL statement this bug is consistently reproducable with any storage engine. Modified case:

 CREATE TABLE t5 (a int, b int, c int, 
         PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;
 CREATE TABLE t6 (a int, b int, c int, 
         PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;
 CREATE TABLE t7 (a int, b int, c int, 
         PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;
 CREATE TABLE t8 (a int, b int, c int, 
         PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;

INSERT INTO t5 VALUES (1,1,0), (2,2,0), (3,3,0);
INSERT INTO t6 VALUES (1,2,0), (3,2,0), (6,1,0);
INSERT INTO t7 VALUES (1,1,0), (2,2,0);
INSERT INTO t8 VALUES (0,2,0), (1,2,0);
COMMIT;

SELECT t5.a,t5.b,t6.a,t6.b,t7.a,t7.b,t8.a,t8.b
  FROM t5 
       LEFT JOIN 
       ( 
         (t6, t7)
         LEFT JOIN 
         t8
         ON t7.b=t8.b AND t6.b < 10
       )
       ON t6.b >= 2 AND t5.b=t7.b AND
          (t8.a > 0 OR t8.c IS NULL);

Tables t5-t8 are identical to the previous cases. However, the where clause on t8 has been changed from 't8.a < 1' to 't8.a > 0'. When executing this query with 'join_cache_level=6' it incorrectly always returns an extra NULL-complemented row for t5.a=2

+---+------+------+------+------+------+------+------+------+
| a | b    | a    | b    | a    | b    | a    | b    | c    |
+---+------+------+------+------+------+------+------+------+
| 2 |    2 |    1 |    2 |    2 |    2 |    0 |    2 |    0 |
| 2 |    2 |    3 |    2 |    2 |    2 |    0 |    2 |    0 |
| 1 |    1 |    1 |    2 |    1 |    1 | NULL | NULL | NULL |
| 1 |    1 |    3 |    2 |    1 |    1 | NULL | NULL | NULL |
| 2 |    2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | <<<<
| 3 |    3 | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+---+------+------+------+------+------+------+------+------+

I have extracted and simplified another test case from the attached ndb_new.test which seems to be unrelated with the former simplified test case:

drop table t1,t2,t3,t5;
CREATE TABLE t1 (a int, b int, c int, 
PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;
CREATE TABLE t2 (a int, b int, c int, 
PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;
CREATE TABLE t3 (a int, b int, c int, 
PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;
CREATE TABLE t5 (a int, b int, c int, 
PRIMARY KEY(a), KEY b_i (b))ENGINE=myisam;

INSERT INTO t1 VALUES (1,3,0), (3,2,0);
INSERT INTO t2 VALUES (3,3,0);
INSERT INTO t3 VALUES (1,2,0);
INSERT INTO t5 VALUES (1,1,0);
COMMIT;

SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t5.a,t5.b
  FROM t1 LEFT JOIN                
       (t2 LEFT JOIN t3 ON t3.a=1, t5)
       ON (t1.b=t5.b);

Excuting this query with JCL=1 gives the correct result:

+---+------+------+------+------+------+------+------+
| a | b    | a    | b    | a    | b    | a    | b    |
+---+------+------+------+------+------+------+------+
| 1 |    3 | NULL | NULL | NULL | NULL | NULL | NULL |
| 3 |    2 | NULL | NULL | NULL | NULL | NULL | NULL |
+---+------+------+------+------+------+------+------+

When setting JCL=6 we get an empty result set. Furthermore, if the query is slightly rewritten to the equivalent query (Swapping order of (t5, t2 Left J...):

SELECT t1.a,t1.b,t2.a,t2.b,t3.a,t3.b,t5.a,t5.b
  FROM t1 LEFT JOIN                
       (t5, t2 LEFT JOIN t3 ON t3.a=1)
       ON (t1.b=t5.b);

+---+------+------+------+------+------+------+------+
| a | b    | a    | b    | a    | b    | a    | b    |
+---+------+------+------+------+------+------+------+
| 1 |    3 | NULL | NULL |    1 |    2 | NULL | NULL | 
| 3 |    2 | NULL | NULL |    1 |    2 | NULL | NULL | 
+---+------+------+------+------+------+------+------+

Which is completely wrong.....!

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55357

2648 Igor Babaev	2008-10-05
      Fixed several bugs reported in the bug entry of bug #35835.
      All these bugs concerned null complemented rows in the
      result sets of queries with nested outer joins and are 
      closely interconnected.
      
      Added test cases for bug #35835.
      Fixed a wrong result set for one of the previous tests.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55845

2648 Igor Babaev	2008-10-08
      Fixed several bugs reported in the bug entry of bug #35835.
      All these bugs concerned null complemented rows in the
      result sets of queries with nested outer joins and are 
      closely interconnected.
            
      Added test cases for bug #35835.
      Fixed a wrong result set for one of the previous tests.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56498

2655 Igor Babaev	2008-10-17
      Correction for the patch to fix bug #35835.

Pushed into 6.0.8-alpha  (revid:igor@mysql.com-20081018062335-4ohqg0jc20uri49a) (version source revid:igor@mysql.com-20081018062335-4ohqg0jc20uri49a) (pib:5)

Noted in 6.0.8 changelog.

Processing for NULL-complemented rows in the result sets of queries
with nested outer joins could be incorrect.

Correction: This is pushed into 6.0.9.

Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)

Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)

Noted in 5.6.1 changelog.

Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.