Bug #18895 BIT values cause joins to fail
Submitted: 7 Apr 2006 19:39 Modified: 22 Jun 2006 18:01
Reporter: Baron Schwartz (Basic Quality Contributor)
Status: Closed
Category:Server Severity:S2 (Serious)
Version:5.0.23-BK, 5.0.15-log OS:Linux (Gentoo Linux)
Assigned to: Georgi Kodinov Target Version:

[7 Apr 2006 19:39] Baron Schwartz
Description:
BIT columns in a table cause joins to fail.  The failure depends on the position of the
column within the table, other columns in the table (even those not used in the query at
all), other rows of data (yes, a row will be joined or not joined depending on the
presence of other data in the table which should not interact in the join at all).  I
have found a variety of ways in which the behavior is incorrect.

I surmise the way the BIT data is stored in the table may be crossing boundaries into
other columns' storage.

How to repeat:
I will attach a detailed test case.  I can repeat this bug on AMD64 and Intel 32-bit
architectures.
[7 Apr 2006 19:39] Baron Schwartz
Detailed test case

Attachment: reproduce.sql (text/x-sql), 2.88 KiB.

[10 Apr 2006 13:44] Valeriy Kravchuk
Thank you for a problem report. I've got the following results from your test:

mysql> select left1.col1 as col1,
    -> left1.col2 + 0 as col2,
    -> left1.col3 + 0 as col3,
    -> right1.col1 as col1,
    -> right1.col2 + 0 as col2,
    -> right1.col3 as col3,
    -> right1.col4 as col4
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left1.col2 <> 1
    -> order by left1.col1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
| 1    | 0    | 1    | NULL | NULL | NULL |      |
| 2    | 0    | 1    | NULL | NULL | NULL |      |
+------+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left2
    ->     left outer join right1 on left2.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left2.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      | 1    |      |
+------+------+------+------+------+------+------+
1 row in set (0.00 sec)

mysql> select *
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         -- and right1.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      | 1    |      |
| 2    |      | ☺    | NULL |      | NULL |      |
+------+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right2 on left1.col1 = right2.col3
    ->         and right2.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 |
+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      | 1    |
| 2    |      | ☺    | NULL |      | NULL |
+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right3 on left1.col1 = right3.col3
    ->         and right3.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col4 | col3 |
+------+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      |      | 1    |
| 2    |      | ☺    | NULL |      |      | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.00 sec)

Please, specify which of them produce wrong results, from your point of view, and explain
what results you expected.
[12 Apr 2006 16:22] Baron Schwartz
All result sets contain the correct number of rows, because the join is a LEFT OUTER join.
 The issue is whether the join succeeds, i.e. whether the rows from the right-hand table
are output as NULL.  In each case, the join clause matches col1 from the left-hand table
to col3 in the right-hand table.  If the values are equal, the right-hand table should
*not* be output as NULL values.

The first resultset should output two rows.  The first row should output non-NULL values
for the right-hand table, because 1 = 1.  Instead, it outputs NULLs.  This is a bug.  The
second row should and does output NULLs, because 2 does not equal 1.

The second resultset should output only one row, because the left-hand table "left2" only
has one row.  The join should and does succeed, because 1 = 1.  I included this in the
test case because I wanted to demonstrate that the first row in the first resultset
(which has exactly the same data) fails, but this one succeeds.  The only difference is
the absence of the extra row in the table.  Therefore, I have demonstrated that deleting
a row from the left-hand table can cause a failing join to succeed.  This is a bug.  A
join between two rows should not depend on other rows being present or absent.

The following result sets should behave as follows: the join should succeed in the first
row, and fail in the second.  Each of them does behave this way.  I include them to
demonstrate that changing parts of the query, or parts of the tables, causes the join to
behave correctly -- but the changes in the query and table should NOT cause different
behavior in the join.

The third result set is formed by exactly the same join as the first, but I commented out
part of the join clause.  The part I commented out should always be true, so this should
not affect the result in any way.  But you can see that it does; it succeeds, where the
first result set failed.  This is a bug.

The fourth result set is the same query as the first, except there is one less column in
the right-hand table.  The column is not involved in the join, so it should not affect
the join.  But it does; the join succeeds where it failed in the first result set.  This
is a bug.

The fifth result set is the same as the first, except the fourth column has been moved
before the third -- their order has been swapped.  This should not affect the join, but
it does.  The join succeeds where it failed in the first result set.  This is a bug.

I neglected to say this, but there are also comments in the test script explaining what
it does, so part of this explanation is duplicated in the test script.

I hope this is clear.  The bugs are subtle, so it might not be obvious what is wrong and
what is right, but if you look at the data and do the joins "by hand" you will see what
should be happening.  I hope I've explained it clearly.

The fact that I can affect how one part of the data (the join) behaves by changing other
things (adjacent rows, removing adjacent columns, swapping columns) is why I suspect the
bug has something to do with how the data is physically stored.
[2 Jun 2006 17:12] Valeriy Kravchuk
Verified just as described in reproduce.sql test case uploaded, on latest 5.0.23-BK:

mysql> select left1.col1 as col1,
    -> left1.col2 + 0 as col2,
    -> left1.col3 + 0 as col3,
    -> right1.col1 as col1,
    -> right1.col2 + 0 as col2,
    -> right1.col3 as col3,
    -> right1.col4 as col4
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left1.col2 <> 1
    -> order by left1.col1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
|    1 |    0 |    1 | NULL | NULL | NULL | NULL |
|    2 |    0 |    1 | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.04 sec)

mysql> select *
    -> from left2
    ->     left outer join right1 on left2.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left2.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      |    1 | NULL |
+------+------+------+------+------+------+------+
1 row in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         -- and right1.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      |    1 | NULL |
|    2 |      | ☺    | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.00 sec)

mysql> select *
    -> from left1
    ->     left outer join right2 on left1.col1 = right2.col3
    ->         and right2.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 |
+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      |    1 |
|    2 |      | ☺    | NULL | NULL | NULL |
+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right3 on left1.col1 = right3.col3
    ->         and right3.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col4 | col3 |
+------+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      | NULL |    1 |
|    2 |      | ☺    | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.00 sec)
[14 Jun 2006 14:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7632
[19 Jun 2006 2:12] Evgeny Potemkin
Fixed in 5.0.23, 5.1.12
[20 Jun 2006 12:48] Evgeny Potemkin
The Field::eq() considered instances of Field_bit that differ only in 
bit_ptr/bit_ofs equal. This caused equality conditions optimization 
(build_equal_items_for_cond()) to make bad field substitutions that result
in wrong predicates.
[22 Jun 2006 18:01] Paul DuBois
Noted in 5.0.23, 5.1.12 changelogs.

BIT columns in a table could cause joins that use the table to fail.