Bug #18895 BIT values cause joins to fail
Submitted: 7 Apr 2006 17:39 Modified: 22 Jun 2006 16:01
Reporter: Baron Schwartz (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.23-BK, 5.0.15-log OS:Linux (Gentoo Linux)
Assigned to: Georgi Kodinov CPU Architecture:Any

[7 Apr 2006 17:39] Baron Schwartz
Description:
BIT columns in a table cause joins to fail.  The failure depends on the position of the column within the table, other columns in the table (even those not used in the query at all), other rows of data (yes, a row will be joined or not joined depending on the presence of other data in the table which should not interact in the join at all).  I have found a variety of ways in which the behavior is incorrect.

I surmise the way the BIT data is stored in the table may be crossing boundaries into other columns' storage.

How to repeat:
I will attach a detailed test case.  I can repeat this bug on AMD64 and Intel 32-bit architectures.
[7 Apr 2006 17:39] Baron Schwartz
Detailed test case

Attachment: reproduce.sql (text/x-sql), 2.88 KiB.

[10 Apr 2006 11:44] Valeriy Kravchuk
Thank you for a problem report. I've got the following results from your test:

mysql> select left1.col1 as col1,
    -> left1.col2 + 0 as col2,
    -> left1.col3 + 0 as col3,
    -> right1.col1 as col1,
    -> right1.col2 + 0 as col2,
    -> right1.col3 as col3,
    -> right1.col4 as col4
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left1.col2 <> 1
    -> order by left1.col1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
| 1    | 0    | 1    | NULL | NULL | NULL |      |
| 2    | 0    | 1    | NULL | NULL | NULL |      |
+------+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left2
    ->     left outer join right1 on left2.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left2.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      | 1    |      |
+------+------+------+------+------+------+------+
1 row in set (0.00 sec)

mysql> select *
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         -- and right1.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      | 1    |      |
| 2    |      | ☺    | NULL |      | NULL |      |
+------+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right2 on left1.col1 = right2.col3
    ->         and right2.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 |
+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      | 1    |
| 2    |      | ☺    | NULL |      | NULL |
+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right3 on left1.col1 = right3.col3
    ->         and right3.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col4 | col3 |
+------+------+------+------+------+------+------+
| 1    |      | ☺    | 1    |      |      | 1    |
| 2    |      | ☺    | NULL |      |      | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.00 sec)

Please, specify which of them produce wrong results, from your point of view, and explain what results you expected.
[12 Apr 2006 14:22] Baron Schwartz
All result sets contain the correct number of rows, because the join is a LEFT OUTER join.  The issue is whether the join succeeds, i.e. whether the rows from the right-hand table are output as NULL.  In each case, the join clause matches col1 from the left-hand table to col3 in the right-hand table.  If the values are equal, the right-hand table should *not* be output as NULL values.

The first resultset should output two rows.  The first row should output non-NULL values for the right-hand table, because 1 = 1.  Instead, it outputs NULLs.  This is a bug.  The second row should and does output NULLs, because 2 does not equal 1.

The second resultset should output only one row, because the left-hand table "left2" only has one row.  The join should and does succeed, because 1 = 1.  I included this in the test case because I wanted to demonstrate that the first row in the first resultset (which has exactly the same data) fails, but this one succeeds.  The only difference is the absence of the extra row in the table.  Therefore, I have demonstrated that deleting a row from the left-hand table can cause a failing join to succeed.  This is a bug.  A join between two rows should not depend on other rows being present or absent.

The following result sets should behave as follows: the join should succeed in the first row, and fail in the second.  Each of them does behave this way.  I include them to demonstrate that changing parts of the query, or parts of the tables, causes the join to behave correctly -- but the changes in the query and table should NOT cause different behavior in the join.

The third result set is formed by exactly the same join as the first, but I commented out part of the join clause.  The part I commented out should always be true, so this should not affect the result in any way.  But you can see that it does; it succeeds, where the first result set failed.  This is a bug.

The fourth result set is the same query as the first, except there is one less column in the right-hand table.  The column is not involved in the join, so it should not affect the join.  But it does; the join succeeds where it failed in the first result set.  This is a bug.

The fifth result set is the same as the first, except the fourth column has been moved before the third -- their order has been swapped.  This should not affect the join, but it does.  The join succeeds where it failed in the first result set.  This is a bug.

I neglected to say this, but there are also comments in the test script explaining what it does, so part of this explanation is duplicated in the test script.

I hope this is clear.  The bugs are subtle, so it might not be obvious what is wrong and what is right, but if you look at the data and do the joins "by hand" you will see what should be happening.  I hope I've explained it clearly.

The fact that I can affect how one part of the data (the join) behaves by changing other things (adjacent rows, removing adjacent columns, swapping columns) is why I suspect the bug has something to do with how the data is physically stored.
[2 Jun 2006 15:12] Valeriy Kravchuk
Verified just as described in reproduce.sql test case uploaded, on latest 5.0.23-BK:

mysql> select left1.col1 as col1,
    -> left1.col2 + 0 as col2,
    -> left1.col3 + 0 as col3,
    -> right1.col1 as col1,
    -> right1.col2 + 0 as col2,
    -> right1.col3 as col3,
    -> right1.col4 as col4
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left1.col2 <> 1
    -> order by left1.col1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
|    1 |    0 |    1 | NULL | NULL | NULL | NULL |
|    2 |    0 |    1 | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.04 sec)

mysql> select *
    -> from left2
    ->     left outer join right1 on left2.col1 = right1.col3
    ->         and right1.col2 <> 1
    -> where left2.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      |    1 | NULL |
+------+------+------+------+------+------+------+
1 row in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right1 on left1.col1 = right1.col3
    ->         -- and right1.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 | col4 |
+------+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      |    1 | NULL |
|    2 |      | ☺    | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.00 sec)

mysql> select *
    -> from left1
    ->     left outer join right2 on left1.col1 = right2.col3
    ->         and right2.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col3 |
+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      |    1 |
|    2 |      | ☺    | NULL | NULL | NULL |
+------+------+------+------+------+------+
2 rows in set (0.01 sec)

mysql> select *
    -> from left1
    ->     left outer join right3 on left1.col1 = right3.col3
    ->         and right3.col2 <> 1
    -> where left1.col2 <> 1;
+------+------+------+------+------+------+------+
| col1 | col2 | col3 | col1 | col2 | col4 | col3 |
+------+------+------+------+------+------+------+
|    1 |      | ☺    |    1 |      | NULL |    1 |
|    2 |      | ☺    | NULL | NULL | NULL | NULL |
+------+------+------+------+------+------+------+
2 rows in set (0.00 sec)
[14 Jun 2006 12:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7632
[19 Jun 2006 0:12] Evgeny Potemkin
Fixed in 5.0.23, 5.1.12
[20 Jun 2006 10:48] Evgeny Potemkin
The Field::eq() considered instances of Field_bit that differ only in 
bit_ptr/bit_ofs equal. This caused equality conditions optimization 
(build_equal_items_for_cond()) to make bad field substitutions that result
in wrong predicates.
[22 Jun 2006 16:01] Paul DuBois
Noted in 5.0.23, 5.1.12 changelogs.

BIT columns in a table could cause joins that use the table to fail.