Bug #30196 Odd/inconsistent behavior in selecting binary index columns ending with x'20'
Submitted: 2 Aug 2007 5:52 Modified: 3 Aug 2007 18:16
Reporter: Zibeli Aton Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Packaging Severity:S3 (Non-critical)
Version:5.0.45, 5.0-community, 6.0-falcon OS:Any
Assigned to: CPU Architecture:Any
Tags: BINARY, key, SELECT, space

[2 Aug 2007 5:52] Zibeli Aton
Description:
While creating an application that used a binary(16) indexed column containing md5 sums, I discovered that the results of a "SELECT * WHERE BinaryIndexColumn='literal'" did not return matching rows when the column (and search literal) ended with an ascii space (x'20').

Further testing showed that this behavior only happens under certain, seemingly unrelated, conditions.  For example:

1.  If the indexed binary column is created without specifying "NOT NULL", items ending with x'20' are matched as expected.

2.  If the length of the binary column is less than 9 bytes, items ending with x'20' are matched as expected.

3.  If the where clause included a BINARY cast (i.e. "WHERE BINARY BinaryIndexColumn='literal'", items ending with x'20' are matched as expected.  (Though the index is not used, which is consistent with manual description of the BINARY cast but made my application painfully slow.)

How to repeat:
The below were all verified both on the Windows version (using binary downloaded from MySQL site) and a version compiled from source on a GNU/Linux box.  The default ENGINE for all was MyISAM and the DEFAULT CHARSET was latin1:

mysql> CREATE TEMPORARY TABLE bug (bincol BINARY(9) NOT NULL, KEY (bincol));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO bug VALUES('AAAAAAAA '),('AAAAAAAAA');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM bug WHERE bincol='AAAAAAAA ';
Empty set (0.00 sec)

However:

mysql> SELECT * FROM bug WHERE BINARY bincol='AAAAAAAA ';
+-----------+
| bincol    |
+-----------+
| AAAAAAAA  |
+-----------+
1 row in set (0.00 sec)

Also, strangely, when making the column 8 bytes (or less) instead of 9:

mysql> DROP TABLE bug;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TEMPORARY TABLE bug (bincol BINARY(8) NOT NULL, KEY (bincol));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO bug VALUES('AAAAAAA '),('AAAAAAAA');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM bug WHERE bincol='AAAAAAA ';
+----------+
| bincol   |
+----------+
| AAAAAAA  |
+----------+
1 row in set (0.00 sec)

And oddly, excluding the "NOT NULL" also changes the behavior:

mysql> DROP TABLE bug;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TEMPORARY TABLE bug (bincol BINARY(9), KEY (bincol));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO bug VALUES('AAAAAAAA '),('AAAAAAAAA');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM bug WHERE bincol='AAAAAAAA ';
+-----------+
| bincol    |
+-----------+
| AAAAAAAA  |
+-----------+
1 row in set (0.00 sec)

Suggested fix:
I've worked around my issue for now by using the hex version of the MD5 sum in a CHAR(32) column instead of the BINARY(16) version, at the cost of a few hundred megabytes of disk space as it's a pretty large table.
[2 Aug 2007 7:52] Sveta Smirnova
Please do not submit the same bug more than once. An existing bug report already describes this very problem. Even if you feel that your issue is somewhat different, the resolution is likely
to be the same. Because of this, we hope you add your comments to the original bug instead.

Thank you for your interest in MySQL.

Duplicate of bug #29087 Please wait next Community release.
[2 Aug 2007 9:33] Sveta Smirnova
Thank you again for the report.

Due to internal discussion we decided reclassify this report as packaging problem.
[2 Aug 2007 9:47] Zibeli Aton
Thanks for your quick response and sorry for the apparent duplicate.  I actually did do some searches in the bug database before submitting the report, but likely stopped reading #29087 when I saw "server crash" since that wasn't happening to me, apparently because I wasn't using debug builds.

Could you clarify what the reclassification as a "packaging" problem means?  My guess is that it means the patch to fix #29087 didn't get into the source used for the Windows build or the source I downloaded and built on the GNU/Linux box and that applying that patch to my 5.0.45 source and rebuilding should fix it.  Is that correct?

Thanks again!
[2 Aug 2007 9:59] Sveta Smirnova
Fix for bug #29087 does not exist in source trees which we use for Community builds.
[2 Aug 2007 10:14] Zibeli Aton
Got it.  Thank you much, Sveta!
[2 Aug 2007 20:14] Daniel Fischer
The fix will get into mysql-5.0-community when the community engineering team pulls in the enterprise source code next time. As we just had a community release (5.0.45), this didn't happen yet. This is expected and standard procedure. I can't give an ETA, but bugs that are fixed in 5.0.46 will definitely also be fixed in the next 5.0 community build.
[3 Aug 2007 18:16] Trudy Pelzer
Setting status back to "duplicate" as this
is a solved issue. Please watch for the
updated community version.