Bug #885 LIKE doesn't return the same results on InnoDB as on MyISAM
Submitted: 22 Jul 2003 4:41 Modified: 24 Jan 2004 12:54
Reporter: Daniel Penning Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:4.1.0 OS:Windows (Windows 2000)
Assigned to: Heikki Tuuri CPU Architecture:Any

[22 Jul 2003 4:41] Daniel Penning
Description:
When searching on a binary varchar column InnoDB doesn't return the expected results.
MyISAM works correctly.

How to repeat:
DROP TABLE test;
CREATE TABLE test (path VARCHAR (255) BINARY NOT NULL, UNIQUE(path)) type=MyISAM;
INSERT INTO test VALUES(0x01),(0x0102),(0x0103),(0x010204);
SELECT * FROM test WHERE path LIKE 0x010225;
    # returns 2 rows as expected

ALTER TABLE test type=InnoDB;
SELECT * FROM test WHERE path LIKE 0x010225;
    # returns 1 row

SELECT * FROM test WHERE BINARY path LIKE 0x010225;
    # returns 2 rows but uses no index :(

Suggested fix:
-
[24 Jul 2003 3:48] Heikki Tuuri
Hi!

The reason is that when InnoDB stores BINARY VARCHAR types, it strips trailing spaces from the end of the string. That is necessary, because internally MySQL always pads them with spaces to the max length. Also, when InnoDB internally compares two BINARY strings, it conceptually pads the shorter with spaces and does the comparison after that.

When MySQL looks for strings satisfying LIKE 'abc%' it pads the search string with spaces at the end: 'abc          ' and looks for strings >= than that. InnoDB strips those spaces also from the end of the search string. Inside InnoDB character codes < 32 are sorted BEFORE the space ' ' whose code is 32. That is why LIKE will not find those strings.

Possible solutions: MySQL should pad with the zero character 0x00 when doing a LIKE search on InnoDB. Or why not for any table type?

Or InnoDB might do the conversion itself. Have to think about that.

The problems stem from the fact that before 5.x the VARCHAR in MySQL is not really a VARCHAR. It does not store the string length. That is why we need the stripping of trailing spaces.

Regards,

Heikki
[24 Jan 2004 12:54] MySQL Verification Team
Tested against a server from latest bk tree 4.1:

mysql> ALTER TABLE test type=InnoDB;
Query OK, 4 rows affected, 2 warnings (0.10 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM test WHERE path LIKE 0x010225;
+------+
| path |
+------+
| &#9786;&#9787;   |
| &#9786;&#9787;&#9830;  |
+------+
2 rows in set (0.01 sec)

mysql> select version();
+-----------------------+
| version()             |
+-----------------------+
| 4.1.2-alpha-max-debug |
+-----------------------+
1 row in set (0.00 sec)