Bug #4398 false string functions on binary textfields
Submitted: 3 Jul 2004 19:07 Modified: 23 Jul 2004 0:17
Reporter: Reini Urban Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S3 (Non-critical)
Version:mysql >= 4.1 OS:Windows (w2k)
Assigned to: Matthew Lord CPU Architecture:Any

[3 Jul 2004 19:07] Reini Urban
Description:
string-case functions in the WHERE clause with binary textfields are ignored.

CREATE TABLE page (
	id              INT NOT NULL AUTO_INCREMENT,
        pagename        VARCHAR(100) BINARY NOT NULL,
        PRIMARY KEY (id),
	UNIQUE KEY (pagename)
) TYPE=MyISAM;
INSERT INTO page VALUES (NULL,'TestString');
SELECT * FROM page WHERE LOWER(pagename) LIKE 'test%';

=> Empty set (0.00 sec)

'TestString' should have been found. This worked for all previous releases until 4.1.x

Affected mysql versions (at least): 
4.1.0alpha, 4.1.3-beta, 5.0.0a

Note: related to bug #4340, but not fixed with 4.1.1

How to repeat:
CREATE TABLE page (
	id              INT NOT NULL AUTO_INCREMENT,
        pagename        VARCHAR(100) BINARY NOT NULL,
        PRIMARY KEY (id),
	UNIQUE KEY (pagename)
) TYPE=MyISAM;
INSERT INTO page VALUES (NULL,'TestString');
SELECT * FROM page WHERE LOWER(pagename) LIKE 'test%';

Suggested fix:
revert the premature BINARY optimization.
[3 Jul 2004 19:18] Reini Urban
Oops,
I meant "related to http://bugs.mysql.com/bug.php?id=1491
but not fixed with 4.1.1"
Bug 4340 might be related. This is still open, but appeared in 4.0.21, which I haven't tested. 4.0.20a works ok.
[6 Jul 2004 12:45] Hartmut Holzgraefe
I can't repoduce this with 4.1.3.

Can you please add your language and collation settings to the report?

  SHOW VARIABLES LIKE 'char%';
  SHOW VARIABLES LIKE 'col%'
[6 Jul 2004 13:01] Reini Urban
mysql> select version();
+-------------------+
| version()         |
+-------------------+
| 4.1.3-beta-nt-max |
+-------------------+
1 row in set (0.11 sec)
mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+------------------------------------+
| Variable_name            | Value                              |
+--------------------------+------------------------------------+
| character_set_client     | latin1                             |
| character_set_connection | latin1                             |
| character_set_database   | latin1                             |
| character_set_results    | latin1                             |
| character_set_server     | latin1                             |
| character_set_system     | utf8                               |
| character_sets_dir       | F:\Programme\mysql\share\charsets/ |
+--------------------------+------------------------------------+
7 rows in set (0.03 sec)
mysql> SHOW VARIABLES LIKE 'col%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM page WHERE LOWER(pagename) LIKE 'test%';
Empty set (0.00 sec)
------------------

mysql> select version();
+--------------------+
| version()          |
+--------------------+
| 5.0.0-alpha-max-nt |
+--------------------+
1 row in set (0.20 sec)

mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+------------------------------------+
| Variable_name            | Value                              |
+--------------------------+------------------------------------+
| character_set_server     | latin1                             |
| character_set_system     | utf8                               |
| character_set_database   | latin1                             |
| character_set_client     | latin1                             |
| character_set_connection | latin1                             |
| character-sets-dir       | F:\Programme\mysql\share\charsets/ |
| character_set_results    | latin1                             |
+--------------------------+------------------------------------+
7 rows in set (0.03 sec)

mysql> SHOW VARIABLES LIKE 'col%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM page WHERE LOWER(pagename) LIKE 'test%';
Empty set (0.11 sec)

mysql> SELECT * FROM page;
+----+-------------+
| id | pagename    |
+----+-------------+
|  1 | TestString  |
|  2 | TestString2 |
|  3 | TestString1 |
|  4 | TestString3 |
+----+-------------+
4 rows in set (0.00 sec)
[20 Jul 2004 1:25] Matthew Lord
This is windows specific.  I could repeat it on win2k but not on linux.
[20 Jul 2004 5:30] Paul DuBois
You might want to take a look here:

http://dev.mysql.com/doc/mysql/en/Case_Sensitivity_Operators.html

You're not using a BLOB column, but your pagename column has
a character set of "binary". You might want to convert it to the latin1
character set with a collation of latin1_bin.

mysql> alter table page modify pagename varchar(100) collate latin1_bin;
mysql> SELECT * FROM page WHERE LOWER(pagename) LIKE 'test%';
+----+------------+
| id | pagename   |
+----+------------+
|  1 | TestString |
+----+------------+
[20 Jul 2004 6:34] Paul DuBois
Okay, after further investigation, I can partially reproduce the original
results. I believe the behavior observed will occur when VARCHAR BINARY
is created with the binary character set.  This will happen in MySQL 4.1 up
to 4.1.1 and in MySQL 5.0.0. After that, a change was made such that
VARCHAR BINARY is created using the binary collation of the column
character set. For example, if the default character set is latin1, the
pagename column will be created with this definition:

pagename varchar(100) character set latin1 collate latin1_bin NOT NULL default

When the column is created like this (which is the current behavior),
UPPER() and LOWER() will work as you want.

Something I find odd about the original report is that it says the undesired
behavior occurs in MySQL 4.1.3. I cannot reproduce that.  Can you try
the test again with 4.1.3 and verify what the table structure looks like
by using SHOW CREATE TABLE?
[20 Jul 2004 10:15] Reini Urban
I only tested 4.1.3-beta so far. Unfortunately I'm until August, 1 in Poland, far away from my system, so I cannot verify it for the 4.1.3 release version. 
But if you could have verified it it must be okay.
I'll fix my app (phpwiki.sf.net) in this way.

Maybe an entry somewhere in the documentation might help for this new behaviour.
[20 Jul 2004 16:22] Matthew Lord
OK, I see what the difference is.  I cut and pasted his create table
statement and select statement in the text client on w2k and linux.
The select statement on linux was coming back with the row and
the one on w2k was not.  The reason is this:

On linux this is the show create table output:

CREATE TABLE `page` (
  `id` int(11) NOT NULL auto_increment,
  `pagename` varchar(100) character set latin1 collate latin1_bin NOT NULL default '',
  PRIMARY KEY  (`id`),
  UNIQUE KEY `pagename` (`pagename`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

And on w2k:

 CREATE TABLE `page` (
  `id` int(11) NOT NULL auto_increment,
  `pagename` varbinary(100) NOT NULL default '',
  PRIMARY KEY  (`id`),
  UNIQUE KEY `pagename` (`pagename`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

This does seem like a bug to me.  The behavior should
at least be the same.

Best Regards,
[23 Jul 2004 0:17] Matthew Lord
I cannot repeat this in 4.1.3b on windows or 4.1.3 on linux or OS X.  I apologize, I was in the 
wrong version when I originally verified this.