Bug #31395 equality match in a where clause against an utf8 field fails
Submitted: 4 Oct 2007 12:02 Modified: 8 Oct 2007 7:31
Reporter: Frank Dr. Ullrich Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:4.1.23-pro-gpl-log OS:Solaris (version 10)
Assigned to: CPU Architecture:Any
Tags: utf8 string equality match

[4 Oct 2007 12:02] Frank Dr. Ullrich
Description:
In an utf8 table mysql fails to update a row because it doesn't find that row by an equality match that works in a select statement!
Even if you perform a "set names utf8;" the update fails (doesn't find the requested row).
On the other hand, if you use a "like" clause match then the update works.

This problem is not present in version 4.1.22-standard, for example.

How to repeat:
CREATE TABLE `nl_user` (
  `user_id` int(11) NOT NULL auto_increment,
  `email` varchar(255) collate latin1_german1_ci default NULL,
  `suspended_since` datetime default NULL,
  PRIMARY KEY  (`user_id`),
  UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci
;
insert into nl_user values (0,'olduser@heise.de',now());
select * from nl_user where email='olduser@heise.de';
+---------+------------------+---------------------+
| user_id | email            | suspended_since     |
+---------+------------------+---------------------+
|       1 | olduser@heise.de | 2007-10-04 12:32:55 |
+---------+------------------+---------------------+
1 row in set (0.00 sec)

update nl_user set suspended_since=now() where email='olduser@heise.de';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0  Changed: 0  Warnings: 0
update nl_user set suspended_since=now() where email like 'olduser@heise.de';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
[5 Oct 2007 7:31] Frank Dr. Ullrich
Sorry that I provided a wrong create table statement.
The correct one is:

CREATE TABLE `nl_user` (
  `user_id` int(11) NOT NULL auto_increment,
  `email` varchar(255) NOT NULL default '',
  `suspended_since` datetime default NULL,
  PRIMARY KEY  (`user_id`),
  UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
;

The error occured because in the mean time in production we changed the table character set to latin1 so we have a workaround here.
Nevertheless we need the table converted back to utf8 as we are about to switch several large applications towards utf8! So a bug fix is required.
[6 Oct 2007 14:37] MySQL Verification Team
this is a duplicate of Bug #28878
[8 Oct 2007 7:31] Frank Dr. Ullrich
This is not quite a duplicate of bug #28878 because the latter one deals with CHAR fields whereas our problem relates to a VARCHAR(255) field!
[8 Oct 2007 13:14] Heikki Tuuri
The patch to Bug #28878 fixes also this bug because in MySQL-4.1, a 'VARCHAR(n)' is internally actually a CHAR(n) field. MySQL-4.1 simulates a VARCHAR by truncating the trailing spaces from the field.