Bug #44228 | Unable to insert UTF string into varchar column | ||
---|---|---|---|
Submitted: | 13 Apr 2009 0:35 | Modified: | 18 Apr 2009 20:51 |
Reporter: | Noah | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | Connector / NET | Severity: | S2 (Serious) |
Version: | 5.2.5 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[13 Apr 2009 0:35]
Noah
[13 Apr 2009 0:38]
Noah
Incorrectly quoted strings in the steps to reproduce section, please use the below instead: Set all the parameter values as follows: dbCmd.Parameters["?sid"].Value = 1; dbCmd.Parameters["?messageDT"].Value = "2009-04-08 18:45:00"; dbCmd.Parameters["?messageType"].Value = "Generic"; dbCmd.Parameters["?userName"].Value = "-₪EPIC₪- Trickster |CiC.|"; dbCmd.Parameters["?userSteam"].Value = "somethingHere"; dbCmd.Parameters["?userTeam"].Value = "somethingHere"; dbCmd.Parameters["?targetName"].Value = "somethingHere"; dbCmd.Parameters["?targetSteam"].Value = "somethingHere"; dbCmd.Parameters["?targetTeam"].Value = "somethingHere"; dbCmd.Parameters["?logLine"].Value = "somethingHere";
[13 Apr 2009 20:49]
Noah
Parameters were set incorrectly in the comment above. Please use the below parameters. dbCmd.Parameters["?sid"].Value = 1; dbCmd.Parameters["?messageDT"].Value = "2009-04-08 18:45:00"; dbCmd.Parameters["?messageType"].Value = "Generic"; dbCmd.Parameters["?userName"].Value = "somethingHere"; dbCmd.Parameters["?userSteam"].Value = "somethingHere"; dbCmd.Parameters["?userTeam"].Value = "somethingHere"; dbCmd.Parameters["?targetName"].Value = "-₪EPIC₪- Trickster |CiC.|"; dbCmd.Parameters["?targetSteam"].Value = "somethingHere"; dbCmd.Parameters["?targetTeam"].Value = "somethingHere"; dbCmd.Parameters["?logLine"].Value = "somethingHere";
[14 Apr 2009 6:59]
Tonci Grgin
Hi Noah and thanks for your report. This is actually misuse of charsets, not a bug. So please paste "SHOW CREATE TABLE" output for table in question and for the SP. Mind you, your database is latin1. It would be nice to see output of "SHOW VARIABLES LIKE "%char%" too. According to the manual (http://dev.mysql.com/doc/refman/5.0/en/create-procedure.html): "For character data types, if there is a CHARACTER SET clause in the declaration, the specified character set and its default collation are used. If there is no such clause, the database character set and collation that are in effect at the time the routine is created are used. (These are given by the values of the character_set_database and collation_database system variables.) The COLLATE attribute is not supported (this includes use of BINARY, because in this context BINARY specifies the binary collation of the character set.)." > Incorrect string value: '\xE2\x82\xAAEPI...' for column 'targetName' at row 1 This actually means that you're trying to write latin1 chars as utf8 stream. You can try slapping charset introducer (_utf8) in front of that string to indicate that you're sending chars in different charset than connection default but this will add to mess you already have. You can check default charset used by looking into general query log on the MySQL server from the moment when you start your test. Should look something like this: 090305 10:06:43 58 Connect root@QCW2K8.lan on test 58 Query SET NAMES utf8 58 Query SET character_set_results = NULL 58 Query SET SQL_AUTO_IS_NULL = 0 I presume line SET NAMES utf8 reads "SET NAMES latin1" in your case (as table column is latin1) and/or character_set_results is set to something other than NULL preventing auto-detection of charset to be used.
[16 Apr 2009 23:53]
Noah
Tonci, Thank you for your reply. The reason I originally assumed this was a bug is because I do NOT get the error with MySQL 5.0.x. The application runs flawlessly on MySQL 5.0.x with the same data. As soon as I tried the application on MySQL 5.1.x, I got the error. I am not using a SP, I am using a text query. If the table is setup to be utf8/utf8_general_ci, why would the schema override that? I thought it was the other way around. I have my connection explicitly setup to use the utf8 charset set (charset=utf8 in the connection string), and I am inserting into a utf8 charset table. Is that wrong or? Here is the information you requested: Result set from SHOW CREATE TABLE tbllogmessages Table, Create Table 'tbllogmessages', 'CREATE TABLE `tbllogmessages` ( `lid` int(11) NOT NULL AUTO_INCREMENT, `sid` int(11) NOT NULL, `messageDT` datetime NOT NULL, `messageType` varchar(45) CHARACTER SET latin1 NOT NULL DEFAULT 'Generic', `userName` varchar(45) CHARACTER SET latin1 DEFAULT NULL, `userSteam` varchar(45) CHARACTER SET latin1 DEFAULT NULL, `userTeam` varchar(45) CHARACTER SET latin1 DEFAULT NULL, `targetName` varchar(45) CHARACTER SET latin1 DEFAULT NULL, `targetSteam` varchar(45) CHARACTER SET latin1 DEFAULT NULL, `targetTeam` varchar(45) CHARACTER SET latin1 DEFAULT NULL, `logLine` text CHARACTER SET latin1 NOT NULL, PRIMARY KEY (`lid`), KEY `FK_LogMessages_Sources` (`sid`), CONSTRAINT `FK_LogMessages_Sources` FOREIGN KEY (`sid`) REFERENCES `tblsources` (`sid`) ON DELETE NO ACTION ON UPDATE NO ACTION ) ENGINE=InnoDB AUTO_INCREMENT=590 DEFAULT CHARSET=utf8' Result set from SHOW VARIABLES LIKE "%char%" Variable name, Value 'character_set_client', 'utf8' 'character_set_connection', 'utf8' 'character_set_database', 'latin1' 'character_set_filesystem', 'binary' 'character_set_results', 'utf8' 'character_set_server', 'utf8' 'character_set_system', 'utf8' 'character_sets_dir', 'C:\pathRemoved\MySQL Server 5.1\share\charsets\' Query log: 090417 0:42:23 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 42 Connect slrs@localhost on sourceloggingdaemon 42 Query SHOW VARIABLES 42 Query SHOW COLLATION 42 Query SET NAMES utf8;SET character_set_results=NULL 42 Init DB sourceloggingdaemon 42 Query SELECT `sid`, `ip`, `port` FROM `sourceloggingdaemon`.`tblsources` 42 Init DB sourceloggingdaemon 090417 0:42:24 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:25 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:26 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:27 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:28 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:29 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:30 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:31 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:32 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:33 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:34 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:35 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:36 42 Query INSERT INTO tblLogMessages(sid, messageDT, messageType, userName, userSteam, userTeam, targetName, targetSteam, targetTeam, logLine) VALUES(1, '2009-04-17 00:42:36' , 'User', 'Console', 'Console', 'Console', NULL, NULL, NULL ,'\"Console<0><Console><Console>\" say \"test\"') 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS 090417 0:42:37 3 Query SHOW GLOBAL STATUS 3 Query SHOW INNODB STATUS
[17 Apr 2009 7:34]
Tonci Grgin
Noah, thanks, just what I needed to see: 'character_set_database', 'latin1' and 42 Query SET NAMES utf8;SET character_set_results=NULL Let me see what I can dig out on this.
[17 Apr 2009 7:57]
Tonci Grgin
Noah, in my opinion, this is *not* a bug and here's why: 'character_set_database' is latin1 which I think happened because you created database and tables with default server settings (character_set_server was latin1). Then you changed character_set_server to UTF8 to match your 5.0 server environment. But this is all not an explanation, just a reconstruction. You problem lies in fact that you mixed latin1/utf8 on table level: ... `logLine` text CHARACTER SET latin1 NOT NULL, ... DEFAULT CHARSET=utf8' Actually, your utf8 stream is *forced* into latin1 bytes so you have two problems here: o Data that did make it from UTF8 stream into latin1 bytes is probably corrupted in larger part. o You can not force characters outside bitmap into latin1 column (which is why you posted the report). Now, you should either remove charset specifiers from columns ( `userTeam` varchar(45) *CHARACTER SET latin1* DEFAULT NULL,) so that table level definition (utf8) kicks in or you should add *CHARACTER SET UTF8* to columns definition. What I would do is drop the database in question, recreate it with default charset being utf8, recreate tables *without* latin1 in column definitions. I am convinced it will work then. You can also recheck your schema on MySQL 5.0 server as it's bound to differ from what I see here.
[18 Apr 2009 20:42]
Noah
Tonci, I think you're right :) I originally created the DB table with the latin1 charset. I ran a update script that I generated with MySQL Workbench after I switched the table to UTF8. The update script didn't do what I thought it would do. Thanks for your help.
[18 Apr 2009 20:51]
Noah
I just figured it out. When I was running the SQL sync with MySQL WB, I was using a SQL file where I already modified the character set, so all it did was change the collation. Since I used that SQL script to update my prod DB, I got the error that I posted. Thanks again for your help.