Bug #44228 Unable to insert UTF string into varchar column
Submitted: 13 Apr 2009 0:35 Modified: 18 Apr 2009 20:51
Reporter: Noah Email Updates:
Status: Not a Bug Impact on me:
None 
Category:Connector / NET Severity:S2 (Serious)
Version:5.2.5 OS:Any
Assigned to: CPU Architecture:Any

[13 Apr 2009 0:35] Noah
Description:
If an C# .Net 2.0 application tries to insert a string with UTF8 characters into a varchar(45) column, the following exception occurs:

Exception Message: Incorrect string value: '\xE2\x82\xAAEPI...' for column 'targetName' at row 1

Stack Trace:   at MySql.Data.MySqlClient.MySqlStream.OpenPacket()
   at MySql.Data.MySqlClient.NativeDriver.ReadResult(UInt64& affectedRows, Int64& lastInsertId)
   at MySql.Data.MySqlClient.MySqlDataReader.GetResultSet()
   at MySql.Data.MySqlClient.MySqlDataReader.NextResult()
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader(CommandBehavior behavior)
   at MySql.Data.MySqlClient.MySqlCommand.ExecuteNonQuery()
   at SourceLoggingDaemon.Program.Main(String[] args)

The string that I attempted to insert: -₪EPIC₪- Trickster |CiC.|

This error does not occur with 5.0.x (exact version 5.0.51a) MySQL Servers. I think the handling of UTF encoded strings changed from MySQL Server 5.0.x to 5.1.x (exact version 5.1.32-community), and that the connector doesn't yet support the new system.

How to repeat:
Software versions:

MySQL Net Connector: 5.2.5
MySQL Server: 5.1.32-community
Microsoft .NET Framework 2.0
Microsoft Visual Studio 2008 Pro

Create the following DB schema:

SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='TRADITIONAL';

CREATE SCHEMA IF NOT EXISTS `sourceloggingdaemon` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci ;
USE `sourceloggingdaemon`;

-- -----------------------------------------------------
-- Table `sourceloggingdaemon`.`tblSources`
-- -----------------------------------------------------
CREATE  TABLE IF NOT EXISTS `sourceloggingdaemon`.`tblSources` (
  `sid` INT NOT NULL AUTO_INCREMENT ,
  `ip` VARCHAR(15) NOT NULL ,
  `port` VARCHAR(6) NOT NULL ,
  PRIMARY KEY (`sid`) )
ENGINE = InnoDB;

-- -----------------------------------------------------
-- Table `sourceloggingdaemon`.`tblLogMessages`
-- -----------------------------------------------------
CREATE  TABLE IF NOT EXISTS `sourceloggingdaemon`.`tblLogMessages` (
  `lid` INT NOT NULL AUTO_INCREMENT ,
  `sid` INT NOT NULL ,
  `messageDT` DATETIME NOT NULL ,
  `messageType` VARCHAR(45) NOT NULL DEFAULT 'Generic' ,
  `userName` VARCHAR(45) NULL ,
  `userSteam` VARCHAR(45) NULL ,
  `userTeam` VARCHAR(45) NULL ,
  `targetName` VARCHAR(45) NULL ,
  `targetSteam` VARCHAR(45) NULL ,
  `targetTeam` VARCHAR(45) NULL ,
  `logLine` TEXT NOT NULL ,
  PRIMARY KEY (`lid`) ,
  INDEX `FK_LogMessages_Sources` (`sid` ASC) ,
  CONSTRAINT `FK_LogMessages_Sources`
    FOREIGN KEY (`sid` )
    REFERENCES `sourceloggingdaemon`.`tblSources` (`sid` )
    ON DELETE NO ACTION
    ON UPDATE NO ACTION)
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_general_ci;

SET SQL_MODE=@OLD_SQL_MODE;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;

Execute the following statement:

INSERT INTO tblSources (ip, port) VALUES ('127.0.0.1', '300')

From a C# application, using the Net connector, follow the below steps:

Create a MySqlConnection using the below connection string:

server=localhost;user id=slrstestuser;Password=passwordHere;persist security info=True;database=sourceloggingdaemon;charset=utf8

Create a MySqlCommand with the below query and associate with the above MySQL connection.

INSERT INTO tblLogMessages(sid, messageDT, messageType, userName, userSteam, userTeam, targetName, targetSteam, targetTeam, logLine) VALUES(?sid, ?messageDT , ?messageType, ?userName, ?userSteam, ?userTeam, ?targetName, ?targetSteam, ?targetTeam ,?logLine)

Bind all of the parameters as follows:

            dbCmd.Parameters.Add("?sid", MySqlDbType.Int32);
            dbCmd.Parameters.Add("?messageDT", MySqlDbType.DateTime);
            dbCmd.Parameters.Add("?messageType", MySqlDbType.VarChar);
            dbCmd.Parameters.Add("?userName", MySqlDbType.VarChar);
            dbCmd.Parameters.Add("?userSteam", MySqlDbType.VarChar);
            dbCmd.Parameters.Add("?userTeam", MySqlDbType.VarChar);
            dbCmd.Parameters.Add("?targetName", MySqlDbType.VarChar);
            dbCmd.Parameters.Add("?targetSteam", MySqlDbType.VarChar);
            dbCmd.Parameters.Add("?targetTeam", MySqlDbType.VarChar);
            dbCmd.Parameters.Add("?logLine", MySqlDbType.Text);

Set all the parameter values as follows:

            dbCmd.Parameters["?sid"].Value = 1;
            dbCmd.Parameters["?messageDT"].Value = '2009-04-08 18:45:00';
            dbCmd.Parameters["?messageType"].Value = 'Generic';
            dbCmd.Parameters["?userName"].Value = '-₪EPIC₪- Trickster |CiC.|';
            dbCmd.Parameters["?userSteam"].Value = 'somethingHere';
            dbCmd.Parameters["?userTeam"].Value = 'somethingHere';
            dbCmd.Parameters["?targetName"].Value = 'somethingHere';
            dbCmd.Parameters["?targetSteam"].Value = 'somethingHere';
            dbCmd.Parameters["?targetTeam"].Value = 'somethingHere';
            dbCmd.Parameters["?logLine"].Value = 'somethingHere';

Execute the query:

dbCmd.ExecuteNonQuery();

The exception should occur at this point.
[13 Apr 2009 0:38] Noah
Incorrectly quoted strings in the steps to reproduce section, please use the below instead:

Set all the parameter values as follows:

            dbCmd.Parameters["?sid"].Value = 1;
            dbCmd.Parameters["?messageDT"].Value = "2009-04-08 18:45:00";
            dbCmd.Parameters["?messageType"].Value = "Generic";
            dbCmd.Parameters["?userName"].Value = "-₪EPIC₪- Trickster |CiC.|";
            dbCmd.Parameters["?userSteam"].Value = "somethingHere";
            dbCmd.Parameters["?userTeam"].Value = "somethingHere";
            dbCmd.Parameters["?targetName"].Value = "somethingHere";
            dbCmd.Parameters["?targetSteam"].Value = "somethingHere";
            dbCmd.Parameters["?targetTeam"].Value = "somethingHere";
            dbCmd.Parameters["?logLine"].Value = "somethingHere";
[13 Apr 2009 20:49] Noah
Parameters were set incorrectly in the comment above. Please use the below parameters.

            dbCmd.Parameters["?sid"].Value = 1;
            dbCmd.Parameters["?messageDT"].Value = "2009-04-08 18:45:00";
            dbCmd.Parameters["?messageType"].Value = "Generic";
            dbCmd.Parameters["?userName"].Value = "somethingHere";
            dbCmd.Parameters["?userSteam"].Value = "somethingHere";
            dbCmd.Parameters["?userTeam"].Value = "somethingHere";
            dbCmd.Parameters["?targetName"].Value = "-₪EPIC₪- Trickster |CiC.|";
            dbCmd.Parameters["?targetSteam"].Value = "somethingHere";
            dbCmd.Parameters["?targetTeam"].Value = "somethingHere";
            dbCmd.Parameters["?logLine"].Value = "somethingHere";
[14 Apr 2009 6:59] Tonci Grgin
Hi Noah and thanks for your report.

This is actually misuse of charsets, not a bug. So please paste "SHOW CREATE TABLE" output for table in question and for the SP. Mind you, your database is latin1.
It would be nice to see output of "SHOW VARIABLES LIKE "%char%" too.

According to the manual (http://dev.mysql.com/doc/refman/5.0/en/create-procedure.html):

"For character data types, if there is a CHARACTER SET clause in the declaration, the specified character set and its default collation are used. If there is no such clause, the database character set and collation that are in effect at the time the routine is created are used. (These are given by the values of the character_set_database and collation_database system variables.) The COLLATE attribute is not supported (this includes use of BINARY, because in this context BINARY specifies the binary collation of the character set.)."

> Incorrect string value: '\xE2\x82\xAAEPI...' for column 'targetName' at row 1

This actually means that you're trying to write latin1 chars as utf8 stream. You can try slapping charset introducer (_utf8) in front of that string to indicate that you're sending chars in different charset than connection default but this will add to mess you already have. You can check default charset used by looking into general query log on the MySQL server from the moment when you start your test. Should look something like this:
090305 10:06:43   58 Connect	root@QCW2K8.lan on test
		   58 Query	SET NAMES utf8
		   58 Query	SET character_set_results = NULL
		   58 Query	SET SQL_AUTO_IS_NULL = 0

I presume line SET NAMES utf8 reads "SET NAMES latin1" in your case (as table column is latin1) and/or character_set_results is set to something other than NULL preventing auto-detection of charset to be used.
[16 Apr 2009 23:53] Noah
Tonci,

Thank you for your reply.

The reason I originally assumed this was a bug is because I do NOT get the error with MySQL 5.0.x.

The application runs flawlessly on MySQL 5.0.x with the same data.

As soon as I tried the application on MySQL 5.1.x, I got the error.

I am not using a SP, I am using a text query.

If the table is setup to be utf8/utf8_general_ci, why would the schema override that? I thought it was the other way around.

I have my connection explicitly setup to use the utf8 charset set (charset=utf8 in the connection string), and I am inserting into a utf8 charset table. Is that wrong or?

Here is the information you requested:

Result set from SHOW CREATE TABLE tbllogmessages

Table, Create Table

'tbllogmessages',

'CREATE TABLE `tbllogmessages` (
  `lid` int(11) NOT NULL AUTO_INCREMENT,
  `sid` int(11) NOT NULL,
  `messageDT` datetime NOT NULL,
  `messageType` varchar(45) CHARACTER SET latin1 NOT NULL DEFAULT 'Generic',
  `userName` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
  `userSteam` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
  `userTeam` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
  `targetName` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
  `targetSteam` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
  `targetTeam` varchar(45) CHARACTER SET latin1 DEFAULT NULL,
  `logLine` text CHARACTER SET latin1 NOT NULL,
  PRIMARY KEY (`lid`),
  KEY `FK_LogMessages_Sources` (`sid`),
  CONSTRAINT `FK_LogMessages_Sources` FOREIGN KEY (`sid`) REFERENCES `tblsources` (`sid`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=590 DEFAULT CHARSET=utf8'

Result set from SHOW VARIABLES LIKE "%char%"

Variable name, Value

'character_set_client', 'utf8'
'character_set_connection', 'utf8'
'character_set_database', 'latin1'
'character_set_filesystem', 'binary'
'character_set_results', 'utf8'
'character_set_server', 'utf8'
'character_set_system', 'utf8'
'character_sets_dir', 'C:\pathRemoved\MySQL Server 5.1\share\charsets\'

Query log:

090417  0:42:23    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
		   42 Connect	slrs@localhost on sourceloggingdaemon
		   42 Query	SHOW VARIABLES
		   42 Query	SHOW COLLATION
		   42 Query	SET NAMES utf8;SET character_set_results=NULL
		   42 Init DB	sourceloggingdaemon
		   42 Query	SELECT `sid`, `ip`, `port` FROM `sourceloggingdaemon`.`tblsources`
		   42 Init DB	sourceloggingdaemon
090417  0:42:24    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:25    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:26    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:27    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:28    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:29    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:30    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:31    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:32    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:33    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:34    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:35    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:36   42 Query	INSERT INTO tblLogMessages(sid, messageDT, messageType, userName, userSteam, userTeam, targetName, targetSteam, targetTeam, logLine) VALUES(1, '2009-04-17  00:42:36' , 'User', 'Console', 'Console', 'Console', NULL, NULL, NULL ,'\"Console<0><Console><Console>\" say \"test\"')
		    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
090417  0:42:37    3 Query	SHOW GLOBAL STATUS
		    3 Query	SHOW INNODB STATUS
[17 Apr 2009 7:34] Tonci Grgin
Noah, thanks, just what I needed to see:

	'character_set_database', 'latin1'
and
	42 Query	SET NAMES utf8;SET character_set_results=NULL
Let me see what I can dig out on this.
[17 Apr 2009 7:57] Tonci Grgin
Noah, in my opinion, this is *not* a bug and here's why:

'character_set_database' is latin1 which I think happened because you created database and tables with default server settings (character_set_server was latin1). Then you changed character_set_server to UTF8 to match your 5.0 server environment. But this is all not an explanation, just a reconstruction. You problem lies in fact that you mixed latin1/utf8 on table level:
...
  `logLine` text CHARACTER SET latin1 NOT NULL,
...
DEFAULT CHARSET=utf8'

Actually, your utf8 stream is *forced* into latin1 bytes so you have two problems here:
  o Data that did make it from UTF8 stream into latin1 bytes is probably corrupted in larger part.
  o You can not force characters outside bitmap into latin1 column (which is why you posted the report).

Now, you should either remove charset specifiers from columns (  `userTeam` varchar(45) *CHARACTER SET latin1* DEFAULT NULL,) so that table level definition (utf8) kicks in or you should add *CHARACTER SET UTF8* to columns definition.
What I would do is drop the database in question, recreate it with default charset being utf8, recreate tables *without* latin1 in column definitions. I am convinced it will work then.

You can also recheck your schema on MySQL 5.0 server as it's bound to differ from what I see here.
[18 Apr 2009 20:42] Noah
Tonci,

I think you're right :)

I originally created the DB table with the latin1 charset.

I ran a update script that I generated with MySQL Workbench after I switched the table to UTF8. The update script didn't do what I thought it would do.

Thanks for your help.
[18 Apr 2009 20:51] Noah
I just figured it out.

When I was running the SQL sync with MySQL WB, I was using a SQL file where I already modified the character set, so all it did was change the collation.

Since I used that SQL script to update my prod DB, I got the error that I posted.

Thanks again for your help.