Bug #30131 "Incorrect string value" error (1366) when inserting special characters
Submitted: 30 Jul 2007 21:11 Modified: 9 Feb 2010 14:18
Reporter: Andreas E. Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Command-line Clients Severity:S3 (Non-critical)
Version:5.0.45 OS:Microsoft Windows (Server 2003 SP 2)
Assigned to: CPU Architecture:Any
Tags: 1366, client, data, import, incorrect, infile, load, mysql.exe, string, truncated, truncation, umlaut, umlauts, Unicode, utf8, value

[30 Jul 2007 21:11] Andreas E.
Description:
I have read a lot of reports about *similar* issues, but not this one.
The error message "Data too long for column" is obsolete; now you get a similar error message when trying to insert german word "Vögel" (birds): Incorrect string value '\x94gel ...'

Also refer to 16209 which does not output the same error message, but the truncation of records is exactly the same as with the current version.

How to repeat:
Here is a step-by-step walkthrough how to reproduce the issue with the mysql command-line client.

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 25
Server version: 5.0.45-community-nt MySQL Community Edition (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> set names utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> create database spiel0;
Query OK, 1 row affected (0.00 sec)

mysql> use spiel0
Database changed
mysql> create table umlauttest (ID bigint(20) not null auto_increment primary key, term varchar(255));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into umlauttest values (1, 'Vögel');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------------------------+
| Level   | Code | Message                                                      |
+---------+------+--------------------------------------------------------------+
| Warning | 1366 | Incorrect string value: '\x94gel' for column 'term' at row 1 |
+---------+------+--------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select * from umlauttest;
+----+------+
| ID | term |
+----+------+
|  1 | V    |
+----+------+
1 row in set (0.00 sec)

mysql> status
--------------
mysql  Ver 14.12 Distrib 5.0.45, for Win32 (ia32)

Connection id:          25
Current database:       spiel0
Current user:           root@localhost
SSL:                    Not in use
Using delimiter:        ;
Server version:         5.0.45-community-nt MySQL Community Edition (GPL)
Protocol version:       10
Connection:             localhost via TCP/IP
Insert id:              1
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
TCP port:               3306
Uptime:                 32 min 33 sec

Threads: 3  Questions: 3185  Slow queries: 0  Opens: 22  Flush tables: 2  Open tables: 3  Queries per second avg: 1
.631
--------------

mysql> show variables like 'char%';
+--------------------------+---------------------------------------------+
| Variable_name            | Value                                       |
+--------------------------+---------------------------------------------+
| character_set_client     | utf8                                        |
| character_set_connection | utf8                                        |
| character_set_database   | utf8                                        |
| character_set_filesystem | binary                                      |
| character_set_results    | utf8                                        |
| character_set_server     | utf8                                        |
| character_set_system     | utf8                                        |
| character_sets_dir       | T:\Tools\Productivity\MySQL\share\charsets\ |
+--------------------------+---------------------------------------------+
8 rows in set (0.00 sec)

mysql>
[30 Jul 2007 21:17] Andreas E.
(from my.ini)

[mysql]

default-character-set=utf8

[mysqld]

# The TCP/IP Port the MySQL Server will listen on
port=3306

default-character-set=utf8
character_set_server=utf8
[30 Jul 2007 21:24] Sveta Smirnova
Thank you for the report.

Please p;rovide output of command `chcp` running in cmd.exe
[1 Aug 2007 20:32] Andreas E.
Active code page: 850
[1 Aug 2007 20:49] Sveta Smirnova
Thank you for the feedback.

You really insert characters in latin1 encoding. (See Microsoft code page description at http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/chcp.mspx?m...)

So this is not a MySQL bug. Use appropriate code page or MySQL Query Browser and text files if you want to use UTF8.
[5 Nov 2007 6:13] Bipin Kadam
First of all, does by changing code page anyone got solution.
I got same problem as explained as "Incorrect string value: '\xEF\xBF\xBD ex...' for column 'Description' at row 2"
and My description column is ="Centrally located, the Stratosphere Casino Hotel and Tower is at the top of the Las Vegas Strip and the closest Strip property to Downtown Las Vegas and the Fremont Street Experience.  Please note: Stratosphere Casino, Hotel & Tower has implemented a Resort Fee of $5.00 per room, per night. The new Resort Fee will offer guests added values which will enhance their stay at the Stratosphere, including:   Free daily admission to the fitness center and 8th floor resort style pool which features the longest pool hours and friendliest fun spot in Las Vegas, 2 for the price of 1 American Superstars Tickets, Showroom Dark on Thursday, 2 for the price of 1 Bite Tickets, Showroom Dark on Thursday, 2 Viva Las Vegas tickets, afternoon show at 2pm and 4pm – excludes Sunday when the showroom is dark (admission only, no drinks included), Free admission to the Tower Viewing Deck (10am – 2pm), Free use of laundry and drying facilities on 6th floor (does not include soap and other laundry products available separately I vending machines), Welcome book value of $150 for use throughout the Stratosphere Casino, Hotel & Tower, Guaranteed guest valet parking".
When I open inserts script file in textpad i got warning as "WARNING: "Inserts.sql" contains characters that do not exist in code page 1252  (ANSI - Latin I). They will be converted to the system default character, if you click OK." 
Note that-> When I insert row using query manually it will work. 
Can anyone please help me on this. How to solve this problem?
Also from mysql client I got this:
"mysql> show variables like 'char%';
+--------------------------+----------------------------------------------------
-----+
| Variable_name            | Value
     |
+--------------------------+----------------------------------------------------
-----+
| character_set_client     | latin1
     |
| character_set_connection | latin1
     |
| character_set_database   | latin1
     |
| character_set_filesystem | binary
     |
| character_set_results    | latin1
     |
| character_set_server     | latin1
     |
| character_set_system     | utf8
     |
| character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.0\share\chars
ets\ |
"
[6 Dec 2007 19:30] a b
I do need to store characters that are not in my code page 437. In order to do that I use concat + char in my Java program:
mysql> insert into CUSTOMERS values (CONCAT(CHAR(65),CHAR(118),CHAR(105),CHAR(97),CHAR(100),CHAR(32),CHAR(66),CHAR(114),CHAR(111),CHAR(115),CHAR(104),CHAR(105)),CONCAT(CHAR(97),CHAR(118),CHAR(105),CHAR(97),CHAR(100),CHAR(64),CHAR(122),CHAR(122),CHAR(46),CHAR(99),CHAR(111),CHAR(109)),'3',CONCAT(CHAR(55),CHAR(48),CHAR(32),CHAR(83),CHAR(101),CHAR(99),CHAR(111),CHAR(114),CHAR(32),CHAR(82),CHAR(111),CHAR(97),CHAR(100)),null,CONCAT(CHAR(83),CHAR(99),CHAR(97),CHAR(114),CHAR(115),CHAR(100),CHAR(97),CHAR(108),CHAR(101)),CONCAT(CHAR(78),CHAR(89)),CONCAT(CHAR(85),CHAR(83),CHAR(32),CHAR(85),CHAR(110),CHAR(105),CHAR(116),CHAR(101),CHAR(100),CHAR(32),CHAR(83),CHAR(116),CHAR(97),CHAR(116),CHAR(101),CHAR(115)),CONCAT(CHAR(49),CHAR(48),CHAR(53),CHAR(56),CHAR(51),CHAR(32)),CONCAT(CHAR(40),CHAR(57),CHAR(49),CHAR(52),CHAR(41),CHAR(32),CHAR(52),CHAR(55),CHAR(50),CHAR(45),CHAR(55),CHAR(56),CHAR(57),CHAR(54)),CONCAT(CHAR(143),CHAR(172),CHAR(149),CHAR(233),CHAR(140),CHAR(246),CHAR(139),CHAR(103)),'N');
This technique works just fine in Oracle. mySQL reports:
ERROR 1366 (HY000): Incorrect string value: '\x8F\xAC\x95\xE9\x8C\xF6...' for column ...
I believe the bug has been closed prematurely.
BTW:
mysql> status
--------------
C:\Program Files\MySQL\MySQL Server 5.1\bin\mysql.exe  Ver 14.14 Distrib 5.1.22-rc, for Win32 (ia32)
Connection id:          89
Current database:       custdata
Current user:           cust@localhost
SSL:                    Not in use
Using delimiter:        ;
Server version:         5.1.22-rc-community MySQL Community Server (GPL)
Protocol version:       10
Connection:             localhost via TCP/IP
Server characterset:    utf8
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8
TCP port:               3306
Uptime:                 3 days 22 hours 26 min 23 sec
Threads: 1  Questions: 546  Slow queries: 0  Opens: 69  Flush tables: 1  Open tables: 0  Queries per second avg: 0.1
[8 Feb 2010 15:39] Andreas E.
You can say that again.

Almost 3 years later I'm coming back here, and nothing works from the console!! Sveta, in all honesty:
I don't think it's justified to say "Use an external tool to insert your data if you want to use UTF8".
It *MUST* be working from the mysql.exe console. And if it doesn't, it may maybe not called a bug but at least there should be a way to force UTF8 input and also inserting (test) data instead of getting this error.

But I have to repeat again: once it comes to special country characters, MySQL does not care, claiming "everything works alright". (Which is true, because for en_US/en_UK, you can't run into any trouble.)
It was like this back in 2003, and it's quite the same to this day.
[8 Feb 2010 16:20] Tonci Grgin
On my OpenSolaris box:
create table umlauttest (ID bigint(20) not null auto_increment primary key, term varchar(255));
insert into umlauttest values (1, 'Vögel')

select * from umlauttest;
+----+--------+
| ID | term   |
+----+--------+
|  1 | Vögel | 
+----+--------+
1 row in set (0.00 sec)

show create table umlauttest;
+---------------------------+
| Table      | Create Table |
+---------------------------+
| umlauttest | CREATE TABLE `umlauttest` (
  `ID` bigint(20) NOT NULL AUTO_INCREMENT,
  `term` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 | 
+---------------------------+
1 row in set (0.00 sec)
[8 Feb 2010 16:36] Andreas E.
OK me again!

After HOURS of searching several sources, I've finally pieced together a solution I can live with (for now):

1. do a 'chcp 1252' before starting mysql console client

2. set font to 'Lucida Console' in command prompt (it's really a crappy font, but you CANNOT use raster font mind you, because it does not support code page 1252!)

3. enjoy your working queries in mysql console!
[8 Feb 2010 16:44] Tonci Grgin
Andreas, I will now uncover my previously hidden comment. Using *proper* UTF8 console yields proper results. By no means does Win cmd.exe resemble a "proper console". Thus the need for such workarounds. And there might be more problems lurking there.

Re-closing the report as this has nothing to do with MySQL. Seemingly, people expect Windows to "just work". Well, they do not, in many cases.
[8 Feb 2010 16:55] Andreas E.
Tonci, I understood your ironic undertone very well, Sir.
But I think this is not the place to propagate your alternative OS by shouting "well on *my* OS it works". What a smart guy you are! I know very well it works perfectly on Linux, since I tested it there, too.

But since I encountered a Windows specific problem, I set OS to MS Windows Server 03.
With a *certain* reason, even.

And now please stop your anti-Windows rant, if you please.
(What are you doing in a Windows-only thread anyway? Go and be happy with your Solaris, I'd say.)
[8 Feb 2010 17:02] Tonci Grgin
Andreas, to the contrary... I'll quote myself:
"Seemingly, people expect Windows to "just work". Well, they do not, in many cases."

There is nothing offending here, at least I don't see it. This is just a fact. Further more, I suck at Linux and have spent my whole career on windows (yes, even now I'm writing on W2K8x64 box). There are bad and there are good things, depending on purpose.
But the fact remains, people expect Windows to "just work".

Finally, each and every one of us needs to familiarize it's self with tools used, be it OS or something else.

PS: You did not file bug at Microsoft, right ;-)
[9 Feb 2010 14:18] Andreas E.
No :) They don't care about bug reports either (such as MySQL AB at times ;))

OK I did even some more tests, and I say: IT IS A BUG.

Strangely enough I only need this codepage-1252 workaround if I have a database that already EXISTS.
This database was created with an external CREATE TABLE like

C:> mysql -u root -p < import.sql

All right up to this point.

With 850 set and normal raster font and everything possible set to utf8, I can create a database and a table and even use umlauts! INSIDE mysql.exe though. Then I don't get any HY... error anymore!

-------------------------------------------------------------------
The bug must be somewhere in the 'load data local infile' routine!
-------------------------------------------------------------------

The *SAME* database, as long it is not yet treated by load data infile, WORKS! Umlauts and all. Needs no workarounds with codepage 1252 and whatnot.
At the time 'load data local infile' comes in, the "destruction" begins:

a german "ü" gets transformed to ³ in the process, and the hex value is '0xFC', which means load data infile has been working internally in latin-1 mode!!
Even putting 'character set utf8' between 'into table' and 'fields' did NOT work. I also tried to CONVERT the input file itself to UTF8 and even this did not work.

Well, now as I have the actual culprit I'm really pondering about a new bug report.
[9 Feb 2010 16:03] Tonci Grgin
Andreas, what you described is actually not a bug. I'll ask Susanne to provide you with broader explanation.
[10 Feb 2010 9:45] Susanne Ebrecht
Andreas,

because you wrote the word "Vögel" I will now hope you are able to understand German.

I once blogged about that topic in German.

http://www.miracee.org/2009/03/30/encoding-handling-mysql-cli-5858398/
[18 Oct 2010 2:23] Caglar Orhan
Hi, now 2010 using 5.1.45 version of MySQL and there is no satisfying answer from the team. I found some thing ichanged Engine of my tables InnoDB to MyISAM  and MySQL accept my utf-8 chars.
May be that helps.
Regards
Caglaror
[7 Jun 2011 12:36] Kornel Makuszynski
I have 5.1.50-community version and same problem (with polish characters). Changing to MyISAM and changing everything (default collations, each of text fields in tables collations) to utf8_bin worked for me.
[22 Jul 2011 1:58] Shuiqing Li
I got same same error.
My question is: is there any way can skip this error and save the data into the table although the string is incorrect?
[9 Dec 2011 0:07] Arkadiy Kulev
This problem solves easily. Don't forget to not only set the database, table and collation to utf8, BUT THE COLUMNS ALSO!

That's what cause the problem for me. I created the table in latin, then switch to utf8, but forgot to also change the columns.