Bug #31162 Special character issue with Command Line on Vista
Submitted: 23 Sep 2007 20:14 Modified: 5 Mar 2009 18:29
Reporter: Peter Laursen (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S4 (Feature request)
Version:5.0.45 - any? OS:Microsoft Windows
Assigned to: Paul Dubois CPU Architecture:Any
Tags: qc
Triage: Triaged: D4 (Minor)

[23 Sep 2007 20:14] Peter Laursen
Description:
This may be client issue on Vista only!  I only have such system currently - no other Windows and no *nix!

I give it S1 category because if we cannot rely on Command Line client, using and developing for MySQL is practically impossible!

1)
CREATE TABLE `nordic` (                                  
          `id` bigint(20) NOT NULL auto_increment,               
          `field1` varchar(50) default NULL,                     
          `field2` varchar(50) default NULL,                     
          PRIMARY KEY  (`id`)                                    
        ) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1

2)
Now From command line do

insert into `nordic`(`field1`,`field2`) values ('Lars','Åsebø');

Now from a another client compiled with C-API/libmysql.dll (Query Browser, SQLyog) also do

insert into `nordic`(`field1`,`field2`) values ('Lars','Åsebø');

3)
Now from both clients do

Select * from Nordic

4) 

Nordic letters only display correctly in the client where they were entered!
If is definitely SQLyog/Query Browser that are correct and command line that is wrong here!

This is returned by command line after inserting (in that order) with SQLyog, QB and command line

mysql> select * from nordic;
+----+--------+--------+
| id | field1 | field2 |
+----+--------+--------+
|  1 | Lars   | ┼seb°  |
|  2 | Lars   | ┼seb°  |
|  3 | Lars   | Åsebø  |
+----+--------+--------+
3 rows in set (0.00 sec)

This is what the 2 graphical clients return:

    id  field1  field2  
------  ------  --------
     1  Lars    Åsebø 
     2  Lars    Åsebø 
     3  Lars    seb›

How to repeat:
See above. 

Additionally on commanline client

mysql> show variables like '%character%';
+--------------------------+----------------------------------------------------
-----+
| Variable_name            | Value
     |
+--------------------------+----------------------------------------------------
-----+
| character_set_client     | latin1
     |
| character_set_connection | latin1
     |
| character_set_database   | latin1
     |
| character_set_filesystem | binary
     |
| character_set_results    | latin1
     |
| character_set_server     | latin1
     |
| character_set_system     | utf8
     |
| character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.0\share\chars
ets\ |
+--------------------------+----------------------------------------------------
-----+
8 rows in set (0.00 sec)

mysql> insert into nordic (field1,field2) values ('Lars','Åsebø');
Query OK, 1 row affected (0.05 sec)

mysql> select * from nordic;
+----+--------+--------+
| id | field1 | field2 |
+----+--------+--------+
|  7 | Lars   | ┼seb°  |
|  8 | Lars   | ┼seb°  |
|  9 | Lars   | Åsebø  |
+----+--------+--------+
3 rows in set (0.00 sec)

mysql> show variables like 'character_set_client';
+----------------------+--------+
| Variable_name        | Value  |
+----------------------+--------+
| character_set_client | latin1 |
+----------------------+--------+
1 row in set (0.00 sec)

mysql> show variables like '%character_set_client%';
+----------------------+--------+
| Variable_name        | Value  |
+----------------------+--------+
| character_set_client | latin1 |
+----------------------+--------+
1 row in set (0.00 sec)

mysql> show variables like '%character%';
+--------------------------+----------------------------------------------------
-----+
| Variable_name            | Value
     |
+--------------------------+----------------------------------------------------
-----+
| character_set_client     | latin1
     |
| character_set_connection | latin1
     |
| character_set_database   | latin1
     |
| character_set_filesystem | binary
     |
| character_set_results    | latin1
     |
| character_set_server     | latin1
     |
| character_set_system     | utf8
     |
| character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.0\share\chars
ets\ |
+--------------------------+----------------------------------------------------
-----+
8 rows in set (0.00 sec)

On SQLyog 6.0 (that explicitly does "SET NAMES utf8" after connection) the same returns

Variable_name             Value                                                  
------------------------  -------------------------------------------------------
character_set_client      utf8                                                   
character_set_connection  utf8                                                   
character_set_database    latin1                                                 
character_set_filesystem  binary                                                 
character_set_results     utf8                                                   
character_set_server      latin1                                                 
character_set_system      utf8                                                   
character_sets_dir        C:\Program Files\MySQL\MySQL Server 5.0\share\charsets\

(On QB nothing is returned for that SHOW statement .. (is that by design?) so I can't paste it)

.. but charset for client, connection and rsult should really not matter as thos Nordic charcaters are valid latin1 and utf8 characters!

Suggested fix:
Do the nordic characters get inserted wrong in database from command line? 
What is happening? Does it SET NAMES wrong?
[23 Sep 2007 20:24] Peter Laursen
please replace the 'how to repeat paragraph! with this

Additionally on commanline client

mysql> show variables like '%character%';
+--------------------------+----------------------------------------------------
-----+
| Variable_name            | Value
     |
+--------------------------+----------------------------------------------------
-----+
| character_set_client     | latin1
     |
| character_set_connection | latin1
     |
| character_set_database   | latin1
     |
| character_set_filesystem | binary
     |
| character_set_results    | latin1
     |
| character_set_server     | latin1
     |
| character_set_system     | utf8
     |
| character_sets_dir       | C:\Program Files\MySQL\MySQL Server 5.0\share\chars
ets\ |
+--------------------------+----------------------------------------------------
-----+
8 rows in set (0.00 sec)

On SQLyog 6.0 (that explicitly does "SET NAMES utf8" after connection) the same returns

Variable_name             Value                                                  
------------------------  -------------------------------------------------------
character_set_client      utf8                                                   
character_set_connection  utf8                                                   
character_set_database    latin1                                                 
character_set_filesystem  binary                                                 
character_set_results     utf8                                                   
character_set_server      latin1                                                 
character_set_system      utf8                                                   
character_sets_dir        C:\Program Files\MySQL\MySQL Server 5.0\share\charsets\

(On QB nothing is returned for that SHOW statement .. (is that by design?) so I can't
paste it!)

.. but charset for client, connection and result should really not matter as those Nordic charcaters are both valid latin1 and utf8 characters!  It should be stored correctly and displayed correctly as long as the characters are supported by the charset specified with SET NAMES.

BTW: My system is a Danish Windows Vista! 
I am also surprised that character_set_system is 'utf8' 
-- but that could be due to my lack of knowledge!
[23 Sep 2007 20:43] Peter Laursen
on command line client:

mysql> set names utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into nordic (field1,field2) values ('Lars','Åsebø');
ERROR 1366 (HY000): Incorrect string value: '\x8Fseb\x9B' for column 'field2' at
 row 1
mysql>

but that is probably another issue .. that the command line client does not support unicode at all!
[25 Sep 2007 9:24] Peter Laursen
Now also reproduced on Win XP and MySQL 5.0.28.

Obviously command line and libmysql.dll store data different!

But what about other connectors?  PHP uses libmysql.dll, 
... but ODBC, JDBC, .NET, Perl, Python ??
[25 Sep 2007 10:03] Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

This is limitation of Windows cmd.exe, not mysql command line client. Play with chcp command to get correct results for latin1 charset.
[25 Sep 2007 10:34] Peter Laursen
This is a very silly reply in my opinion.

I never asked you to implement the client on Windows in cmd.exe.
If cmd.exe is not appropriate you should implement with some other tool!
[25 Sep 2007 10:54] Peter Laursen
BTW: As I will have to "Play with chcp command" should I then realize that MySQL is a TOY?
[25 Sep 2007 12:27] Peter Laursen
I have to protect our customers against this issue with command line on Windows and have posted this to our FAQ:

http://webyog.com/faq/content/34/156/en/special-characters-entered-from-_mysql_-command-li...
[25 Sep 2007 21:02] Peter Laursen
I take myself the freedom to change to a documentation request entitled:

"How to use the 'mysql' client with special characers/localized charset on Windows"  

I think this is not documented (though I would prefer a more intelligent solution than just documentation!)?

The background is that we had a very serious complaint in our support system from a customer claiming that our program has corrupted his data.  He attached results from command line as documentation.  But on the opposite it came out, that he had corrupted his own data after 'correcting' from command line!

Only after verifying that Query Browser and SQLyog were in accordance, we realised that it was the command line and libmsql.dll that stored differently to the database.

Frankly I have also not been able to use what you replied here and in our FAQ.  I can change the codepage from command.exe, but I still cannot make it work with Nordic characters (æøåÆØÅ) in mysql client.  

If this is not a bug in terms of code, it still is in terms of usability! I am pretty confident that before such docs have been completed also MySQL AB will realize that this situation is not 'sustainable'!
[25 Sep 2007 21:32] Miguel Solorzano
Thank you for the feedback. I changed to Documentation category.
[14 Jan 2009 17:23] Paul Dubois
Given the fix to Bug#29323, is this report still a live issue?
[5 Mar 2009 8:45] Peter Laursen
This is an old report now! I hardly remember details!

I will try to explain as I remember.  'cmd.exe' is (after Windows 98) not a *fullblooded* (unix-type) command shell for the OS - basically it is an application only. It has (among other limitations) the limitation that multilingualism is not possible (I cannot enter Hindi characters in my Danish system etc.). 

But the problem is then that that seriously affects 'mysql' command line.  If 'mysql' was supposed to be the *ultimate* client interface for testing the server functionalities, it simply fails to do so on Windows! From MySQL 4.1 (with unicode support introduced) and onwards 'cmd.exe' was never fully functional as a client interface for the MySQL server. We need something better 'standard client interface' on Windows ideally.

That was my point.  It was badly formulated and I have learned since then. So you can close as you like.  But I hope that somebody will think a litle about 'cmd.exe' limitations for use with 'mysql'.

*not a bug* or *feature request for full unicode support in the 'mysql' client on Windows* - you can decide!
[5 Mar 2009 9:17] Peter Laursen
If you want to document anything it could be something like.

"the 'mysql' command line client is implemented in the shell or 'shell-like' interface avalable on each platform.  Or certain platforms that may result in limiations like <list> ..
[5 Mar 2009 9:53] Peter Laursen
If you want to document anything it could be something like.

"the 'mysql' command line client is implemented in the shell or 'shell-like' interface avalable on each platform.  Or certain platforms that may result in limiations like <list> ..
[5 Mar 2009 18:29] Paul Dubois
Peter, thanks for the feedback. I've decided to close this without action.
[3 May 2010 16:26] John Fisher
I too had problems with the mysql.exe client and garbled characters. The solution I found was to set the character encoding to cp850 (the character page used in Western-European Windows). For Nordic text it should be 'SET NAMES cp865;'. I got cp865 (Nordic) from here: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/chcp.mspx?m...

It would be helpful to clearly document the fact that the Windows' mysql.exe client doesn't support UTF-8 (due to cmd.exe not supporting it). Under Windows the character set should be set to whatever the command 'chcp' (executed from a cmd.exe prompt) returns.
[3 May 2010 18:41] Shane Bester
the 5.5.3 client does this already:

H:\mysql\5.5\5.5.3\mysql-5.5.3-m3-win32\bin>mysql --no-defaults
<cut>
Connection:             localhost via TCP/IP
Server characterset:    latin1
Db     characterset:    latin1
Client characterset:    cp850
Conn.  characterset:    cp850
TCP port:               3306
Uptime:                 1 hour 5 min 1 sec