Bug #41511 Windows 64 console uses different cyrillic charsets for input and output
Submitted: 16 Dec 2008 14:53 Modified: 18 Dec 2008 10:10
Reporter: Konstantin Yegupov
Status: Duplicate
Category:MySQL Server: Command-line Clients Severity:S3 (Non-critical)
Version:5.1.26-rc-community OS:Windows (x64)
Assigned to: CPU Architecture:Any
Tags: charset, cli, console, cyrillic, russian

[16 Dec 2008 14:53] Konstantin Yegupov
Description:
Historically, Windows with Russian language settings uses the CP866 ("DOS") charset for the console. The 32-bit MySQL CLI respects this convention.

The 64-bit MySQL CLI, however, uses the CP1251 ("Windows") charset for printing values, but CP866 for entering them. This is very inconvenient.

How to repeat:
Can be reproduced using CLI on Windows 64-bit.

The simplest test can be performed like this:

mysql> set names cp866;
Query OK, 0 rows affected (0.00 sec)

mysql> select 'тест';
+------+
| в?бв     |
+------+
| в?бв     |
+------+
1 row in set (0.00 sec)

Note that the string has become corrupted.
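The corrupted output can be reproduced outside MySQL. The following is only a sketch under the assumption made in this report: the 64-bit client treats result bytes as CP1251 before writing them to a CP866 console. The function name show_on_console is hypothetical and is not part of any MySQL tool.

```python
# Sketch of the suspected output path; show_on_console is a hypothetical helper.
def show_on_console(result_bytes: bytes, assumed_cp: str, console_cp: str) -> str:
    """Decode result bytes the way the client assumes, then render on the console."""
    text = result_bytes.decode(assumed_cp)
    # Characters missing from the console code page are rendered as '?'.
    return text.encode(console_cp, errors="replace").decode(console_cp)

# After SET NAMES cp866 the server returns the string as CP866 bytes.
result_bytes = "тест".encode("cp866")

# Client and console agree on CP866 (the behaviour reported for 32-bit).
print(show_on_console(result_bytes, "cp866", "cp866"))   # тест
# Client assumes CP1251 (the behaviour reported for 64-bit).
print(show_on_console(result_bytes, "cp1251", "cp866"))  # в?бв
```

The byte 0xA5 is "е" in CP866 but "Ґ" in CP1251, and "Ґ" does not exist in CP866, which would explain the question mark in the output above.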

The simplest workaround looks like this:

mysql> set names cp866;
Query OK, 0 rows affected (0.00 sec)

mysql> set @a='тест';
Query OK, 0 rows affected (0.00 sec)

mysql> set names cp1251;
Query OK, 0 rows affected (0.00 sec)

mysql> select @a;
+------+
| @a   |
+------+
| тест     |
+------+
1 row in set (0.00 sec)

Note that changing the charset resulted in the string being displayed correctly.
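Why the workaround helps can be modelled in a few lines. This is a sketch, assuming the input arrives as CP866 bytes and the client treats result bytes as CP1251, as described in this report. The function workaround_display is a hypothetical name used only for illustration.

```python
# Sketch of the workaround; workaround_display is a hypothetical helper.
def workaround_display(typed_text: str) -> str:
    # Console input reaches the server as CP866 bytes.
    typed_bytes = typed_text.encode("cp866")
    # SET NAMES cp866; SET @a='...';  the server decodes the input correctly.
    stored = typed_bytes.decode("cp866")
    # SET NAMES cp1251; SELECT @a;    the server sends the result as CP1251 bytes.
    sent = stored.encode("cp1251")
    # Assumption: the 64-bit client treats result bytes as CP1251.
    return sent.decode("cp1251")

print(workaround_display("тест"))  # тест
```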

Suggested fix:
The 64-bit Windows CLI should use CP866 for both input and output.
[18 Dec 2008 9:26] Susanne Ebrecht
Many thanks for pointing this out.

Let me define some terms first, so that you know what I mean:

environment encoding: the encoding your environment is using. You already figured out that it is codepage 866.

client encoding: the encoding that the client expects the environment to use. This always has to be set manually by the user, no matter which software is involved. The server looks at the client encoding and converts data automatically and transparently to and from the encoding the server itself uses. In MySQL, the client encoding is controlled by the variables character_set_client, character_set_connection and character_set_results. You can change these three variables with SET NAMES <charset name>.

The variable character_set_client tells the server which encoding your client is using, that is, which encoding keyboard input arrives in. When this variable doesn't match your environment encoding, output will not be displayed correctly and the server will interpret input incorrectly.
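The effect of a mismatch on the input side can be illustrated without a server. The sketch below is a simplified model, not MySQL's actual conversion code: it assumes the server simply decodes the incoming bytes using character_set_client. The function interpret_input and its parameter names are hypothetical.

```python
# Simplified model of input handling; interpret_input is a hypothetical helper.
def interpret_input(keyboard_text: str, environment_cp: str,
                    character_set_client: str) -> str:
    # The environment turns keystrokes into bytes in its own encoding.
    raw = keyboard_text.encode(environment_cp)
    # The server decodes those bytes according to character_set_client.
    return raw.decode(character_set_client, errors="replace")

# Matching settings: the server sees what was typed.
print(interpret_input("тест", "cp866", "cp866"))   # тест
# Mismatch: CP866 bytes are read as CP1251.
print(interpret_input("тест", "cp866", "cp1251"))  # вҐбв
```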

The variable character_set_results tells the server which character set to encode results in, e.g. the results of SELECT statements.

There are historical reasons why MySQL needs both variables for SELECT results. Usually character_set_results and character_set_client should have the same value.

The variable character_set_connection is mostly used for connectors like JDBC or ODBC.

You already understood that you have to use SET NAMES cp866 on your system. That is totally OK.

That MySQL uses other default values here is just a configuration problem.

You can add this to the client section of my.ini: default-character-set=cp866

You could also use the MySQL Server Instance Config Wizard to change the default charset.

A fourth alternative is to start the CLI with: mysql --default-character-set=cp866

Unfortunately, you hit a bug here anyway. I will let you know the bug number after I have filed the bug, because my tests showed that the codepage charsets aren't working as defaults.
[18 Dec 2008 10:10] Susanne Ebrecht
I will set this bug as a duplicate of bug #41583.

I know this bug report is not a bug and it is not exactly a duplicate, but it is related to the other bug.