Bug #51504 Wrong encoding (charset) in function return
Submitted: 25 Feb 2010 14:09 Modified: 26 Feb 2010 11:07
Reporter: Anrii Pertsiukh Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0.77 OS:Linux (CentOS)
Assigned to: CPU Architecture:Any
Tags: charset

[25 Feb 2010 14:09] Anrii Pertsiukh
Description:
    * Server version: 5.0.77
    * Protocol version: 10
    * Server: Localhost via UNIX socket
    * User: root@localhost
    * MySQL charset: UTF-8 Unicode (utf8)
phpMyAdmin - 2.11.9.5
    * MySQL client version: 5.0.77

DROP FUNCTION `generator`//
CREATE DEFINER=`root`@`localhost` FUNCTION `generator`(p_uid int) RETURNS text CHARSET utf8
    MODIFIES SQL DATA
    SQL SECURITY INVOKER
BEGIN
  DECLARE l_c, l_rnd integer default 123;

  DECLARE l_str VARCHAR (256) default '';

SELECT count(*) into l_c  FROM agent a 
  WHERE a.`agent_uin`=p_uid;

set l_str=concat('Здесь проблема с кодировкой если не добавлять число\n', 1);

If l_c=1 then
  set l_rnd=9;
  Update TableName 
    set FieldName1=l_rnd, FieldName12=now()
    WHERE FieldName3=p_uid;
  set l_str=concat('Здесь тоже:', l_rnd);
end if;

return l_str;
END

If run the function generator you will receive return string in different charsets 

How to repeat:
Using this function you will receive string with UTF8 encoding, and it is right, but if in string:

set l_str=concat('Здесь проблема с кодировкой если не добавлять число\n', 1);

 you will delete concat or change 1 on any Cyrillic symbol like this:

set l_str='Здесь проблема с кодировкой если не добавлять число\n';

or

set l_str=concat('Здесь проблема с кодировкой если не добавлять число\n', 'пример');

you will receive string with UTF8 charset in cp1251 encoding if your browser use cp1251
[25 Feb 2010 14:28] Valeriy Kravchuk
Thank you for the problem report. Please, send the results of:

show variables like 'char%';
show variables like 'coll%';

from phpMyAdmin.
[25 Feb 2010 15:55] Anrii Pertsiukh
show variables like 'char%';

Variable_name 	Value

character_set_client 	utf8

character_set_connection 	utf8

character_set_database 	utf8

character_set_filesystem 	binary

character_set_results 	utf8

character_set_server 	latin1

character_set_system 	utf8

character_sets_dir 	/usr/share/mysql/charsets/

show variables like 'coll%';

Variable_name 	Value

collation_connection 	utf8_unicode_ci

collation_database 	utf8_unicode_ci

collation_server 	latin1_swedish_ci
[26 Feb 2010 11:07] Susanne Ebrecht
Hello Anrii,

many thanks for writing a bug report.
Unfortunately, this is not a bug.

Charsets/Encodings is a quite similar to speak with foreigners.

You are Russian and I am German.

Your default language is Russian and mine German.

Additonally, my boss is from US and has default language English.

Let us imagine you are the environment ... e.g. the browser or even the terminal. I am the MySQL client and my boss is the server.

You start communicating with me.

I didn't get an information about what language you are talking in so I expect per default that all input that I will get is in German.

I have the information that my boss speaks English ... and so I will try to translate whatever you told me into English.

I will totally fail when the input I got from you was in Russian.

I could maybe translate "ananas" into "pineapple" but you will get tons of error message from me for most of the other words.

How do you solve this?

Very easy ...
We both first negotiate a language that we both are able to speak/understand

Means you will ask me: "Russki?"
Me: "No"
You: "English?"
Me: "Yes"

So we decide talking English with each other. And I am able to translate English into English for my boss and so he will get English from me.

The same we have between environment and client.

Your environment speaks cp1251 but your client thinks it is utf8 and so it has no chance to translate it proper to whatever server wants.

You can easy figure out what your client expect from your environment by using:

SELECT variables like 'char%';

The variables character_set_client, character_set_connection and character_set_result will tell you what your client expect that your environment is speaking.

If your environment is speaking a different encoding then you need to let the client know which encoding/charset your environment is speaking.

In your example it speaks cp1251 ... so you can use:

SET NAMES cp1251;

The second possibility is that you tell your environment that it should use the language that the client expect.

Means change your browser encoding to utf8.

More or minder it is the same when we want to talk to each other. When we both would be able to speak Russian we could talk in Russian. Unfortunately, I don't know Russian. When we both were able to speak German we could use German. I don't know if you are able to speak German.
We are both able to speak English ... that is why we both are talking in English.

You always need to find a common encoding/charset for environment and client.

Which charset the client is able to interpret you will see with:
SHOW CHARSET;