Bug #13755 use of "set names" for unicode is required but not documented
Submitted: 4 Oct 2005 21:01
Modified: 16 Jan 2006 18:59
Reporter: Daniel McBrearty
Status: Closed
Category: MySQL Server: Documentation
Severity: S2 (Serious)
Version: 4.1
OS: Linux (Linux)
Assigned to: Paul DuBois
CPU Architecture: Any

[4 Oct 2005 21:01] Daniel McBrearty
Description:
When I set up a table in 4.1 and give it a default charset of utf8, then enter valid utf8 values, mysql alters some of those values into non-utf8 values without telling me.

The cure is to issue the "set names 'utf8'" statement at the start of the session.

Once this is done everything works fine.

But this requirement is not in the online manuals. I could not even find a description of "set names". It was only thanks to an external blogger's site that I was able to fix the problem.

The user could be forgiven for thinking that a table defined with default charset utf8 will happily store and play back utf8 values.

Also see forum post:

http://forums.mysql.com/read.php?103,46870,47245#msg-47245

How to repeat:
see above

Suggested fix:
Please state the need to use "set names" in big letters in the unicode section of the manual. Make it obvious.

Also document "set names" and what it does.
[4 Oct 2005 21:10] MySQL Verification Team
Please see:

http://dev.mysql.com/doc/mysql/en/charset-connection.html

"There are two statements that affect the connection character sets:

SET NAMES 'charset_name'
SET CHARACTER SET charset_name

SET NAMES indicates what is in the SQL statements that the client sends. Thus, SET NAMES 'cp1251' tells the server “future incoming messages from this client are in character set cp1251.” It also specifies the character set for results that the server sends back to the client. (For example, it indicates what character set column values are if you use a SELECT statement.)

A SET NAMES 'x' statement is equivalent to these three statements:

mysql> SET character_set_client = x;
mysql> SET character_set_results = x;
mysql> SET character_set_connection = x;"
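
Editorial illustration: the silent alteration described in the report can be modelled as the server decoding the client's bytes with the wrong character set before storing them. The sketch below is a simplified model, not MySQL's actual code. It assumes the connection character set was latin1 while the client really sent UTF-8, and it uses Python's "latin-1" codec as a stand-in for MySQL's latin1. The `store` function is a hypothetical helper written for this example.

```python
# Simplified model of the connection-charset conversion (not MySQL code).
# Assumption: without SET NAMES, the server treats incoming bytes as latin1.

def store(client_bytes: bytes, connection_charset: str) -> bytes:
    """Return the bytes that would land in a utf8 column.

    The server decodes the incoming bytes using the character set it
    believes the client uses, then re-encodes them as UTF-8 for storage.
    """
    return client_bytes.decode(connection_charset).encode("utf-8")

sent = "\u00e9".encode("utf-8")      # the client sends UTF-8 for e-acute
wrong = store(sent, "latin-1")       # server assumes latin1: double-encoded
right = store(sent, "utf-8")         # server told the truth: stored unchanged

print(sent, wrong, right)
```

In this model the two-byte UTF-8 sequence is stored as four bytes when the connection character set is wrong, and is stored unchanged once the server is told the client speaks UTF-8, which is what SET NAMES 'utf8' does.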
[4 Oct 2005 21:29] Daniel McBrearty
OK. Thanks for pointing me to that. However, if I go to the search box for the documentation and type "set names" in quotes, that page does not come up anywhere near the top! And there is no entry for "set names" in chapter 13 of the manual. So the two most common ways of finding the info (using search and using the index) don't work.

But the main point is this: the documentation does not make it clear that the client and server will make these translations. This is a pretty important point, and, while it may be apparent on a detailed reading of the manual, I think that you should consider clarifying the situation for new users by drawing attention to this feature early in the chapter on character sets.
[4 Oct 2005 21:34] Daniel McBrearty
For example, the opening of Chapter 10 could read:

<existing text>
 MySQL 4.1 and newer can do these things for you:
      Store strings using a variety of character sets
      Compare strings using a variety of collations
      Mix strings with different character sets or collations in the same server, the same database, or even the same table
      Allow specification of character set and collation at any level
</existing text>
<add>
NOTE : the mysql client and server programs may also make translations of character set at their input and output connections. This means that it is not enough merely to specify the default charset of your tables; you also need to make sure that both of these programs are configured appropriately.
See (link to appropriate page).
</add>

There -  I even wrote part of it for you.
[5 Oct 2005 0:23] MySQL Verification Team
OK, I must agree with you that this information isn't easy to find, so
I am changing the category to Documentation so that our Doc team can
give their position on it.

Thank you for the feedback and for the request.
[10 Oct 2005 10:37] Daniel McBrearty
thanks a lot Miguel! I'm glad that this one will get fixed.
[16 Jan 2006 18:59] Paul DuBois
Thank you for your bug report. This issue has been addressed in the
documentation. The updated documentation will appear on our website
shortly, and will be included in the next release of the relevant
product(s).

Additional info:

Updated http://dev.mysql.com/doc/refman/5.0/en/charset.html
to point out the problem, with a cross-reference to the section
that discusses the connection system variables.