Bug #12120 MySQL client always sends built-in charset when connecting
Submitted: 22 Jul 2005 17:10
Modified: 3 Aug 2005 16:41
Reporter: Alexander Drozdov
Status: Open
Category: MySQL Server: C API (client library)
Severity: S4 (Feature request)
Version: 4.1
OS: Linux
Assigned to:
CPU Architecture: Any

[22 Jul 2005 17:10] Alexander Drozdov
Description:
With a standard build, both the MySQL server and the client use latin1 as their charset. We can set the character-set-server (AKA default_character_set) variable to, for example, utf8.

In this case the output of "show global variables like 'char%';" shows utf8 for every charset variable. But the MySQL client library (libmysqlclient) doesn't read the my.cnf file and ALWAYS sends "latin1" when connecting to the server. As a result, "show variables like 'char%';" shows that character_set_client, character_set_connection and character_set_results are "latin1".

The MySQL server accepts an empty charset from the MySQL client, and in that case the session's character_set_* variables fall back to the server's global values, i.e. the "default_character_set" setting (see sql_parse.cc:817):
    /*
      Use server character set and collation if
      - client has not specified a character set
      - client character set is the same as the servers
      - client character set doesn't exists in server
    */
    if (!(thd->variables.character_set_client=
          get_charset((uint) net->read_pos[8], MYF(0))) ||
        !my_strcasecmp(&my_charset_latin1,
                       global_system_variables.character_set_client->name,
                       thd->variables.character_set_client->name))
    {
      thd->variables.character_set_client=
        global_system_variables.character_set_client;
      thd->variables.collation_connection=
        global_system_variables.collation_connection;
      thd->variables.character_set_results=
        global_system_variables.character_set_results;
    }
    else
    {
      thd->variables.character_set_results=
      thd->variables.collation_connection= 
        thd->variables.character_set_client;
    }

The MySQL client library is used not only by the "mysql" program (which reads the [client] and [mysql] sections), but also by other programs, for example PHP.

There is no way to set a default charset on the server side without recompiling something.

How to repeat:
1. Set the default_character_set variable in the [mysqld] section of the my.cnf file;
2. Connect to the server from a PHP or C program and call:
   2.1. mysql_real_connect, just to connect to the server;
   2.2. mysql_query with "show variables like 'char%'".

Suggested fix:
Either do not send the built-in charset from the MySQL client to the MySQL server when connecting, OR read the my.cnf file from the "default" location before trying to connect.
[22 Jul 2005 19:51] Alexander Drozdov
As a workaround you can set the init_connect variable on the server side to "SET NAMES utf8;", but IMHO it is the wrong way to solve the problem.
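
For reference, the workaround would look roughly like this in my.cnf (a sketch that assumes the server should default to utf8; adjust the charset name as needed):

```ini
[mysqld]
default_character_set=utf8
# Executed by the server for each connecting client
init_connect='SET NAMES utf8'
```

Note that the server does not execute init_connect for users that have the SUPER privilege, so connections made as root will still show the charset sent by the client.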
[22 Jul 2005 21:31] MySQL Verification Team
Sorry, I don't quite understand what you mean:

C:\temp>type c:\my.cnf
[mysqld]
basedir=c:/mysql
datadir=c:/mysql/data
default_character_set=utf8

C:\temp>bug12120
Server version: 4.1.12a-nt
[character_set_client] [cp1251]
[character_set_connection] [cp1251]
[character_set_database] [utf8]
[character_set_results] [cp1251]
[character_set_server] [utf8]
[character_set_system] [utf8]
[character_sets_dir] [c:\mysql\share\charsets/]

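#include <stdlib.h>
#include <stdio.h>
#include <my_global.h>
#include "mysql.h"

int main()
{
  MYSQL *mysql;
  MYSQL_RES *res;
  MYSQL_ROW row;
  unsigned int num_fields, i;

  mysql = mysql_init(NULL);
  if (!mysql)
  {
    fprintf(stderr, "mysql_init failed\n");
    return 1;
  }

  /* Request cp1251 explicitly as the connection charset */
  mysql_options(mysql, MYSQL_SET_CHARSET_NAME, "cp1251");
  if (!mysql_real_connect(mysql, "localhost", "root", "", "test", 0, NULL, 0))
  {
    fprintf(stderr, "Connection failed: %s\n", mysql_error(mysql));
    mysql_close(mysql);
    return 1;
  }
  printf("Server version: %s\n", mysql_get_server_info(mysql));

  /* Fetch the session-level charset variables for this connection */
  if (mysql_query(mysql, "show variables like 'character%'") ||
      !(res = mysql_store_result(mysql)))
  {
    fprintf(stderr, "Query failed: %s\n", mysql_error(mysql));
    mysql_close(mysql);
    return 1;
  }
  num_fields = mysql_num_fields(res);

  /* Print each row as "[name] [value]" */
  while ((row = mysql_fetch_row(res)))
  {
    unsigned long *lengths = mysql_fetch_lengths(res);
    for (i = 0; i < num_fields; i++)
    {
      /* A NULL column has length 0, so print the literal "NULL" instead */
      printf("[%.*s] ", (int) (row[i] ? lengths[i] : 4), row[i] ? row[i] : "NULL");
    }
    printf("\n");
  }
  mysql_free_result(res);
  mysql_close(mysql);
  return 0;
}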
[23 Jul 2005 5:33] Alexander Drozdov
I mean that there is no way to set the default charset on the _server_ side. What if you don't call mysql_options?

Many programs have no special setting for connection options. For PHP, mysqli_options is available only in PHP5 (according to the documentation). Therefore, those programs need to be modified if the built-in charset sent by the client is wrong.
[25 Jul 2005 21:58] Aleksey Kishkin
Sorry, but the server must not set the client's charset. For instance, in a Russian environment it is not rare for one server to have connections from Linux clients (with the koi8r charset) and from Windows clients (with cp1251).

About PHP: we do not maintain PHP. Please use php.net for any PHP-related questions.

Can you state your suggestion in more detail?
[26 Jul 2005 6:35] Alexander Drozdov
I see that the server must not set a client's charset. But it should be possible for the client not to set the default (built-in) charset if there is no mysql_options() call. You can see from my first message that the server can already determine whether a client has sent a charset or not.

In this case, if the server is used by clients with different charsets, then they will use mysql_options(), "set names" etc. But if the server is used by clients with only one charset, then there is no need to set a charset on the client (as in 4.0).
[2 Aug 2005 9:24] Aleksey Kishkin
So, your suggestion is that the client library should analyze the current locale and set the client charset accordingly (and must not always set latin1)? I agree, it could be convenient. If so, I'll set this bug report's severity to 'feature request', because the problem you noticed is expected behaviour.
[2 Aug 2005 14:24] Alexander Drozdov
My suggestion is for the _client_ not to set (and not to send) a client charset by default. But your solution is nice too (it is less convenient for me because I will need to modify the startup scripts of my programs that use libmysql, but it is nice because I won't need to change the programs themselves). You can set the 'feature request' severity.