Bug #53647 Charset information lost on clusters automatic database propagation
Submitted: 14 May 2010 13:55 Modified: 14 May 2010 14:02
Reporter: Hartmut Holzgraefe Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:mysql-5.1-telco-7.0 OS:Linux
Assigned to: CPU Architecture:Any
Tags: all?, mysql-cluster-7.0.14

[14 May 2010 13:55] Hartmut Holzgraefe
Description:
When changing the default character set and/or collation settings using the charater_set_server and collation_server variables these settings are used as the default character set and collation for newly created databases.

On all other mysqld servers in the cluster the database is created with the compiled in defaults (usually latin1/latin1_swedish_ci) no matter how the 
charater_set_server and collation_server variables are set.

How to repeat:
- Set up a test cluster with two mysqld nodes

- connect to both mysqld nodes and issue 

    set global character_set_server='utf8';
    set global collation_server='utf8_bin';

  then disconnect again.

- connect to mysqld #1 and create a database using default settings

    CREATE DATABASE d1;

- check the database charset / collation using

    SELECT schema_name, default_character_set_name, default_collation_name 
      FROM information_schema.schemata
     WHERE schema_name = 'd1';

  +-------------+----------------------------+------------------------+
  | schema_name | default_character_set_name | default_collation_name |
  +-------------+----------------------------+------------------------+
  | d1          | utf8                       | utf8_bin               |
  +-------------+----------------------------+------------------------+

- now check the result on the 2nd mysqld

  +-------------+----------------------------+------------------------+
  | schema_name | default_character_set_name | default_collation_name |
  +-------------+----------------------------+------------------------+
  | d1          | latin1                     | latin1_swedish_ci      |
  +-------------+----------------------------+------------------------+

- verify that the 2nd mysqld does indeed have utf8 as default charset

   CREATE DATABASE d2;

   SELECT schema_name, default_character_set_name, default_collation_name        
     FROM information_schema.schemata      
    WHERE schema_name like  'd_';

  +-------------+----------------------------+------------------------+
  | schema_name | default_character_set_name | default_collation_name |
  +-------------+----------------------------+------------------------+
  | d1          | latin1                     | latin1_swedish_ci      |
  | d2          | utf8                       | utf8_bin               |
  +-------------+----------------------------+------------------------+

- now change back to mysqld #1 and perform the same I_S query

  +-------------+----------------------------+------------------------+
  | schema_name | default_character_set_name | default_collation_name |
  +-------------+----------------------------+------------------------+
  | d1          | utf8                       | utf8_bin               |
  | d2          | latin1                     | latin1_swedish_ci      |
  +-------------+----------------------------+------------------------+

When explicitly giving a default charset/collation in the CREATE statement things are fine though, only with implicit settings the result differs between mysqlds in the cluster.

Suggested fix:
Honor charset and collation settings even when they are not given explicitly
[27 May 2010 14:20] Hartmut Holzgraefe
This doesn't affect cluster tables as they are always replicated with the right charset/collation settings, but Events, Procedures, maybe Triggers, that need to be created on each mysqld node separately.

One may end up with "Illegal mix of collations" errors about latin1-swedish-ci from within Events or Procedures that way even though all charset related settings have been changed so that the compiled in default latin1-swedish-ci should actually never be seen ...