Bug #87717 character_set_server is not dynamic and requires restart server
Submitted: 9 Sep 2017 10:08 Modified: 22 Sep 2017 5:24
Reporter: Ke Lu Email Updates:
Status: Not a Bug Impact on me:
None 
Category:Connector / J Severity:S3 (Non-critical)
Version:5.1.35 ~ 5.1.44 OS:Linux
Assigned to: Alexander Soklakov CPU Architecture:Any
Tags: charset

[9 Sep 2017 10:08] Ke Lu
Description:
character_set_server  and collation_server are documented as dynamic variables, but in my case, it's not actually, at least for Connector/J.

We need to support utf8mb4 characters in our app, when i change the two variables, and the session  character_set_server  is still utf8 and cause  illegal mix of collations errors.

How to repeat:
I tried Connector/J version 5.1.35, 5.1.39, 5.1.42, 5.1.34, 5.1.44.
MySQL server version are 5.6.23, 5.7.15.

First start MySQL server with these configurations:
[mysqld]
character_set_server=utf8
collation-server=utf8_general_ci

Use Connector/J do a very simple operation :
select @@ character_set_connection;

This will return utf8 correctly.

Then set the above variables:
mysql> set global character_set_server=utf8mb4;
mysql>set global collation-server=utf8mb4_unicode_ci;
mysql>set global character_set_client=utf8mb4;
mysql>set global character_set_connection=utf8mb4;
mysql>set global character_set_results=utf8mb4;

after this,run select @@character_set_connection again through Connector/J,
the result still be utf8 but not utf8mb4. On the same time if i use the native mysql client issued together with the server, it does be changed to utf8mb4.

restart MySQL server with the following configuration:
[mysqld]
character_set_server=utf8mb4
collation-server=utf8mb4_unicode_ci

Now it's utf8mb4.

Suggested fix:
I look the code in ConnectionImpl.java, it use serverCharsetIndex to determine
whether to run SET NAMES utf8mb4 and it seems it read some value not changeable once the server started. I have not deeply investigate where is value from, but i think it should read some changeable variables through the server representing the change of server charset. 

boolean utf8mb4Supported = versionMeetsMinimum(5, 5, 2);
                            boolean useutf8mb4 = utf8mb4Supported && (CharsetMapping.UTF8MB4_INDEXES.contains(this.io.serverCharsetIndex));

                            if (!getUseOldUTF8Behavior()) {
                                if (dontCheckServerMatch || !characterSetNamesMatches("utf8") || (utf8mb4Supported && !characterSetNamesMatches("utf8mb4"))) {
                                    execSQL(null, "SET NAMES " + (useutf8mb4 ? "utf8mb4" : "utf8"), -1, null, DEFAULT_RESULT_SET_TYPE,
                                            DEFAULT_RESULT_SET_CONCURRENCY, false, this.database, null, false);
                                }
                            }
[15 Sep 2017 13:33] Chiranjeevi Battula
Hello  Luke,

Thank you for the bug report.
Can you please clarify more on what you are trying?

1) Connect to server with Connector/J, change values on server and try the connection charset on the same connection.
or
2) Connect to server with Connector/J, change values on server, reconnect and try the connection charset on the new connection.

Connector/J can handle mixed charsets in results, which means you can create some tables or columns in utf8mb4.

Thanks,
Chiranjeevi.
[16 Sep 2017 5:17] Ke Lu
Hi,
I mean the second one:
Connect to server with Connector/J, change values on server, reconnect and try the connection charset on the new connection.

dynamically setting the character_set_server=utf8mb4 and collation-server=utf8mb4_unicode_ci will not work for the new connection, the charset for the new connection is still utf8.

We can insert utf8mb4 chars into table , but when you compare a utf8mb4 column to some string it will cause illegal mix of collations errors.

e.g.

select * from some_table where utf8mb4_column = 'aaaa';

because collate for utf8mb4 column using utf8mb4_unicode_ci and 'aaaa' use 
utf8_general_ci as the connect charset is still utf8.
[18 Sep 2017 6:50] Chiranjeevi Battula
Hello Luke,

Thank you for your feedback.
Verified based on internal discussion with dev's.

Thanks,
Chiranjeevi.
[19 Sep 2017 13:00] Alexander Soklakov
Hi Luke,

This issue is caused by server. Though you change the character_set_* variables globally the server still sends an original charset and collation number in a protocol greeting, thus c/J does a mistake about actual server state. I'm not sure if it's an intentional behaviour, looks like a server bug for me.

The difference between MySQL client and c/J is that c/J does additional charset configuration work at the beginnig to reduce the number of intermediate data conversion on results retrieving.

And c/J can handle your case, you just need to use connectionCollation=utf8mb4_unicode_ci connection option on secondary connections. Please, look at example below.

    public void testBug87717() throws Exception {

        Properties p = new Properties();
        p.setProperty("useSSL", "false"); // to check on-wire packets in clear text

        System.out.println("\nOriginal server state");
        System.out.println("=====================");

        Statement s1 = getConnectionWithProps(p).createStatement();
        ResultSet rs1 = s1.executeQuery("show variables like '%chara%'");
        while (rs1.next()) {
            System.out.println(rs1.getString(1) + " = " + rs1.getString(2));
        }
        rs1 = s1.executeQuery("show variables like '%colla%'");
        while (rs1.next()) {
            System.out.println(rs1.getString(1) + " = " + rs1.getString(2));
        }

        s1.execute("set global character_set_server=utf8mb4");
        s1.execute("set global collation_server=utf8mb4_unicode_ci");
        s1.execute("set global character_set_client=utf8mb4");
        s1.execute("set global character_set_connection=utf8mb4");
        s1.execute("set global character_set_results=utf8mb4");

        System.out.println("\nModified server state");
        System.out.println("=====================");

        // do not use this property to avoid unnecessary SET NAMES query 
        //p.setProperty("characterEncoding", "UTF-8");

        p.setProperty("connectionCollation", "utf8mb4_unicode_ci");

        Connection c2 = getConnectionWithProps(p);
        ResultSet rs2 = c2.createStatement().executeQuery("show variables like '%chara%'");
        while (rs2.next()) {
            System.out.println(rs2.getString(1) + " = " + rs2.getString(2));
        }
        rs2 = c2.createStatement().executeQuery("show variables like '%colla%'");
        while (rs2.next()) {
            System.out.println(rs2.getString(1) + " = " + rs2.getString(2));
        }

        // restoring original state
        s1.execute("set global character_set_server=utf8");
        s1.execute("set global collation_server=utf8_general_ci");
        s1.execute("set global character_set_client=utf8");
        s1.execute("set global character_set_connection=utf8");
        s1.execute("set global character_set_results=utf8");
    }

And the output is:

Original server state
=====================
character_set_client = utf8
character_set_connection = utf8
character_set_database = latin1
character_set_filesystem = binary
character_set_results = 
character_set_server = utf8
character_set_system = utf8
character_sets_dir = /opt/mysql-5.7.17-linux-glibc2.5-x86_64/share/charsets/
collation_connection = utf8_general_ci
collation_database = latin1_swedish_ci
collation_server = utf8_general_ci

Modified server state
=====================
character_set_client = utf8
character_set_connection = utf8mb4
character_set_database = latin1
character_set_filesystem = binary
character_set_results = 
character_set_server = utf8mb4
character_set_system = utf8
character_sets_dir = /opt/mysql-5.7.17-linux-glibc2.5-x86_64/share/charsets/
collation_connection = utf8mb4_unicode_ci
collation_database = latin1_swedish_ci
collation_server = utf8mb4_unicode_ci
[22 Sep 2017 3:11] Ke Lu
Yes,set connectionCollation to utf8mb4_unicode_ci does solve this problem although i have already restarted my production server with new configuration. Thanks!
[22 Sep 2017 5:24] Alexander Soklakov
Luke,

Thanks for the confirmation!