Bug #72882 Metadata character set causes unnecessary cost for cpu-bound scene
Submitted: 5 Jun 2014 2:24 Modified: 25 Jul 2014 19:51
Reporter: Hao Liu
Status: Verified
Category: MySQL Server: Charsets Severity: S5 (Performance)
Version: 5.5, 5.6 OS: Linux
Assigned to: CPU Architecture: Any

[5 Jun 2014 2:24] Hao Liu
Description:
I have hit a scenario where the character_set_system value causes a problem.

Our server's character set is gbk and our client's character set is also gbk, as below:

root@(none) 10:16:42>show variables like '%character%';
+--------------------------+-----------------------------+
| Variable_name            | Value                       |
+--------------------------+-----------------------------+
| character_set_client     | gbk                         |
| character_set_connection | gbk                         |
| character_set_database   | gbk                         |
| character_set_filesystem | binary                      |
| character_set_results    | gbk                         |
| character_set_server     | gbk                         |
| character_set_system     | utf8                        |
| character_sets_dir       | /u01/my3706/share/charsets/ |
+--------------------------+-----------------------------+

However, the metadata character set (character_set_system) is utf8 and cannot be changed. Our benchmark is a read-only, CPU-bound test, and the conversion function (copy_and_convert in 5.5, my_convert in 5.6) costs a little over 2% of CPU. This lowers QPS by about 5%.
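
As a rough illustration of where this cost comes from, here is a minimal Python sketch, assuming that metadata strings such as column names are held in character_set_system and converted to character_set_results each time they are sent to the client. The helper name send_metadata_name and the sample column names are hypothetical; this is not MySQL source code.

```python
# Illustrative sketch only; this is not MySQL source code.
SYSTEM_CHARSET = "utf-8"   # stands in for character_set_system
RESULTS_CHARSET = "gbk"    # stands in for character_set_results


def send_metadata_name(name: bytes, from_cs: str = SYSTEM_CHARSET,
                       to_cs: str = RESULTS_CHARSET) -> bytes:
    """Hypothetical helper: convert one metadata string for the client."""
    if from_cs == to_cs:
        # Same character set: the bytes can be passed through unchanged.
        return name
    # Different character sets: every character is decoded and re-encoded.
    return name.decode(from_cs).encode(to_cs)


# Hypothetical column names, stored as utf8 on the server side.
column_names = ["id", "用户"]
converted = [send_metadata_name(n.encode(SYSTEM_CHARSET)) for n in column_names]
```

With gbk on both sides of the connection, only the metadata takes the conversion branch, and it does so for every result set. If character_set_system could also be set to gbk, the sketch would take the pass-through branch instead.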

The perf top result for 5.5:

            10619.00  4.9% _Z10MYSQLparsePv                                         /u01/mysql/bin/mysqld
             4969.00  2.3% _spin_lock_bh                                            [kernel.kallsyms]
             4658.00  2.1% _Z16copy_and_convertPcjP15charset_info_stPKcjS1_Pj       /u01/mysql/bin/mysqld
             4462.00  2.1% memcpy                                                   /lib64/libc-2.12.so

The perf top result for 5.6:

             9963.00  5.1% _Z10MYSQLparsePv                                                                                  mysqld
             4694.00  2.4% memcpy                                                                                            libc-2.12.so
             4375.00  2.2% my_convert                                                                                        mysqld
             4321.00  2.2% my_hash_sort_utf8                                                                                 mysqld

How to repeat:
Set the server and client character sets to gbk, run a read-only CPU-bound benchmark, and analyze it with perf top.

Suggested fix:
Make character_set_system changeable.
[25 Jul 2014 19:51] Sveta Smirnova
Thank you for the reasonable feature request.

Though I see a potential issue with such a solution for users who switch the server's default character set from one to another.