Bug #55606 Charset on ENUMS influencing performance too much
Submitted: 28 Jul 2010 15:25 Modified: 9 May 2013 20:22
Reporter: Michael Riediger Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Charsets Severity:S5 (Performance)
Version:5.1.47, 5.1.50-bzr, 5.5.23, 5.6.11 OS:Linux
Assigned to: Assigned Account CPU Architecture:Any
Tags: charset, enum, utf8

[28 Jul 2010 15:25] Michael Riediger
Description:
Since ENUMS are just a mapping, I didn't think the actual strings or charset would make much difference in performance. However, I am finding significant difference in simple query execution depending on what charset is chosen.

Sample simple query: "SELECT COUNT(*) FROM test_table WHERE field='yes'"

Generally speaking, queries going against the enum with a utf8 encoding take 50% longer than those against latin (or ascii) encoding.

The trend for slower execution against the table with utf8 enum is the same regardless of engine, indexing, or connection method. 

Similarly, execution is faster when the numeric equivalent of the enum is used in the query.

For example: "SELECT COUNT(*) FROM test_table WHERE field=1"

How to repeat:
mysql> CREATE DATABASE mysqlslap;

mysql> CREATE TABLE IF NOT EXISTS mysqlslap.test_table_utf8 
	( id int unsigned NOT NULL AUTO_INCREMENT,  
	field_a enum('yes','no') CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT 'no',  
	PRIMARY KEY (`id`), 
	KEY `field_a` (`field_a`) ) 
	ENGINE=InnoDB;
	
mysql> INSERT INTO mysqlslap.test_table_utf8 SELECT NULL, ROUND(RAND())+1;

< 10 times to get a random starting sample >

mysql> INSERT INTO mysqlslap.test_table_utf8 SELECT NULL, 3-field_a FROM mysqlslap.test_table_utf8;

< 16 times to make the table bigger with an equal number of yes's and no's>

mysql> CREATE TABLE IF NOT EXISTS mysqlslap.test_table_latin 
	( id int unsigned NOT NULL AUTO_INCREMENT,  
	field_a enum('yes','no') CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL DEFAULT 'no',  
	PRIMARY KEY (`id`), 
	KEY `field_a` (`field_a`) ) 
	ENGINE=InnoDB;

mysql> INSERT INTO mysqlslap.test_table_latin SELECT * FROM mysqlslap.test_table_utf8;
	
< Then I run some simple counting benchmarks via mysqlslap >
	
[root@beta2 mysql]# mysqlslap -ua -pz -S /var/lib/mysql_pc/mysql.sock --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_utf8 WHERE field_a='yes'" -i200
        Average number of seconds to run all queries: 0.125 seconds

[root@beta2 mysql]# mysqlslap -ua -pz -S /var/lib/mysql_pc/mysql.sock --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_latin WHERE field_a='yes'" -i200
        Average number of seconds to run all queries: 0.083 seconds

[root@beta2 mysql]# mysqlslap -ua -pz -S /var/lib/mysql_pc/mysql.sock --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_utf8 WHERE field_a=1" -i200
        Average number of seconds to run all queries: 0.083 seconds

[root@beta2 mysql]# mysqlslap -ua -pz -S /var/lib/mysql_pc/mysql.sock --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_latin WHERE field_a=1" -i200
        Average number of seconds to run all queries: 0.082 seconds

Suggested fix:
The enum / mapping control appears to be done more often than needed (i.e. once).
[28 Jul 2010 15:32] Michael Riediger
I tried the latest version too...
[29 Jul 2010 13:03] Valeriy Kravchuk
Please, send the results of

show variables like 'char%';

from your environment. I feel this may have something to do with character set conversion that happens with that 1 or "yes" constants.
[29 Jul 2010 13:15] Michael Riediger
On instance A

mysql>  show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

mysql>  show global variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

On instance B

mysql> show global variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

Both instances have the same (negative) behavior.

Changing the session character sets has no change in behavior that I've noticed.
[29 Jul 2010 13:21] Valeriy Kravchuk
Please, send the results of:

explain extended SELECT COUNT(*) FROM test_table WHERE field=1;
show warnings\G

explain extended SELECT COUNT(*) FROM test_table WHERE field='yes';
show warnings\G

from both instances for latin1 and utf8 tables.
[29 Jul 2010 13:26] Michael Riediger
mysql> EXPLAIN EXTENDED SELECT COUNT(*) FROM test_table_utf8 WHERE field_a='yes';
+----+-------------+-----------------+------+---------------+---------+---------+-------+--------+----------+--------------------------+
| id | select_type | table           | type | possible_keys | key     | key_len | ref   | rows   | filtered | Extra                    |
+----+-------------+-----------------+------+---------------+---------+---------+-------+--------+----------+--------------------------+
|  1 | SIMPLE      | test_table_utf8 | ref  | field_a       | field_a | 1       | const | 327892 |   100.00 | Using where; Using index |
+----+-------------+-----------------+------+---------------+---------+---------+-------+--------+----------+--------------------------+
1 row in set, 1 warning (0.03 sec)

mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                  |
+-------+------+--------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select count(0) AS `COUNT(*)` from `mysqlslap`.`test_table_utf8` where (`mysqlslap`.`test_table_utf8`.`field_a` = 'yes') |
+-------+------+--------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN EXTENDED SELECT COUNT(*) FROM test_table_latin WHERE field_a='yes';
+----+-------------+------------------+------+---------------+---------+---------+-------+--------+----------+--------------------------+
| id | select_type | table            | type | possible_keys | key     | key_len | ref   | rows   | filtered | Extra                    |
+----+-------------+------------------+------+---------------+---------+---------+-------+--------+----------+--------------------------+
|  1 | SIMPLE      | test_table_latin | ref  | field_a       | field_a | 1       | const | 331524 |   100.00 | Using where; Using index |
+----+-------------+------------------+------+---------------+---------+---------+-------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> show warnings;
+-------+------+----------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                    |
+-------+------+----------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select count(0) AS `COUNT(*)` from `mysqlslap`.`test_table_latin` where (`mysqlslap`.`test_table_latin`.`field_a` = 'yes') |
+-------+------+----------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
[29 Jul 2010 13:49] Valeriy Kravchuk
Verified just as described also with recent 5.1.50 from bzr on Mac OS X:

macbook-pro:5.1 openxs$ bin/mysqlslap -uroot --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_utf8 WHERE field_a='yes'" -i200Benchmark
	Average number of seconds to run all queries: 0.779 seconds
	Minimum number of seconds to run all queries: 0.688 seconds
	Maximum number of seconds to run all queries: 0.831 seconds
	Number of clients running queries: 1
	Average number of queries per client: 1

macbook-pro:5.1 openxs$ bin/mysqlslap -uroot --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_latin WHERE field_a='yes'" -i200
Benchmark
	Average number of seconds to run all queries: 0.636 seconds
	Minimum number of seconds to run all queries: 0.567 seconds
	Maximum number of seconds to run all queries: 0.720 seconds
	Number of clients running queries: 1
	Average number of queries per client: 1

So, with latin1 it is notably faster. I do not see any obvious explanation for this.
[29 Jun 2012 21:06] Michael Riediger
Slow down of about 50% still observed in 5.5.23.
[9 May 2013 20:22] Michael Riediger
Slow down of ~50% evident in 5.6.11 also.
[30 Aug 2015 17:21] Daniƫl van Eeden
With 5.7.8 on Linux I don't see a notable difference.
Note that I did an "OPTIMIZE TABLE <table>" to recreate both tables to negate any influence from about how the table was created.

[dvaneeden@dve-mac msb_5_7_8]$ ./my sqlslap --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_utf8 WHERE field_a='yes'" -i200
Benchmark
	Average number of seconds to run all queries: 0.056 seconds
	Minimum number of seconds to run all queries: 0.054 seconds
	Maximum number of seconds to run all queries: 0.066 seconds
	Number of clients running queries: 1
	Average number of queries per client: 1

[dvaneeden@dve-mac msb_5_7_8]$ ./my sqlslap --query="SELECT SQL_NO_CACHE COUNT(*) AS count FROM mysqlslap.test_table_latin WHERE field_a='yes'" -i200
Benchmark
	Average number of seconds to run all queries: 0.055 seconds
	Minimum number of seconds to run all queries: 0.054 seconds
	Maximum number of seconds to run all queries: 0.081 seconds
	Number of clients running queries: 1
	Average number of queries per client: 1