Bug #74083 ENUM index scan is slower than INT index scan
Submitted: 25 Sep 2014 18:11 Modified: 9 Oct 2014 12:52
Reporter: Vojtech Kurka Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.6.20, 5.6.22 OS:Linux (CentOS 6.5)
Assigned to: CPU Architecture:Any
Tags: enum, index scan, INT, Optimizer, slow

[25 Sep 2014 18:11] Vojtech Kurka
Description:
ENUM is internally stored as a 2-BYTE integer, so an index scan on that column should be faster than an index scan on INT (4-byte integer) column. 

It's not that case, ENUM is slower (CPU-bound test):

mysql> SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1;
+----------+
| COUNT(*) |
+----------+
|  5356572 |
+----------+
1 row in set (1.13 sec)

mysql> SELECT COUNT(*) FROM test_item WHERE `status` = 'active';
+----------+
| COUNT(*) |
+----------+
|  5356572 |
+----------+
1 row in set (1.44 sec)

The penalty is much worse when the table has more columns with more data. If I am right about the cause of the problem, the penalty will be huge for large tables, when the workload is I/O bound.

How to repeat:
CREATE TABLE `test_item` (
  `itemId` int(10) unsigned NOT NULL,
  `status` enum('active','gone') NOT NULL,
  `statusInt` int(10) unsigned NOT NULL,
  `name1` varchar(255) NOT NULL,
  `name2` varchar(255) NOT NULL,
  PRIMARY KEY (`itemId`),
  KEY `status` (`status`),
  KEY `statusInt` (`statusInt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT of 18 milliion rows, data distribution is like:

mysql> SELECT `status`, `statusInt`, COUNT(*)
    -> FROM test_item
    -> GROUP BY `status`, `statusInt`;
+--------+-----------+----------+
| status | statusInt | COUNT(*) |
+--------+-----------+----------+
| active |         1 |  5356572 |
| gone   |         0 | 12784924 |
+--------+-----------+----------+
2 rows in set (12.27 sec)

I can send you gzipped dump of that table, but it's about 660MB.

Suggested fix:
I assume that the problem is optimizer related, because the ENUM query is not just an index scan, the explain shows "Using where":

mysql> EXPLAIN SELECT COUNT(*) FROM test_item WHERE `status` = 'active';
+----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+
| id | select_type | table     | type | possible_keys | key    | key_len | ref   | rows    | Extra                    |
+----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+
|  1 | SIMPLE      | test_item | ref  | status        | status | 1       | const | 8552376 | Using where; Using index |
+----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+
1 row in set (0.03 sec)

mysql> EXPLAIN SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1;
+----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
| id | select_type | table     | type | possible_keys | key       | key_len | ref   | rows    | Extra       |
+----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
|  1 | SIMPLE      | test_item | ref  | statusInt     | statusInt | 4       | const | 8552376 | Using index |
+----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
1 row in set (0.00 sec)
[25 Sep 2014 18:29] Vojtech Kurka
I'm uploading the dump file to: /support/incoming/mysql-bug-data-74083_2.sql.gz
It will take about 1 hour.
[9 Oct 2014 12:52] MySQL Verification Team
Hello Vojtech Kurka,

Thank you for the report and test case.

Thanks,
Umesh
[9 Oct 2014 12:53] MySQL Verification Team
// 5.6.22

mysql> analyze table test_item;
+----------------+---------+----------+----------+
| Table          | Op      | Msg_type | Msg_text |
+----------------+---------+----------+----------+
| test.test_item | analyze | status   | OK       |
+----------------+---------+----------+----------+
1 row in set (0.13 sec)

mysql> flush status;
Query OK, 0 rows affected (0.03 sec)

mysql> SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1;
+----------+
| COUNT(*) |
+----------+
|  5356572 |
+----------+
1 row in set (10.83 sec)

mysql> show status like 'Handler_read%';
+-----------------------+---------+
| Variable_name         | Value   |
+-----------------------+---------+
| Handler_read_first    | 0       |
| Handler_read_key      | 1       |
| Handler_read_last     | 0       |
| Handler_read_next     | 5356572 |
| Handler_read_prev     | 0       |
| Handler_read_rnd      | 0       |
| Handler_read_rnd_next | 0       |
+-----------------------+---------+
7 rows in set (0.00 sec)

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT COUNT(*) FROM test_item WHERE `status` = 'active';
+----------+
| COUNT(*) |
+----------+
|  5356572 |
+----------+
1 row in set (11.84 sec)

mysql> show status like 'Handler_read%';
+-----------------------+---------+
| Variable_name         | Value   |
+-----------------------+---------+
| Handler_read_first    | 0       |
| Handler_read_key      | 1       |
| Handler_read_last     | 0       |
| Handler_read_next     | 5356572 |
| Handler_read_prev     | 0       |
| Handler_read_rnd      | 0       |
| Handler_read_rnd_next | 0       |
+-----------------------+---------+
7 rows in set (0.00 sec)

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test_item WHERE `statusInt` = 1;
+----------+
| COUNT(*) |
+----------+
|  5356572 |
+----------+
1 row in set (10.78 sec)

mysql> show status like 'Handler_read%';
+-----------------------+---------+
| Variable_name         | Value   |
+-----------------------+---------+
| Handler_read_first    | 0       |
| Handler_read_key      | 1       |
| Handler_read_last     | 0       |
| Handler_read_next     | 5356572 |
| Handler_read_prev     | 0       |
| Handler_read_rnd      | 0       |
| Handler_read_rnd_next | 0       |
+-----------------------+---------+
7 rows in set (0.00 sec)

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test_item WHERE `status` = 'active';
+----------+
| COUNT(*) |
+----------+
|  5356572 |
+----------+
1 row in set (11.83 sec)

mysql> explain SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1;
+----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
| id | select_type | table     | type | possible_keys | key       | key_len | ref   | rows    | Extra       |
+----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
|  1 | SIMPLE      | test_item | ref  | statusInt     | statusInt | 4       | const | 9010328 | Using index |
+----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+
1 row in set (0.04 sec)

mysql> explain SELECT COUNT(*) FROM test_item WHERE `status` = 'active';
+----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+
| id | select_type | table     | type | possible_keys | key    | key_len | ref   | rows    | Extra                    |
+----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+
|  1 | SIMPLE      | test_item | ref  | status        | status | 1       | const | 9010328 | Using where; Using index |
+----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+
1 row in set (0.00 sec)