| Bug #74083 | ENUM index scan is slower than INT index scan | ||
|---|---|---|---|
| Submitted: | 25 Sep 2014 18:11 | Modified: | 9 Oct 2014 12:52 |
| Reporter: | Vojtech Kurka | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 5.6.20, 5.6.22 | OS: | Linux (CentOS 6.5) |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | enum, index scan, INT, Optimizer, slow | ||
[25 Sep 2014 18:29]
Vojtech Kurka
I'm uploading the dump file to: /support/incoming/mysql-bug-data-74083_2.sql.gz It will take about 1 hour.
[9 Oct 2014 12:52]
MySQL Verification Team
Hello Vojtech Kurka, Thank you for the report and test case. Thanks, Umesh
[9 Oct 2014 12:53]
MySQL Verification Team
// 5.6.22 mysql> analyze table test_item; +----------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +----------------+---------+----------+----------+ | test.test_item | analyze | status | OK | +----------------+---------+----------+----------+ 1 row in set (0.13 sec) mysql> flush status; Query OK, 0 rows affected (0.03 sec) mysql> SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1; +----------+ | COUNT(*) | +----------+ | 5356572 | +----------+ 1 row in set (10.83 sec) mysql> show status like 'Handler_read%'; +-----------------------+---------+ | Variable_name | Value | +-----------------------+---------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_last | 0 | | Handler_read_next | 5356572 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+---------+ 7 rows in set (0.00 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> SELECT COUNT(*) FROM test_item WHERE `status` = 'active'; +----------+ | COUNT(*) | +----------+ | 5356572 | +----------+ 1 row in set (11.84 sec) mysql> show status like 'Handler_read%'; +-----------------------+---------+ | Variable_name | Value | +-----------------------+---------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_last | 0 | | Handler_read_next | 5356572 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+---------+ 7 rows in set (0.00 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test_item WHERE `statusInt` = 1; +----------+ | COUNT(*) | +----------+ | 5356572 | +----------+ 1 row in set (10.78 sec) mysql> show status like 'Handler_read%'; +-----------------------+---------+ | Variable_name | Value | +-----------------------+---------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_last | 0 | | Handler_read_next | 5356572 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+---------+ 7 rows in set (0.00 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> SELECT SQL_NO_CACHE COUNT(*) FROM test_item WHERE `status` = 'active'; +----------+ | COUNT(*) | +----------+ | 5356572 | +----------+ 1 row in set (11.83 sec) mysql> explain SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1; +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+ | 1 | SIMPLE | test_item | ref | statusInt | statusInt | 4 | const | 9010328 | Using index | +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+ 1 row in set (0.04 sec) mysql> explain SELECT COUNT(*) FROM test_item WHERE `status` = 'active'; +----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+ | 1 | SIMPLE | test_item | ref | status | status | 1 | const | 9010328 | Using where; Using index | +----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+ 1 row in set (0.00 sec)

Description: ENUM is internally stored as a 2-BYTE integer, so an index scan on that column should be faster than an index scan on INT (4-byte integer) column. It's not that case, ENUM is slower (CPU-bound test): mysql> SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1; +----------+ | COUNT(*) | +----------+ | 5356572 | +----------+ 1 row in set (1.13 sec) mysql> SELECT COUNT(*) FROM test_item WHERE `status` = 'active'; +----------+ | COUNT(*) | +----------+ | 5356572 | +----------+ 1 row in set (1.44 sec) The penalty is much worse when the table has more columns with more data. If I am right about the cause of the problem, the penalty will be huge for large tables, when the workload is I/O bound. How to repeat: CREATE TABLE `test_item` ( `itemId` int(10) unsigned NOT NULL, `status` enum('active','gone') NOT NULL, `statusInt` int(10) unsigned NOT NULL, `name1` varchar(255) NOT NULL, `name2` varchar(255) NOT NULL, PRIMARY KEY (`itemId`), KEY `status` (`status`), KEY `statusInt` (`statusInt`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; INSERT of 18 milliion rows, data distribution is like: mysql> SELECT `status`, `statusInt`, COUNT(*) -> FROM test_item -> GROUP BY `status`, `statusInt`; +--------+-----------+----------+ | status | statusInt | COUNT(*) | +--------+-----------+----------+ | active | 1 | 5356572 | | gone | 0 | 12784924 | +--------+-----------+----------+ 2 rows in set (12.27 sec) I can send you gzipped dump of that table, but it's about 660MB. Suggested fix: I assume that the problem is optimizer related, because the ENUM query is not just an index scan, the explain shows "Using where": mysql> EXPLAIN SELECT COUNT(*) FROM test_item WHERE `status` = 'active'; +----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+ | 1 | SIMPLE | test_item | ref | status | status | 1 | const | 8552376 | Using where; Using index | +----+-------------+-----------+------+---------------+--------+---------+-------+---------+--------------------------+ 1 row in set (0.03 sec) mysql> EXPLAIN SELECT COUNT(*) FROM test_item WHERE `statusInt` = 1; +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+ | 1 | SIMPLE | test_item | ref | statusInt | statusInt | 4 | const | 8552376 | Using index | +----+-------------+-----------+------+---------------+-----------+---------+-------+---------+-------------+ 1 row in set (0.00 sec)