Bug #57216 | InnoDB column prefix index not treated as a covering one | | |
---|---|---|---|
Submitted: | 4 Oct 2010 12:59 | Modified: | 8 Oct 2010 20:08 |
Reporter: | Marko Mäkelä | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.1, 5.6.99 bzr | OS: | Any |
Assigned to: | | CPU Architecture: | Any
Tags: | column prefix, innodb | |
[4 Oct 2010 12:59]
Marko Mäkelä
[8 Oct 2010 20:08]
Sveta Smirnova
Problem exists in next-mr. I set breakpoints in mysql_parse and row_search_for_mysql, then checked prebuilt->need_to_access_clustered. It was always 1.
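For reference, a minimal sketch of the kind of query this affects; the table, data, and expected plan below are invented for illustration (and set aside the collation caveats discussed later in this report), not taken from the original test case:

```sql
-- Hypothetical repro sketch: a column-prefix secondary index that could,
-- in principle, answer the query on its own.
CREATE TABLE t (
  id INT NOT NULL PRIMARY KEY,
  b  VARCHAR(255) NOT NULL,
  KEY b_prefix (b(4))
) ENGINE=InnoDB;

INSERT INTO t VALUES (1, 'abcdef'), (2, 'abcxyz'), (3, 'zzz');

-- The b_prefix entries hold the first 4 characters of b plus the primary key,
-- so both the WHERE clause and the select list are available in the index:
EXPLAIN SELECT id FROM t WHERE b LIKE 'abc%';

-- With this bug, the plan shows only "Using where": prebuilt->need_to_access_clustered
-- stays 1 and every matching row is also read from the clustered index.
-- "Using index" in the Extra column is what a covering-index plan would show.
```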
[14 Dec 2010 16:50]
Trey Raymond
Similar situation here.

```sql
CREATE TABLE `users` (
  `user_id` bigint(20) NOT NULL DEFAULT '0',
  `gender` binary(1) NOT NULL DEFAULT 'U',
  `age` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `msa_id` smallint(5) unsigned NOT NULL DEFAULT '0',
  `region_id` smallint(5) unsigned NOT NULL DEFAULT '0',
  `last_seen` date NOT NULL,
  `segments` varbinary(1000) DEFAULT NULL,
  PRIMARY KEY (`user_id`),
  KEY `last_seen` (`last_seen`),
  KEY `segments` (`segments`(1))
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
```

I can't get it to use only the index with either of these predicates on the `segments` field; I tried both a comparison against the empty string and a NULL check:

```
mysql> explain select users.user_id FROM dare.users FORCE INDEX (`segments`) WHERE users.segments !='';
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
|  1 | SIMPLE      | users | range | segments      | segments | 4       | NULL |    1 | Using where |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select users.user_id FROM dare.users FORCE INDEX (`segments`) WHERE users.segments is not null;
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key      | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
|  1 | SIMPLE      | users | range | segments      | segments | 4       | NULL |    1 | Using where |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
1 row in set (0.00 sec)
```

This table has about 350 million rows, and the per-row data page lookup makes such a query take hours. Without the FORCE INDEX it does a full table scan, which is awful since at any given time only about 5% of the rows have data there. This should be a very fast query, just pulling 10-15 million records directly from the index.
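For comparison, a hedged workaround sketch (the has_segments column and index below are invented, not part of Trey's schema): a full-column index, unlike a prefix index, can be treated as covering, so an indexed indicator column maintained by the application could answer the NULL-ness test without touching the data pages.

```sql
-- Hypothetical workaround sketch; column and index names are invented.
ALTER TABLE users
  ADD COLUMN has_segments TINYINT NOT NULL DEFAULT 0,
  ADD KEY has_segments (has_segments);

-- The application would set has_segments = 1 whenever segments is non-empty.
-- Because has_segments is indexed in full and InnoDB secondary indexes carry
-- the primary key (user_id), this query can be answered from the index alone:
EXPLAIN SELECT user_id FROM users WHERE has_segments = 1;
-- A covering plan would show "Using index" in Extra, unlike the segments(1)
-- prefix index, which always forces a clustered-index lookup under this bug.
```

Whether such a denormalized flag is worth maintaining (the ALTER alone is expensive on a 350-million-row table) is a separate question; the point is only the difference between a full-column index and a prefix index in the optimizer's eyes.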
[26 Jan 2011 17:14]
Rick James
I would vote for having code to handle Trey's test for NULL. As a side issue, I wonder whether, when doing the table scan, InnoDB might unnecessarily be fetching the entire field (past 767 bytes, into the extension) when checking for NULL. But I would warn you that implementing Marko's case (such as LEFT() and LIKE) can lead to collation errors: in utf8_unicode_ci, a multi-character (not just multi-byte) sequence could be broken by the prefix, making it impossible to tell where the record really belongs. A few other non-"_bin" collations suffer from the same complication, so I suggest looking at the collation before making any attempt to use the prefix. At that point, you might as well disallow creating a prefix index on any field whose collation needs to compare multi-character (not just multi-byte) sequences to determine equality.
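To make the collation concern concrete, here is a small sketch of the expansion case (results assume a utf8 connection character set; this is an illustration, not output from the reporter's environment): under utf8_unicode_ci the single character 'ß' compares equal to the two-character sequence 'ss', so a fixed-length character prefix can split a unit that the full collation treats as indivisible.

```sql
SET NAMES utf8;
-- U+00DF (ß) expands to "ss" under utf8_unicode_ci, but not under utf8_bin:
SELECT 'ß' COLLATE utf8_unicode_ci = 'ss' AS unicode_ci_equal,  -- expected 1
       'ß' COLLATE utf8_bin        = 'ss' AS bin_equal;         -- expected 0
-- A one-character prefix compares 'ß' against 's' only, which is not enough
-- to reproduce the answer the full collation would give.
```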