Bug #64791 Mixed String/Int values in IN list cause full table scan
Submitted: 28 Mar 2012 20:17 Modified: 29 Mar 2012 7:15
Reporter: Gavin Towey Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1.61, 5.5 OS:Any
Assigned to: CPU Architecture:Any

[28 Mar 2012 20:17] Gavin Towey
Description:
Mixed types of values in an IN list of values cause indexes on the column to be ignored.  This can cause a severe performance impact.

SELECT * FROM table WHERE id IN (1,'2')

I'm guessing that this is because mysql decides to cast the column instead of the argument list. It seems like it should be simple to force mysql to cast the argument list to all be the same type, instead of casting the column.  It also seems difficult to imagine a situation where casting the table column is ever the desired result.

How to repeat:
# create test table, with a secondary index on b
create table fts ( id serial primary key, b int, index (b) );

# insert some random data
insert into fts (b) VALUES (RAND()*999);
insert into fts (b) VALUES (RAND()*999);
insert into fts (b) select RAND()*999 from fts;
insert into fts (b) select RAND()*999 from fts;
insert into fts (b) select RAND()*999 from fts;
insert into fts (b) select RAND()*999 from fts;
insert into fts (b) select RAND()*999 from fts;
insert into fts (b) select RAND()*999 from fts;

# IN list with all numbers works as expected
EXPLAIN SELECT * FROM fts WHERE id IN (1,2,3,4,5);
#+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
#| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
#+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
#|  1 | SIMPLE      | fts   | range | PRIMARY,id    | PRIMARY | 8       | NULL |    5 | Using where |
#+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
#1 row in set (0.00 sec)

# Mixed IN list: adding a string causes full table scan
EXPLAIN SELECT * FROM fts WHERE id IN (1,2,3,4,5,'6');
#+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
#| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
#+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
#|  1 | SIMPLE      | fts   | index | PRIMARY,id    | b    | 5       | NULL |  128 | Using where; Using index |
#+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
#1 row in set (0.00 sec)

# All strings are handled correctly!
 EXPLAIN SELECT * FROM fts WHERE id IN ('1','2','3','4','5');
#+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
#| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
#+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
#|  1 | SIMPLE      | fts   | range | PRIMARY,id    | PRIMARY | 8       | NULL |    5 | Using where |
#+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
#1 row in set (0.00 sec)

# adding one number sends it back to full table scan
EXPLAIN SELECT * FROM fts WHERE id IN ('1','2','3','4','5',6);
#+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
#| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
#+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
#|  1 | SIMPLE      | fts   | index | PRIMARY,id    | b    | 5       | NULL |  128 | Using where; Using index |
#+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
#1 row in set (0.00 sec)

Suggested fix:
I think as a general fix, the optimizer should always cast the arguments list of the IN condition to be the same type as the left hand side of the IN.
[29 Mar 2012 7:15] Valeriy Kravchuk
Thank you for the bug report. Verified just as described with older 5.1.61 also.

This is even more strange:

mysql> EXPLAIN EXTENDED SELECT * FROM fts WHERE id IN (1,2,3,'4',5);
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | fts   | ALL  | PRIMARY,id    | NULL | NULL    | NULL |  128 |   100.00 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.04 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`fts`.`id` AS `id`,`test`.`fts`.`b` AS `b` from `test`.`fts` where (`test`.`fts`.`id` in (1,2,3,'4',5))
1 row in set (0.01 sec)

mysql> EXPLAIN EXTENDED SELECT * FROM fts WHERE id IN ('1','2','3','4','5');
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | fts   | range | PRIMARY,id    | PRIMARY | 8       | NULL |    5 |   100.00 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.10 sec)

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`fts`.`id` AS `id`,`test`.`fts`.`b` AS `b` from `test`.`fts` where (`test`.`fts`.`id` in (1,2,3,4,5))
1 row in set (0.00 sec)

With all strings in the IN list they are all converted to numbers and index is used as expected. But one string among numbers confuses optimizer...
[30 Oct 2013 13:26] Vlad Fratila
This is still happening on 5.6.13. Any resolution or updates of any kind?
[9 Jul 2015 8:35] Damian Zajkowski
Still exist in 5.6.21