Bug #54179 Index is not used for range scan on SET field
Submitted: 2 Jun 2010 11:37
Reporter: Rene' Cannao' Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.37 , 5.1.47 , 5.5.4 OS:Any
Assigned to: CPU Architecture:Any
Triage: Triaged: D3 (Medium)

[2 Jun 2010 11:37] Rene' Cannao'
Description:
MySQL stores values in SET fields as numeric values.
Although, for indexed SET fields the optimizer is not able to perform a range scan.

How to repeat:
Simple test case:

CREATE TABLE tbl_set (my_set SET('a','b','c','d','e'), INDEX (my_set));
INSERT INTO tbl_set VALUES ('a,b'),('a,c'),('a,b'),('a,c'),('a,b'),('a,c'),('a,c'),('a,e');
INSERT INTO tbl_set VALUES ('a,b,c'),('a,b,c'),('a,b,c'),('a,b,e'),('a,c,e'),('b,c,e');
INSERT INTO tbl_set VALUES ('a,b,c,d');
INSERT INTO tbl_set VALUES ('d,e');
INSERT INTO tbl_set SELECT * FROM tbl_set;
INSERT INTO tbl_set SELECT * FROM tbl_set;
INSERT INTO tbl_set SELECT * FROM tbl_set;

mysql> SELECT COUNT(*) FROM tbl_set WHERE my_set>=8;
+----------+
| COUNT(*) |
+----------+
|       48 |
+----------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT COUNT(*) FROM tbl_set WHERE my_set>=8\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_set
         type: index
possible_keys: my_set
          key: my_set
      key_len: 2
          ref: NULL
         rows: 128
        Extra: Using where; Using index
1 row in set (0.00 sec)

The same applies for numeric context:
mysql> EXPLAIN SELECT COUNT(*) FROM tbl_set WHERE my_set+0>=8\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl_set
         type: index
possible_keys: NULL
          key: my_set
      key_len: 2
          ref: NULL
         rows: 128
        Extra: Using where; Using index
1 row in set (0.00 sec)

The optimizer uses the index only as covering index, not to perform a range scan.

Suggested fix:
The optimizer should treat the SET value numerically, and therefore perform a range scan.
[2 Jun 2010 11:40] Rene' Cannao'
If the optimizer performs a range scan will be possible to replace queries like:
SELECT * FROM tbl_set WHERE FIND_IN_SET('d',my_set) > 0;
In queries like:
SELECT * FROM tbl_set WHERE my_set>=8 AND my_set%16>=8;