Bug #95058 Index not used for column with IS TRUE or IS FALSE operators
Submitted: 18 Apr 2019 18:12 Modified: 25 Apr 2019 2:29
Reporter: monty solomon Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.7.21 OS:Any
Assigned to: CPU Architecture:Any

[18 Apr 2019 18:12] monty solomon
Description:
When a statement uses IS TRUE or IS FALSE instead of using =1 or =0 the indexed column is not used in the index and performance is degraded.

How to repeat:
CREATE TABLE istest (
  a int NOT NULL AUTO_INCREMENT,
  b int NOT NULL,
  c boolean NOT NULL,
  PRIMARY KEY (a),
  KEY (b, c)
) ENGINE=InnoDB;

INSERT INTO istest (b,c) VALUES (1,0), (1,1), (2,0), (2,1), (3,0), (3,1), (4,0), (4,1), (5,0), (5,1), (6,0), (6,1); 

Observe the change in the key_len in the EXPLAIN output when =1 is used instead of IS TRUE and the change in Extra from Using where; Using index to Using index.

EXPLAIN SELECT * FROM istest WHERE b=2 AND c IS TRUE\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: istest
   partitions: NULL
         type: ref
possible_keys: b
          key: b
      key_len: 4
          ref: const
         rows: 2
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

EXPLAIN SELECT * FROM istest WHERE b=2 AND c = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: istest
   partitions: NULL
         type: ref
possible_keys: b
          key: b
      key_len: 5
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.00 sec)

Observe the change in the key_len in the EXPLAIN output when =1 is used instead of IS FALSE and the change in Extra from Using where; Using index to Using index.

EXPLAIN SELECT * FROM istest WHERE b=2 AND c IS FALSE\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: istest
   partitions: NULL
         type: ref
possible_keys: b
          key: b
      key_len: 4
          ref: const
         rows: 2
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

EXPLAIN SELECT * FROM istest WHERE b=2 AND c = 0\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: istest
   partitions: NULL
         type: ref
possible_keys: b
          key: b
      key_len: 5
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.00 sec)

Suggested fix:
Update the optimizer to use indexed columns with the IS operator.
[18 Apr 2019 18:43] monty solomon
It does use the indexed column for IS UNKNOWN

ALTER TABLE istest MODIFY c boolean NULL;

INSERT INTO istest (b,c) VALUES (2,3), (2,4), (2, NULL);

EXPLAIN SELECT * FROM istest WHERE b=2 AND c IS UNKNOWN\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: istest
   partitions: NULL
         type: ref
possible_keys: b
          key: b
      key_len: 6
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

EXPLAIN FORMAT=JSON SELECT * FROM istest WHERE b=2 AND c IS UNKNOWN\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "1.20"
    },
    "table": {
      "table_name": "istest",
      "access_type": "ref",
      "possible_keys": [
        "b"
      ],
      "key": "b",
      "used_key_parts": [
        "b",
        "c"
      ],
      "key_length": "6",
      "ref": [
        "const",
        "const"
      ],
      "rows_examined_per_scan": 1,
      "rows_produced_per_join": 1,
      "filtered": "100.00",
      "using_index": true,
      "cost_info": {
        "read_cost": "1.00",
        "eval_cost": "0.20",
        "prefix_cost": "1.20",
        "data_read_per_join": "16"
      },
      "used_columns": [
        "a",
        "b",
        "c"
      ],
      "attached_condition": "isnull(`test`.`istest`.`c`)"
    }
  }
}
1 row in set, 1 warning (0.00 sec)
[18 Apr 2019 18:48] monty solomon
Using !=0 instead of IS TRUE uses the additional column from the index. Here is the EXPLAIN output.

EXPLAIN SELECT * FROM istest WHERE b=2 AND c != 0\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: istest
   partitions: NULL
         type: range
possible_keys: b
          key: b
      key_len: 6
          ref: NULL
         rows: 4
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

EXPLAIN SELECT * FROM istest WHERE b=2 AND c IS TRUE\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: istest
   partitions: NULL
         type: ref
possible_keys: b
          key: b
      key_len: 4
          ref: const
         rows: 5
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

EXPLAIN FORMAT=JSON SELECT * FROM istest WHERE b=2 AND c != 0\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "1.80"
    },
    "table": {
      "table_name": "istest",
      "access_type": "range",
      "possible_keys": [
        "b"
      ],
      "key": "b",
      "used_key_parts": [
        "b",
        "c"
      ],
      "key_length": "6",
      "rows_examined_per_scan": 4,
      "rows_produced_per_join": 3,
      "filtered": "100.00",
      "using_index": true,
      "cost_info": {
        "read_cost": "1.00",
        "eval_cost": "0.72",
        "prefix_cost": "1.80",
        "data_read_per_join": "57"
      },
      "used_columns": [
        "a",
        "b",
        "c"
      ],
      "attached_condition": "((`test`.`istest`.`b` = 2) and (`test`.`istest`.`c` <> 0))"
    }
  }
}
1 row in set, 1 warning (0.00 sec)
[22 Apr 2019 10:33] MySQL Verification Team
Hi,
Thanks for the report. Verified as described.

all best
bogdan
[23 Apr 2019 14:04] Roy Lyseng
Notice first that MySQL does not have a native BOOLEAN type. BOOLEAN is internally substituted with TINYINT, so this is actually an integer with range -128..127.

Thus, the predicate
  c IS TRUE
is actually the same as
  (c <> 0) IS TRUE,
 and the predicate
  c IS FALSE
is the same as
  (c = 0) IS TRUE.

You will clearly see that using an index for c IS TRUE is difficult, at least if there is no histogram defined over the column. However, it should be possible to use an index for c IS FALSE, since it can be transformed internally to an equality against the value 0 (zero).

But apparently MySQL does not use indexes at all when IS TRUE or IS FALSE is appended to the predicate, and this looks like a bug.
[25 Apr 2019 2:20] monty solomon
Even though there isn't a boolean data type, the documentation describes using IS to test a value against a boolean value.

https://dev.mysql.com/doc/refman/5.7/en/comparison-operators.html#operator_is

https://dev.mysql.com/doc/refman/5.7/en/comparison-operators.html#operator_is-not
[25 Apr 2019 2:29] monty solomon
The index is not used for

EXPLAIN SELECT * FROM istest WHERE b=2 AND c

or for

EXPLAIN SELECT * FROM istest WHERE b=2 AND c IS TRUE

but is used for

EXPLAIN SELECT * FROM istest WHERE b=2 AND c != 0

The index is not used for

EXPLAIN SELECT * FROM istest WHERE b=2 AND !c

or for

EXPLAIN SELECT * FROM istest WHERE b=2 AND c IS FALSE

but is used for

EXPLAIN SELECT * FROM istest WHERE b=2 AND c = 0