Bug #83398 Slow and unexpected explain output on FTS
Submitted: 16 Oct 2016 11:09 Modified: 17 Feb 2017 18:00
Reporter: Daniël van Eeden (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.7.15 OS:Any
Assigned to: CPU Architecture:Any
Tags: explain, fulltext, NGRAM

[16 Oct 2016 11:09] Daniël van Eeden
Description:
An explain takes 6.37 secs and says "Select tables optimized away".
The query w/ fulltext matching takes 6.43 secs.
The query w/o fulltext matching takes 0.31 secs.

To me it looks like:
1. The explain output is wrong
2. The optimizer takes too much time before actually executing the query
3. The resulting plan is still fast

How to repeat:
mysql-5.7.15 [test] > explain format=json SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE)\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "message": "Select tables optimized away"
  }
}
1 row in set, 1 warning (6.37 sec)

mysql-5.7.15 [test] > explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
1 row in set, 1 warning (6.48 sec)

mysql-5.7.15 [test] > show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                    |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select count(0) AS `count(*)` from `test`.`phone` where (match `test`.`phone`.`phonenr` against ('638586' in boolean mode)) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql-5.7.15 [test] > SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE);
+----------+
| count(*) |
+----------+
|     2046 |
+----------+
1 row in set (6.43 sec)

mysql-5.7.15 [test] > SELECT count(*) FROM phone;
+----------+
| count(*) |
+----------+
|   749641 |
+----------+
1 row in set (0.16 sec)

mysql-5.7.15 [test] > show create table phone\G
*************************** 1. row ***************************
       Table: phone
Create Table: CREATE TABLE `phone` (
  `FTS_DOC_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `phonenr` varchar(255) NOT NULL,
  PRIMARY KEY (`FTS_DOC_ID`),
  FULLTEXT KEY `ngram_phone` (`phonenr`) /*!50100 WITH PARSER `ngram` */ 
) ENGINE=InnoDB AUTO_INCREMENT=768514 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

mysql-5.7.15 [test] > SELECT count(*) FROM phone WHERE phonenr LIKE '%638586%';
+----------+
| count(*) |
+----------+
|     2046 |
+----------+
1 row in set (0.31 sec)

mysql-5.7.15 [test] > explain SELECT count(*) FROM phone WHERE phonenr LIKE '%638586%';
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | phone | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 748019 |    11.11 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
[16 Oct 2016 11:30] Daniël van Eeden
With innodb_ft_enable_diag_print=on

the explain:
2016-10-16T11:17:13.765133Z 3 [Note] InnoDB: keynr=1, '638586
2016-10-16T11:17:13.765160Z 3 [Note] InnoDB: BOOL search
2016-10-16T11:17:20.207356Z 3 [Note] InnoDB: '38' -> 748019/52854 1.1508
2016-10-16T11:17:20.208402Z 3 [Note] InnoDB: '58' -> 748019/70882 1.0234
2016-10-16T11:17:20.208982Z 3 [Note] InnoDB: '63' -> 748019/87008 0.93435
2016-10-16T11:17:20.208994Z 3 [Note] InnoDB: '85' -> 748019/89702 0.92111
2016-10-16T11:17:20.208999Z 3 [Note] InnoDB: '86' -> 748019/59772 1.0974
2016-10-16T11:17:20.215657Z 3 [Note] InnoDB: FTS Search Processing time: 6 secs: 450 millisec: row(s) 2046
2016-10-16T11:17:20.215668Z 3 [Note] InnoDB: Full Search Memory: 227814834 (bytes),  Row: 2046.

query execution:
2016-10-16T11:17:40.855147Z 3 [Note] InnoDB: keynr=1, '638586
2016-10-16T11:17:40.855188Z 3 [Note] InnoDB: BOOL search
2016-10-16T11:17:47.303967Z 3 [Note] InnoDB: '38' -> 748019/52854 1.1508
2016-10-16T11:17:47.305010Z 3 [Note] InnoDB: '58' -> 748019/70882 1.0234
2016-10-16T11:17:47.305587Z 3 [Note] InnoDB: '63' -> 748019/87008 0.93435
2016-10-16T11:17:47.305597Z 3 [Note] InnoDB: '85' -> 748019/89702 0.92111
2016-10-16T11:17:47.305602Z 3 [Note] InnoDB: '86' -> 748019/59772 1.0974
2016-10-16T11:17:47.312235Z 3 [Note] InnoDB: FTS Search Processing time: 6 secs: 457 millisec: row(s) 2046
2016-10-16T11:17:47.312245Z 3 [Note] InnoDB: Full Search Memory: 227814834 (bytes),  Row: 2046.

If I change the query from a count(*) to non-aggregate:

mysql-5.7.15 [test] > explain SELECT name,phonenr FROM phone WHERE MATCH(phonenr) AGAINST('638586');
+----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-------------------------------+
| id | select_type | table | partitions | type     | possible_keys | key         | key_len | ref   | rows | filtered | Extra                         |
+----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-------------------------------+
|  1 | SIMPLE      | phone | NULL       | fulltext | ngram_phone   | ngram_phone | 0       | const |    1 |   100.00 | Using where; Ft_hints: sorted |
+----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-------------------------------+
1 row in set, 1 warning (0.56 sec)

2016-10-16T11:21:55.998459Z 3 [Note] InnoDB: keynr=1, '638586
2016-10-16T11:21:55.998565Z 3 [Note] InnoDB: NL search
2016-10-16T11:21:56.384664Z 3 [Note] InnoDB: '38' -> 748019/52854 1.1508
2016-10-16T11:21:56.384694Z 3 [Note] InnoDB: '58' -> 748019/70882 1.0234
2016-10-16T11:21:56.384702Z 3 [Note] InnoDB: '63' -> 748019/87008 0.93435
2016-10-16T11:21:56.384708Z 3 [Note] InnoDB: '85' -> 748019/89702 0.92111
2016-10-16T11:21:56.384713Z 3 [Note] InnoDB: '86' -> 748019/59772 1.0974
2016-10-16T11:21:56.481549Z 3 [Note] InnoDB: FTS Search Processing time: 0 secs: 483 millisec: row(s) 273084
2016-10-16T11:21:56.481576Z 3 [Note] InnoDB: Full Search Memory: 40928154 (bytes),  Row: 273084.

This is much faster and the explain plan looks more correct.
However explain says rows=1 and the debug output says rows=273084, which matches the actual number of returned rows.
[16 Oct 2016 11:31] Daniël van Eeden
Optimizer trace output:

mysql-5.7.15 [test] > explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
1 row in set, 1 warning (6.37 sec)

mysql-5.7.15 [test] > select * from information_schema.optimizer_trace\G
*************************** 1. row ***************************
                            QUERY: explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE)
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select count(0) AS `count(*)` from `phone` where (match `phone`.`phonenr` against ('638586' in boolean mode))"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))"
                }
              ]
            }
          }
        ],
        "empty_result": {
          "cause": "Select tables optimized away"
        }
      }
    },
    {
      "join_explain": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
          INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.00 sec)

mysql-5.7.15 [test] > explain SELECT name,phonenr FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE);
+----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-----------------------------------+
| id | select_type | table | partitions | type     | possible_keys | key         | key_len | ref   | rows | filtered | Extra                             |
+----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-----------------------------------+
|  1 | SIMPLE      | phone | NULL       | fulltext | ngram_phone   | ngram_phone | 0       | const |    1 |   100.00 | Using where; Ft_hints: no_ranking |
+----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-----------------------------------+
1 row in set, 1 warning (6.37 sec)

mysql-5.7.15 [test] > select * from information_schema.optimizer_trace\G
*************************** 1. row ***************************
                            QUERY: explain SELECT name,phonenr FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE)
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `phone`.`name` AS `name`,`phone`.`phonenr` AS `phonenr` from `phone` where (match `phone`.`phonenr` against ('638586' in boolean mode))"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))"
                }
              ]
            }
          },
          {
            "substitute_generated_columns": {
            }
          },
          {
            "table_dependencies": [
              {
                "table": "`phone`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ]
              }
            ]
          },
          {
            "ref_optimizer_key_uses": [
              {
                "table": "`phone`",
                "field": "<fulltext>",
                "equals": "(match `phone`.`phonenr` against ('638586' in boolean mode))",
                "null_rejecting": false
              }
            ]
          },
          {
            "rows_estimation": [
              {
                "table": "`phone`",
                "table_scan": {
                  "rows": 748019,
                  "cost": 2148
                }
              }
            ]
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`phone`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "access_type": "fulltext",
                      "index": "ngram_phone",
                      "rows": 1,
                      "cost": 1.2,
                      "chosen": true
                    },
                    {
                      "access_type": "scan",
                      "cost": 151752,
                      "rows": 748019,
                      "chosen": false,
                      "cause": "cost"
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 1,
                "cost_for_plan": 1.2,
                "chosen": true
              }
            ]
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))",
              "attached_conditions_computation": [
              ],
              "attached_conditions_summary": [
                {
                  "table": "`phone`",
                  "attached": "(match `phone`.`phonenr` against ('638586' in boolean mode))"
                }
              ]
            }
          },
          {
            "refine_plan": [
              {
                "table": "`phone`"
              }
            ]
          }
        ]
      }
    },
    {
      "join_explain": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
          INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.01 sec)
[1 Nov 2016 14:12] MySQL Verification Team
Hi!

We do need some further info from you before we can proceed. 

* What storage engine did you use ??

* Why are you using "IN BOOLEAN MODE" over such a simple search ??

* Is the EXPLAIN duration same with and without "IN BOOLEAN MODE"

* Why do you consider 6.37 seconds slow for a full text search ...

* In order to repeat the problem we need a full dump of the entire `phone` table

Thanks in advance.
[2 Dec 2016 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[11 Dec 2016 11:23] Daniël van Eeden
mysql> show create table phone\G
*************************** 1. row ***************************
       Table: phone
Create Table: CREATE TABLE `phone` (
  `FTS_DOC_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `phonenr` varchar(255) NOT NULL,
  PRIMARY KEY (`FTS_DOC_ID`),
  FULLTEXT KEY `ngram_phone` (`phonenr`) /*!50100 WITH PARSER `ngram` */ 
) ENGINE=InnoDB AUTO_INCREMENT=768514 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

mysql> SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE);
+----------+
| count(*) |
+----------+
|     2046 |
+----------+
1 row in set (6.42 sec)

mysql> SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586');
+----------+
| count(*) |
+----------+
|   273084 |
+----------+
1 row in set (0.48 sec)
[11 Dec 2016 11:24] Daniël van Eeden
dump file

Attachment: bug83398_phone.sql.xz (application/x-xz, text), 879.97 KiB.

[11 Dec 2016 11:32] Daniël van Eeden
* What storage engine did you use ??
InnoDB

* Why are you using "IN BOOLEAN MODE" over such a simple search ??
No particular reason.

* Is the EXPLAIN duration same with and without "IN BOOLEAN MODE"
No, this only happens w/ boolean mode.

* Why do you consider 6.37 seconds slow for a full text search ...
That's because for most queries explain is fast. It looks like MySQL either takes too much time to find the best plan or it executes parts of the plan during the planning stage.

* In order to repeat the problem we need a full dump of the entire `phone` table
I've attached that.

What this bug is about:
1. I think this explain is too slow.
2. I think the explain output doesn't very well indicate why it takes this amount of time to execute (plan+execute) the query. And the optimizer trace also doesn't provide much information about this. 

mysql-5.7.15-log [test] > explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
1 row in set, 1 warning (5.93 sec)

mysql-5.7.15-log [test] > explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586');
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
1 row in set, 1 warning (0.48 sec)
[11 Dec 2016 12:18] Daniël van Eeden
Related:
Bug #84158 	Loading table with fulltext from mysqldump fails
[12 Dec 2016 15:25] MySQL Verification Team
Hi,

This is a known phenomena, only not explain fully in the documentation, which is why  I verify this bug as a documentation bug.

You know, full text expressions are evaluated in optimization phase.  So the matching will also happen during EXPLAIN, as well. IN BOOLEAN MODE will be ALWAYS slower, since it works like a degenerated natural language.

"Select tables optimized away" is hence an expected behavior, since all matching happens in the optimizer phase, which is fully executed during EXPLAIN.

But, as I already wrote, it has to be documented better.
[17 Feb 2017 18:00] Paul DuBois
Posted by developer:
 
Added to https://dev.mysql.com/doc/refman/5.7/en/column-indexes.html#column-indexes-fulltext

For queries that contain full-text expressions, MySQL evaluates those expressions during the optimization phase of query execution. The optimizer does not just look at full-text expressions and make estimates, it actually evaluates them in the process of developing an execution plan.

An implication of this behavior is that EXPLAIN for full-text queries is typically slower than for non-full-text queries for which no expression evaluation occurs during the optimization phase.

EXPLAIN for full-text queries may show Select tables optimized away in the Extra column due to matching occurring during optimization; in this case, no table access need occur during later execution.