Bug #83398 | Slow and unexpected explain output on FTS | ||
---|---|---|---|
Submitted: | 16 Oct 2016 11:09 | Modified: | 17 Feb 2017 18:00 |
Reporter: | Daniël van Eeden (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S3 (Non-critical) |
Version: | 5.7.15 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | explain, fulltext, NGRAM |
[16 Oct 2016 11:09]
Daniël van Eeden
[16 Oct 2016 11:30]
Daniël van Eeden
With innodb_ft_enable_diag_print=on the explain: 2016-10-16T11:17:13.765133Z 3 [Note] InnoDB: keynr=1, '638586 2016-10-16T11:17:13.765160Z 3 [Note] InnoDB: BOOL search 2016-10-16T11:17:20.207356Z 3 [Note] InnoDB: '38' -> 748019/52854 1.1508 2016-10-16T11:17:20.208402Z 3 [Note] InnoDB: '58' -> 748019/70882 1.0234 2016-10-16T11:17:20.208982Z 3 [Note] InnoDB: '63' -> 748019/87008 0.93435 2016-10-16T11:17:20.208994Z 3 [Note] InnoDB: '85' -> 748019/89702 0.92111 2016-10-16T11:17:20.208999Z 3 [Note] InnoDB: '86' -> 748019/59772 1.0974 2016-10-16T11:17:20.215657Z 3 [Note] InnoDB: FTS Search Processing time: 6 secs: 450 millisec: row(s) 2046 2016-10-16T11:17:20.215668Z 3 [Note] InnoDB: Full Search Memory: 227814834 (bytes), Row: 2046. query execution: 2016-10-16T11:17:40.855147Z 3 [Note] InnoDB: keynr=1, '638586 2016-10-16T11:17:40.855188Z 3 [Note] InnoDB: BOOL search 2016-10-16T11:17:47.303967Z 3 [Note] InnoDB: '38' -> 748019/52854 1.1508 2016-10-16T11:17:47.305010Z 3 [Note] InnoDB: '58' -> 748019/70882 1.0234 2016-10-16T11:17:47.305587Z 3 [Note] InnoDB: '63' -> 748019/87008 0.93435 2016-10-16T11:17:47.305597Z 3 [Note] InnoDB: '85' -> 748019/89702 0.92111 2016-10-16T11:17:47.305602Z 3 [Note] InnoDB: '86' -> 748019/59772 1.0974 2016-10-16T11:17:47.312235Z 3 [Note] InnoDB: FTS Search Processing time: 6 secs: 457 millisec: row(s) 2046 2016-10-16T11:17:47.312245Z 3 [Note] InnoDB: Full Search Memory: 227814834 (bytes), Row: 2046. If I change the query from a count(*) to non-aggregate: mysql-5.7.15 [test] > explain SELECT name,phonenr FROM phone WHERE MATCH(phonenr) AGAINST('638586'); +----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-------------------------------+ | 1 | SIMPLE | phone | NULL | fulltext | ngram_phone | ngram_phone | 0 | const | 1 | 100.00 | Using where; Ft_hints: sorted | +----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-------------------------------+ 1 row in set, 1 warning (0.56 sec) 2016-10-16T11:21:55.998459Z 3 [Note] InnoDB: keynr=1, '638586 2016-10-16T11:21:55.998565Z 3 [Note] InnoDB: NL search 2016-10-16T11:21:56.384664Z 3 [Note] InnoDB: '38' -> 748019/52854 1.1508 2016-10-16T11:21:56.384694Z 3 [Note] InnoDB: '58' -> 748019/70882 1.0234 2016-10-16T11:21:56.384702Z 3 [Note] InnoDB: '63' -> 748019/87008 0.93435 2016-10-16T11:21:56.384708Z 3 [Note] InnoDB: '85' -> 748019/89702 0.92111 2016-10-16T11:21:56.384713Z 3 [Note] InnoDB: '86' -> 748019/59772 1.0974 2016-10-16T11:21:56.481549Z 3 [Note] InnoDB: FTS Search Processing time: 0 secs: 483 millisec: row(s) 273084 2016-10-16T11:21:56.481576Z 3 [Note] InnoDB: Full Search Memory: 40928154 (bytes), Row: 273084. This is much faster and the explain plan looks more correct. However explain says rows=1 and the debug output says rows=273084, which matches the actual number of returned rows.
[16 Oct 2016 11:31]
Daniël van Eeden
Optimizer trace output: mysql-5.7.15 [test] > explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE); +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ | 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ 1 row in set, 1 warning (6.37 sec) mysql-5.7.15 [test] > select * from information_schema.optimizer_trace\G *************************** 1. row *************************** QUERY: explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE) TRACE: { "steps": [ { "join_preparation": { "select#": 1, "steps": [ { "expanded_query": "/* select#1 */ select count(0) AS `count(*)` from `phone` where (match `phone`.`phonenr` against ('638586' in boolean mode))" } ] } }, { "join_optimization": { "select#": 1, "steps": [ { "condition_processing": { "condition": "WHERE", "original_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))", "steps": [ { "transformation": "equality_propagation", "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))" }, { "transformation": "constant_propagation", "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))" }, { "transformation": "trivial_condition_removal", "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))" } ] } } ], "empty_result": { "cause": "Select tables optimized away" } } }, { "join_explain": { "select#": 1, "steps": [ ] } } ] } MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0 INSUFFICIENT_PRIVILEGES: 0 1 row in set (0.00 sec) mysql-5.7.15 [test] > explain SELECT name,phonenr FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE); +----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-----------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-----------------------------------+ | 1 | SIMPLE | phone | NULL | fulltext | ngram_phone | ngram_phone | 0 | const | 1 | 100.00 | Using where; Ft_hints: no_ranking | +----+-------------+-------+------------+----------+---------------+-------------+---------+-------+------+----------+-----------------------------------+ 1 row in set, 1 warning (6.37 sec) mysql-5.7.15 [test] > select * from information_schema.optimizer_trace\G *************************** 1. row *************************** QUERY: explain SELECT name,phonenr FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE) TRACE: { "steps": [ { "join_preparation": { "select#": 1, "steps": [ { "expanded_query": "/* select#1 */ select `phone`.`name` AS `name`,`phone`.`phonenr` AS `phonenr` from `phone` where (match `phone`.`phonenr` against ('638586' in boolean mode))" } ] } }, { "join_optimization": { "select#": 1, "steps": [ { "condition_processing": { "condition": "WHERE", "original_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))", "steps": [ { "transformation": "equality_propagation", "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))" }, { "transformation": "constant_propagation", "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))" }, { "transformation": "trivial_condition_removal", "resulting_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))" } ] } }, { "substitute_generated_columns": { } }, { "table_dependencies": [ { "table": "`phone`", "row_may_be_null": false, "map_bit": 0, "depends_on_map_bits": [ ] } ] }, { "ref_optimizer_key_uses": [ { "table": "`phone`", "field": "<fulltext>", "equals": "(match `phone`.`phonenr` against ('638586' in boolean mode))", "null_rejecting": false } ] }, { "rows_estimation": [ { "table": "`phone`", "table_scan": { "rows": 748019, "cost": 2148 } } ] }, { "considered_execution_plans": [ { "plan_prefix": [ ], "table": "`phone`", "best_access_path": { "considered_access_paths": [ { "access_type": "fulltext", "index": "ngram_phone", "rows": 1, "cost": 1.2, "chosen": true }, { "access_type": "scan", "cost": 151752, "rows": 748019, "chosen": false, "cause": "cost" } ] }, "condition_filtering_pct": 100, "rows_for_plan": 1, "cost_for_plan": 1.2, "chosen": true } ] }, { "attaching_conditions_to_tables": { "original_condition": "(match `phone`.`phonenr` against ('638586' in boolean mode))", "attached_conditions_computation": [ ], "attached_conditions_summary": [ { "table": "`phone`", "attached": "(match `phone`.`phonenr` against ('638586' in boolean mode))" } ] } }, { "refine_plan": [ { "table": "`phone`" } ] } ] } }, { "join_explain": { "select#": 1, "steps": [ ] } } ] } MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0 INSUFFICIENT_PRIVILEGES: 0 1 row in set (0.01 sec)
[1 Nov 2016 14:12]
MySQL Verification Team
Hi! We do need some further info from you before we can proceed. * What storage engine did you use ?? * Why are you using "IN BOOLEAN MODE" over such a simple search ?? * Is the EXPLAIN duration same with and without "IN BOOLEAN MODE" * Why do you consider 6.37 seconds slow for a full text search ... * In order to repeat the problem we need a full dump of the entire `phone` table Thanks in advance.
[2 Dec 2016 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[11 Dec 2016 11:23]
Daniël van Eeden
mysql> show create table phone\G *************************** 1. row *************************** Table: phone Create Table: CREATE TABLE `phone` ( `FTS_DOC_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `name` varchar(255) NOT NULL, `phonenr` varchar(255) NOT NULL, PRIMARY KEY (`FTS_DOC_ID`), FULLTEXT KEY `ngram_phone` (`phonenr`) /*!50100 WITH PARSER `ngram` */ ) ENGINE=InnoDB AUTO_INCREMENT=768514 DEFAULT CHARSET=utf8mb4 1 row in set (0.00 sec) mysql> SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE); +----------+ | count(*) | +----------+ | 2046 | +----------+ 1 row in set (6.42 sec) mysql> SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586'); +----------+ | count(*) | +----------+ | 273084 | +----------+ 1 row in set (0.48 sec)
[11 Dec 2016 11:24]
Daniël van Eeden
dump file
Attachment: bug83398_phone.sql.xz (application/x-xz, text), 879.97 KiB.
[11 Dec 2016 11:32]
Daniël van Eeden
* What storage engine did you use ?? InnoDB * Why are you using "IN BOOLEAN MODE" over such a simple search ?? No particular reason. * Is the EXPLAIN duration same with and without "IN BOOLEAN MODE" No, this only happens w/ boolean mode. * Why do you consider 6.37 seconds slow for a full text search ... That's because for most queries explain is fast. It looks like MySQL either takes too much time to find the best plan or it executes parts of the plan during the planning stage. * In order to repeat the problem we need a full dump of the entire `phone` table I've attached that. What this bug is about: 1. I think this explain is too slow. 2. I think the explain output doesn't very well indicate why it takes this amount of time to execute (plan+execute) the query. And the optimizer trace also doesn't provide much information about this. mysql-5.7.15-log [test] > explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586' IN BOOLEAN MODE); +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ | 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ 1 row in set, 1 warning (5.93 sec) mysql-5.7.15-log [test] > explain SELECT count(*) FROM phone WHERE MATCH(phonenr) AGAINST('638586'); +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ | 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+ 1 row in set, 1 warning (0.48 sec)
[11 Dec 2016 12:18]
Daniël van Eeden
Related: Bug #84158 Loading table with fulltext from mysqldump fails
[12 Dec 2016 15:25]
MySQL Verification Team
Hi, This is a known phenomena, only not explain fully in the documentation, which is why I verify this bug as a documentation bug. You know, full text expressions are evaluated in optimization phase. So the matching will also happen during EXPLAIN, as well. IN BOOLEAN MODE will be ALWAYS slower, since it works like a degenerated natural language. "Select tables optimized away" is hence an expected behavior, since all matching happens in the optimizer phase, which is fully executed during EXPLAIN. But, as I already wrote, it has to be documented better.
[17 Feb 2017 18:00]
Paul DuBois
Posted by developer: Added to https://dev.mysql.com/doc/refman/5.7/en/column-indexes.html#column-indexes-fulltext For queries that contain full-text expressions, MySQL evaluates those expressions during the optimization phase of query execution. The optimizer does not just look at full-text expressions and make estimates, it actually evaluates them in the process of developing an execution plan. An implication of this behavior is that EXPLAIN for full-text queries is typically slower than for non-full-text queries for which no expression evaluation occurs during the optimization phase. EXPLAIN for full-text queries may show Select tables optimized away in the Extra column due to matching occurring during optimization; in this case, no table access need occur during later execution.