Bug #87518 Server asserts when using n-gram full text searching with LIMIT
Submitted: 23 Aug 2017 19:06 Modified: 23 Aug 2017 19:50
Reporter: Yura Sorokin (OCA) Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S6 (Debug Builds)
Version:5.7.19, 8.0.2 OS:Any
Assigned to: CPU Architecture:Any

[23 Aug 2017 19:06] Yura Sorokin
Description:
Running NGRAM FULL TEXT SEARCH select queries with LIMIT clause may result in incorrect result set / server assertion in 'fts_query_calculate_ranking()'.

***************************************
#0  0x00007f1ce1c17741 in pthread_kill () from /lib64/libpthread.so.0
#0  0x00007f1ce1c17741 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000001836ef0 in my_write_core (sig=6) at /mnt/hgfs/repos/percona-server/mysys/stacktrace.c:249
#2  0x0000000000e7d8b2 in handle_fatal_signal (sig=6) at /mnt/hgfs/repos/percona-server/sql/signal_handler.cc:220
#3  <signal handler called>
#4  0x00007f1cdffab1d7 in raise () from /lib64/libc.so.6
#5  0x00007f1cdffac8c8 in abort () from /lib64/libc.so.6
#6  0x0000000001ab7aee in ut_dbg_assertion_failed (expr=0x2133a7a "ret == 0", file=0x21338a8 "/mnt/hgfs/repos/percona-server/storage/innobase/fts/fts0que.cc", line=3559) at /mnt/hgfs/repos/percona-server/storage/innobase/ut/ut0dbg.cc:67
#7  0x0000000001bf0391 in fts_query_calculate_ranking (query=0x7f1cd811d4a0, ranking=0x7f1c901a2838) at /mnt/hgfs/repos/percona-server/storage/innobase/fts/fts0que.cc:3559
#8  0x0000000001bf09d0 in fts_query_prepare_result (query=0x7f1cd811d4a0, result=0x7f1c900b68c8) at /mnt/hgfs/repos/percona-server/storage/innobase/fts/fts0que.cc:3729
#9  0x0000000001bf0b57 in fts_query_get_result (query=0x7f1cd811d4a0, result=0x0) at /mnt/hgfs/repos/percona-server/storage/innobase/fts/fts0que.cc:3772
#10 0x0000000001bf169f in fts_query (trx=0x7f1cd9de18c0, index=0x7f1c9014a8c0, flags=9, query_str=0x7f1c900361e0 "xyz mnt", query_len=7, result=0x7f1cd811f9e8, limit=5) at /mnt/hgfs/repos/percona-server/storage/innobase/fts/fts0que.cc:4116
#11 0x000000000188b692 in ha_innobase::ft_init_ext (this=0x7f1c9019ea00, flags=9, keynr=1, key=0x7f1c9003e438) at /mnt/hgfs/repos/percona-server/storage/innobase/handler/ha_innodb.cc:9397
#12 0x000000000188b860 in ha_innobase::ft_init_ext_with_hints (this=0x7f1c9019ea00, keynr=1, key=0x7f1c9003e438, hints=0x7f1c9003f110) at /mnt/hgfs/repos/percona-server/storage/innobase/handler/ha_innodb.cc:9437
#13 0x0000000000f9cf51 in Item_func_match::init_search (this=0x7f1c9003e2f8, thd=0x7f1c9002c420) at /mnt/hgfs/repos/percona-server/sql/item_func.cc:7817
#14 0x00000000014890f7 in init_ftfuncs (thd=0x7f1c9002c420, select_lex=0x7f1c9003d5d0) at /mnt/hgfs/repos/percona-server/sql/sql_base.cc:9910
#15 0x00000000014f45b3 in JOIN::optimize_fts_query (this=0x7f1c9003f130) at /mnt/hgfs/repos/percona-server/sql/sql_optimizer.cc:10733
#16 0x00000000014db2b7 in JOIN::optimize (this=0x7f1c9003f130) at /mnt/hgfs/repos/percona-server/sql/sql_optimizer.cc:503
#17 0x00000000015522f4 in st_select_lex::optimize (this=0x7f1c9003d5d0, thd=0x7f1c9002c420) at /mnt/hgfs/repos/percona-server/sql/sql_select.cc:1009
#18 0x0000000001550a50 in handle_query (thd=0x7f1c9002c420, lex=0x7f1c9002e718, result=0x7f1c9003ed90, added_options=0, removed_options=0) at /mnt/hgfs/repos/percona-server/sql/sql_select.cc:164
#19 0x0000000001506a7c in execute_sqlcom_select (thd=0x7f1c9002c420, all_tables=0x7f1c9003e6a8) at /mnt/hgfs/repos/percona-server/sql/sql_parse.cc:5158
#20 0x00000000015002fe in mysql_execute_command (thd=0x7f1c9002c420, first_level=true) at /mnt/hgfs/repos/percona-server/sql/sql_parse.cc:2795
#21 0x0000000001507902 in mysql_parse (thd=0x7f1c9002c420, parser_state=0x7f1cd8121590) at /mnt/hgfs/repos/percona-server/sql/sql_parse.cc:5574
#22 0x00000000014fd1d4 in dispatch_command (thd=0x7f1c9002c420, com_data=0x7f1cd8121cf0, command=COM_QUERY) at /mnt/hgfs/repos/percona-server/sql/sql_parse.cc:1461
#23 0x00000000014fc0c6 in do_command (thd=0x7f1c9002c420) at /mnt/hgfs/repos/percona-server/sql/sql_parse.cc:999
#24 0x000000000162c436 in handle_connection (arg=0x4f50f80) at /mnt/hgfs/repos/percona-server/sql/conn_handler/connection_handler_per_thread.cc:300
#25 0x0000000001cada81 in pfs_spawn_thread (arg=0x4f714d0) at /mnt/hgfs/repos/percona-server/storage/perfschema/pfs.cc:2188
#26 0x00007f1ce1c12dc5 in start_thread () from /lib64/libpthread.so.0
#27 0x00007f1ce006d76d in clone () from /lib64/libc.so.6

How to repeat:
In debug serveer run the following fragment:

CREATE TABLE t1 (
       FTS_DOC_ID BIGINT UNSIGNED NOT NULL PRIMARY KEY,
       contents TEXT COLLATE utf8mb4_bin,
       FULLTEXT KEY fx_contents (contents) WITH PARSER ngram
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

INSERT INTO t1 VALUES (1, 'xyz mnt'), (2, 'xyz mnt'), (3, 'xyz');

# The query completes with or without bug
SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz' IN BOOLEAN MODE);

# The query returns wrong result set with the bug
SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz' IN BOOLEAN MODE) LIMIT 1;

# The query crashes with the bug
SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz mnt' IN BOOLEAN MODE) LIMIT 5;
[23 Aug 2017 19:07] Yura Sorokin
Possibly a duplicate of Bug #85835
[23 Aug 2017 19:17] Yura Sorokin
Fix for 5.7

Attachment: bug87518_5.7.diff (application/octet-stream, text), 4.03 KiB.

[23 Aug 2017 19:18] Yura Sorokin
Also affects 8.0.2
[23 Aug 2017 19:50] Miguel Solorzano
Thank you for the bug report. Not repeatable with most recent source server:

c:\dbs>c:\dbs\5.7\bin\mysql -uroot --port=3570 -p --local-infile  --prompt="mysql 5.7 > "
Enter password: ******
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.20-debug Source distribution 2017-AUG-15

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.7 > CREATE DATABASE d5;
Query OK, 1 row affected (0.00 sec)

mysql 5.7 > USE d5
Database changed
mysql 5.7 > CREATE TABLE t1 (
    ->        FTS_DOC_ID BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    ->        contents TEXT COLLATE utf8mb4_bin,
    ->        FULLTEXT KEY fx_contents (contents) WITH PARSER ngram
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
Query OK, 0 rows affected (2.19 sec)

mysql 5.7 >
mysql 5.7 > INSERT INTO t1 VALUES (1, 'xyz mnt'), (2, 'xyz mnt'), (3, 'xyz');
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql 5.7 >
mysql 5.7 > # The query completes with or without bug
mysql 5.7 > SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz' IN BOOLEAN MODE);
+------------+----------+
| FTS_DOC_ID | contents |
+------------+----------+
|          1 | xyz mnt  |
|          2 | xyz mnt  |
|          3 | xyz      |
+------------+----------+
3 rows in set (0.07 sec)

mysql 5.7 >
mysql 5.7 > # The query returns wrong result set with the bug
mysql 5.7 > SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz' IN BOOLEAN MODE) LIMIT 1;
+------------+----------+
| FTS_DOC_ID | contents |
+------------+----------+
|          1 | xyz mnt  |
+------------+----------+
1 row in set (0.00 sec)

mysql 5.7 >
mysql 5.7 > # The query crashes with the bug
mysql 5.7 > SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz mnt' IN BOOLEAN MODE) LIMIT 5;
+------------+----------+
| FTS_DOC_ID | contents |
+------------+----------+
|          1 | xyz mnt  |
|          2 | xyz mnt  |
|          3 | xyz      |
+------------+----------+
3 rows in set (0.00 sec)

mysql 5.7 >

c:\dbs>c:\dbs\8.0\bin\mysql -uroot --port=3580 -p --prompt="mysql 8.0 > " --default-character-set=utf8mb4
Enter password: ******
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.3-rc-debug Source distribution 2017-AUG-15

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 8.0 > CREATE DATABASE d5;
Query OK, 1 row affected (0.09 sec)

mysql 8.0 > USE d5
Database changed
mysql 8.0 > CREATE TABLE t1 (
    ->        FTS_DOC_ID BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    ->        contents TEXT COLLATE utf8mb4_bin,
    ->        FULLTEXT KEY fx_contents (contents) WITH PARSER ngram
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
Query OK, 0 rows affected (4.34 sec)

mysql 8.0 >
mysql 8.0 > INSERT INTO t1 VALUES (1, 'xyz mnt'), (2, 'xyz mnt'), (3, 'xyz');
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql 8.0 >
mysql 8.0 > # The query completes with or without bug
mysql 8.0 > SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz' IN BOOLEAN MODE);
+------------+----------+
| FTS_DOC_ID | contents |
+------------+----------+
|          1 | xyz mnt  |
|          2 | xyz mnt  |
|          3 | xyz      |
+------------+----------+
3 rows in set (0.08 sec)

mysql 8.0 >
mysql 8.0 > # The query returns wrong result set with the bug
mysql 8.0 > SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz' IN BOOLEAN MODE) LIMIT 1;
+------------+----------+
| FTS_DOC_ID | contents |
+------------+----------+
|          1 | xyz mnt  |
+------------+----------+
1 row in set (0.00 sec)

mysql 8.0 >
mysql 8.0 > # The query crashes with the bug
mysql 8.0 > SELECT * FROM t1 WHERE MATCH(contents) AGAINST ('xyz mnt' IN BOOLEAN MODE) LIMIT 5;
+------------+----------+
| FTS_DOC_ID | contents |
+------------+----------+
|          1 | xyz mnt  |
|          2 | xyz mnt  |
|          3 | xyz      |
+------------+----------+
3 rows in set (0.00 sec)
[25 Dec 2018 2:14] Shaohua Wang
The reason you can't reproduce it because of the following fix:

commit 48e07ec3935ab229c8ec5aede71ae77ede65a789
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date:   Wed Aug 2 18:12:02 2017 +0530

but it's totally wrong: it makes the original commit not work.

Bug #22709692 FTS QUERY EXCEEDS RESULT CACHE LIMIT

Problem:
========
Given query in bug page involves limit clause query and it also
doesn't involve ranking, it leads to build whole doc_id tree.
If there is 1 million records involved then it easily
exceeds the result cache limit.

Solution:
========
optimization can be done for limit clause query when
(1) There is no ranking involved.
(2) Only FTS Union operations involved.

Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 12588
[25 Dec 2018 3:24] Shaohua Wang
The original patch by Thiru has two issues:
1. should consider duplication and deleted doc id when counting n_docs;
2. should not apply limit optimization on phrase search with parser.
[25 Dec 2018 3:25] Shaohua Wang
Fix for 5.7 from Tencent TXSQL Team

Attachment: fts_query_with_limit_parser.diff (application/octet-stream, text), 6.93 KiB.