Bug #103920 memory recycle improve in fts query
Submitted: 6 Jun 2021 9:52 Modified: 7 Jun 2021 8:43
Reporter: yuxiang jiang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S3 (Non-critical)
Version:8.0, 8.0.25, 5.7.34 OS:CentOS
Assigned to: CPU Architecture:Any
Tags: Contribution, oom memory full text search

[6 Jun 2021 9:52] yuxiang jiang
Description:
If fts search return lots of matched rows in fts_query_phrase_search calling fts_index_fetch_nodes,  innodb will occupy memory which won't release after calling mem_heap_free.

How to repeat:
CREATE TABLE `t_fts`
(
    `id` int,
    `title` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
    `content` varchar(2048) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
    FULLTEXT KEY `ft_title_content_2` (`title`,`content`) /*!50100 WITH PARSER `ngram` */
);

insert into t_fts values (1, '黑龙江恒基天承建筑工程有限公司 上海东昌建筑装饰工程有限公司', '黑龙江恒基天承建筑工程有限公司 上海东昌建筑装饰工程有限公司');
#1
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#2
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#3
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#4
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#5
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#6
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#7
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#8
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#9
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#10
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#11
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#12
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#13
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#14
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#15
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#16
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#17
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#18
insert into t_fts select id + (select max(id) from t_fts), title, content from t_fts;
#19

SELECT id FROM t_fts where MATCH (title,content) AGAINST ('+工程' IN BOOLEAN MODE) limit 1;

and then checkout the total memory allocated.

Suggested fix:
1、Allocate memory in large pieces at first
2、Force calling malloc_trim to recyle memory in future
[6 Jun 2021 9:52] yuxiang jiang
for fts memory recycle

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-fts-query-memory-recycle.patch (application/octet-stream, text), 1.64 KiB.

[7 Jun 2021 8:43] MySQL Verification Team
Hello yuxiang jiang,

Thank you for the report and contribution.

regards,
Umesh
[7 Jun 2021 14:48] Satya Bodapati
may be use the same malloc_trim trick on all  mem_heap_free() that frees more than X blocks (ofcourse not applicable on bufferpool allocations)

Rings me an old bug: https://bugs.mysql.com/bug.php?id=95065
[7 Jun 2021 14:51] Satya Bodapati
Current workaround is to use jemalloc or tcmalloc but good to have some kind of fix from code as well