Bug #113811 | Get incorrect result when searching for consecutive uppercase letters | ||
---|---|---|---|
Submitted: | 30 Jan 2024 23:42 | Modified: | 1 Feb 2024 3:34 |
Reporter: | caiweiqiang cai | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: FULLTEXT search | Severity: | S3 (Non-critical) |
Version: | 8.0.27 5.7.28 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[30 Jan 2024 23:42]
caiweiqiang cai
[30 Jan 2024 23:43]
caiweiqiang cai
Sorry, I loss one SQL to create fulltext index alter table fulltext_test add fulltext content (content) with parser ngram;
[31 Jan 2024 10:40]
MySQL Verification Team
Hi Mr. cai, Thank you for your bug report. However, this is not a bug. When you use case sensitive collation, then you will find only rows that matches the constant fully, that means that 'a' would be different then 'A'. However, when you use _ci collation, which means Case Insensitive, then 'a' will be equal to 'A'. Hence, this is expected behaviour. Not a bug.
[1 Feb 2024 3:34]
caiweiqiang cai
Maybe you didn't get my point. The main problem is that this SQL does not return any record when using utf8mb4_bin collation. select a.content, match(a.content) against ('+ABCD' in boolean mode) from fulltext_test a where match(a.content) against ('+ABCD' in boolean mode); This statement queries the rows containing the phrase "ABCD". I think it should return this row "ABCDEFGHIJKLMNOPQRSTUVWXYZ", but it doesn't. It's stange behaviour.
[1 Feb 2024 11:05]
MySQL Verification Team
Hi, You are not using FTS correctly ...... Your query should be : select a.content from fulltext_test a where match(a.content) against ('ABCD'; This should return correct result ...... Introducing MATCH in the SELECT list and using a single word in the boolean mode, both go against the SQL standard.
[1 Feb 2024 11:35]
MySQL Verification Team
One more additional note. Your 8.0 release is too old. We ran your test case against 8.0.36 and got totally different results. No rows in any possible variant. With BOOLEAN or no BOOLEAN. This ia actually correct behaviour with all the settings that you used.