| Bug #12075 | FULLTEXT non-functional for big5 strings | ||
|---|---|---|---|
| Submitted: | 21 Jul 2005 0:12 | Modified: | 7 Aug 2005 1:08 | 
| Reporter: | Kolbe Kegel | Status: | Closed |
| Category: | MySQL Server | Severity: | S3 (Non-critical) | 
| Version: | 4.1 5.0 | OS: | Linux (Linux) | 
| Assigned to: | Sergey Vojtovich | CPU Architecture: | Any | 
   [2 Aug 2005 7:38]
   Alexander Barkov        
  Patch: bk commit - 4.1 tree (svoj:1.2364) BUG#12075
   [2 Aug 2005 7:39]
   Alexander Barkov        
  Sergey, your patch looks fine. Please move the test from "fulltext" to "ctype_big5", so that this test block is skipped when big5 is not compiled in.
   [2 Aug 2005 9:27]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/27797
   [3 Aug 2005 6:23]
   Sergey Vojtovich        
  Fixed in 4.1.14, 5.0.11.
   [7 Aug 2005 1:08]
   Mike Hillyer        
  Documented in 5.0.11 and 4.1.14 changelogs: <listitem><para> <literal>big5</literal> strings were not being stored in <literal>FULLTEXT</literal> index. (Bug #12075) </para></listitem>


Description:
Fulltext searching is not functional on columns stored using the big5 character set. The fulltext index remains empty even when rows have been inserted into the table. No errors or warnings are issued.

How to repeat:
create table u (c char(50) character set big5 not null, fulltext index(c));
insert into u (c) values (0xA741ADCCA66EB6DC20A7DAADCCABDCA66E);
select * from u where match(c) against (0xA741ADCCA66EB6DC in boolean mode);
select * from u where match(c) against (0xA7DAADCCABDCA66E in boolean mode);

(Note that a space (0x20) appears in the hex string inserted into the table.)

kolbe@lith:/var/mysql/data/test$ ../../bin/myisam_ftdump -d -v u 0
[this command outputs nothing]

insert into u values ('paragraphs and sentences written in latin or roman');

kolbe@lith:/var/mysql/data/test$ ../../bin/myisam_ftdump -d -v u 0
65 0.9456265 latin
65 0.9456265 paragraphs
65 0.9456265 roman
65 0.9456265 sentences
65 0.9456265 written

truncate table u;
insert into u values ('paragraphs and sentences' || 0xA741ADCCA66EB6DC);

kolbe@lith:/var/mysql/data/test$ ../../bin/myisam_ftdump -d -v u 0
194 0.9775171 paragraphs
194 0.9775171 sentences

This seems to indicate that the correct data is not being stored in the fulltext index for some reason when the data is encoded using big5. Note that a string consisting of latin characters (which are legal in big5) and Chinese characters results in the omission of the Chinese words from the index, while the latin words are included as usual.

Suggested fix:
Unknown.
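
To check the test data independently of the server, the following Python sketch splits the inserted hex value on the ASCII space and compares the resulting pieces with the two search terms from the report. The helper `split_big5_words` is hypothetical and is not MySQL's fulltext parser; it assumes only that any byte of 0x81 or above is a big5 lead byte followed by one trail byte. It also counts the characters in each word, which matters because the default `ft_min_word_len` is 4.

```python
# Hex literal inserted in the "How to repeat" steps of the report.
INSERTED = bytes.fromhex("A741ADCCA66EB6DC20A7DAADCCABDCA66E")

# The two hex literals passed to MATCH ... AGAINST in the report.
SEARCH_TERMS = [
    bytes.fromhex("A741ADCCA66EB6DC"),
    bytes.fromhex("A7DAADCCABDCA66E"),
]


def split_big5_words(data):
    """Split big5-encoded bytes on ASCII spaces.

    Illustrative helper only (an assumption, not MySQL code): a byte
    >= 0x81 is treated as a lead byte and consumed together with the
    byte that follows it, so trail bytes are never examined on their own.
    """
    words = []
    current = bytearray()
    i = 0
    while i < len(data):
        if data[i] >= 0x81 and i + 1 < len(data):
            current += data[i:i + 2]   # two-byte character
            i += 2
        elif data[i] == 0x20:
            if current:                # a space ends the current word
                words.append(bytes(current))
            current = bytearray()
            i += 1
        else:
            current.append(data[i])    # single-byte (ASCII) character
            i += 1
    if current:
        words.append(bytes(current))
    return words


words = split_big5_words(INSERTED)
for word in words:
    # Decode with Python's built-in big5 codec to count characters.
    print(word.hex().upper(), len(word.decode("big5")), "characters")
```

If the pieces equal the search terms and each has at least four characters, the empty `myisam_ftdump` output cannot be explained by the test data or by the minimum word length, which is consistent with the reporter's conclusion that the words are not reaching the index.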