Bug #22638 | SOUNDEX broken for international characters | ||
---|---|---|---|
Submitted: | 24 Sep 2006 13:39 | Modified: | 3 Apr 2007 22:43 |
Reporter: | Daniel Eloff | Status: | Closed |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 5.0 BK, 4.1 BK, 5.1 BK | OS: | Linux (Linux) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[24 Sep 2006 13:39]
Daniel Eloff
[6 Oct 2006 8:27]
Sveta Smirnova
Thank you for the report. Verified as described:

$bin/mysql -e "SELECT HEX('阅览随时更新的新闻'), HEX(SOUNDEX('阅览随时更新的新闻'));"
HEX('阅览随时更新的新闻')	HEX(SOUNDEX('阅览随时更新的新闻'))
E99885E8A788E99A8FE697B6E69BB4E696B0E79A84E696B0E997BB	E9303030
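
To see why E9303030 is an invalid string, the two hex values above can be checked outside the server. The following is a minimal Python sketch, not part of the original report. It assumes the input bytes are UTF-8, which the hex dump is consistent with, and the variable names are purely illustrative. It shows that the SOUNDEX() result is the first byte of the first character followed by the ASCII digits "000", and that this byte sequence does not decode as UTF-8.

```python
# Hex values copied from the verification output above.
input_hex = "E99885E8A788E99A8FE697B6E69BB4E696B0E79A84E696B0E997BB"
soundex_hex = "E9303030"

input_bytes = bytes.fromhex(input_hex)
result_bytes = bytes.fromhex(soundex_hex)

# The input decodes cleanly as UTF-8 (assumption: the client sent UTF-8).
original = input_bytes.decode("utf-8")

# The result starts with the same single byte as the input...
first_byte_kept = result_bytes[:1] == input_bytes[:1]
# ...followed by plain ASCII padding.
padding = result_bytes[1:].decode("ascii")

# 0xE9 is the lead byte of a 3-byte UTF-8 sequence, so it must be followed
# by two continuation bytes, not by ASCII digits.
try:
    result_bytes.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

print(original, first_byte_kept, repr(padding), is_valid_utf8)
```

The check only describes the observed output. It does not show how the server produced it, and the thread does not state the cause.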
[6 Oct 2006 9:21]
Sveta Smirnova
Verified on Linux using the latest BK sources. All versions are affected. The OS and version fields have been corrected.
[28 Mar 2007 14:01]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/23158
[30 Mar 2007 7:17]
Alexander Barkov
Pushed into 5.0.40-rpl
Pushed into 5.1-18-rpl
[31 Mar 2007 23:53]
Bugs System
Pushed into 5.0.40
[31 Mar 2007 23:55]
Bugs System
Pushed into 5.1.18-beta
[3 Apr 2007 22:43]
Paul DuBois
Noted in the 5.0.40 and 5.1.18 changelogs:

SOUNDEX() returned an invalid string for international characters in multi-byte character sets.
[4 Sep 2007 16:54]
Alexander Barkov
See also: Bug#27782 MYSQL SOUNDEX collation, utf8_hungarian_ci shows false positive