Bug #30500 Searching with Norwegian letters ÆØÅ
Submitted: 19 Aug 2007 21:50 Modified: 24 Aug 2007 8:37
Reporter: Martin Gråtrud Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0.24a, 5.1.20 OS:Any
Assigned to: CPU Architecture:Any

[19 Aug 2007 21:50] Martin Gråtrud
Description:
When i do a #SELECT * FROM any WHERE anoyone LIKE 'øyv'", gives me nothing, i know there is a entery for øyvind. if i search for the word  'bjør' instead, I get bjørn. It looks like if i search for a word starting with any of the norwegian letters æøå, i get nothing, but if it is in the middle of the word, there is no problem...  

How to repeat:
Create å table in utf8_danish_ci insert som data with ther Norwegian/Danish letters ÆØÅæøå as the first letter, and just du a "SELECT * FROM anydb WHERE anytable LIKE 'øyvind'
[19 Aug 2007 21:55] Martin Gråtrud
i forgot to mention that i use UTF8-Danish CI
[20 Aug 2007 6:29] Sveta Smirnova
Thank you for the report.

But version 5.0.24a is quite old. Please upgrade to current version 5.0.45, try with it and say us result. Please don't forget to use % wildcard if needed.
[21 Aug 2007 11:23] Martin Gråtrud
Tried with 5.1.20Beta, and the same thing happends here... one more thing i can mention, is that if i search for the letter 'a', i get everything that starts on the letters 'a,æ,ø and å'. If i have a record named øyvind, i get it if i search for 'a', but if i search for 'ay' i get nothing. It's still the same if i search for a record containg, but not starting on a norwegian letter, i get the right result! (yes i'm using %)
[21 Aug 2007 17:26] Sveta Smirnova
Thank you for the feedback.

Please porvide output of SHOW CREATE TABLE anydb; SHO VARIABLES LIKE 'char%' and SHOW VARIABLES LIKE 'coll%'; (I want to check all 3 collation values)
[22 Aug 2007 14:06] Martin Gråtrud
This is my result:

Variable_name  Value  
collation_connection utf8_unicode_ci 
collation_database utf8_unicode_ci 
collation_server latin1_swedish_ci
[23 Aug 2007 7:56] Sveta Smirnova
Thank you for the feedback.

Please also provide output of SHOW CREATE TABLE anydb and SHOW VARIABLES LIKE 'char%'
[23 Aug 2007 8:24] Martin Gråtrud
character_set_client utf8 
      character_set_connection utf8 
      character_set_database utf8 
      character_set_filesystem binary 
      character_set_results utf8 
      character_set_server latin1 
      character_set_system utf8 
      character_sets_dir c:\wamp\mysql\share\charsets\
[23 Aug 2007 8:48] Sveta Smirnova
Thank you for the feedback.

Please also provide output of SHOW CREATE TABLE anydb
[23 Aug 2007 9:12] Martin Gråtrud
CREATE TABLE `recipients` (\n  `rid` int(99) NOT NULL auto_increment,\n  `fname` varchar(30) collate utf8_unicode_ci NOT NULL,\n  `lname` varchar(30) collate utf8_unicode_ci NOT NULL,\n  `sms` varchar(15) collate utf8_unicode_ci NOT NULL,\n  `email` varchar(40) collate utf8_unicode_ci NOT NULL,\n  `countrycode` varchar(5) collate utf8_unicode_ci NOT NULL,\n  `groupid` int(1) NOT NULL,\n  PRIMARY KEY  (`rid`)\n) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
[23 Aug 2007 20:50] Sveta Smirnova
Thank you for the report.

I can not repeat described behaviour:

mysqldev@win2003-x86:~/sveta/mysql-5.0.45-win32> bin/mysql -uroot -P3350 test <bug30500.test 
Variable_name   Value
collation_connection    utf8_unicode_ci
collation_database      utf8_unicode_ci
collation_server        latin1_swedish_ci
Variable_name   Value
character_set_client    utf8
character_set_connection        utf8
character_set_database  utf8
character_set_filesystem        binary
character_set_results   utf8
character_set_server    latin1
character_set_system    utf8
character_sets_dir      f:\\sveta\\mysql-5.0.45-win32\\share\\charsets\\
rid     fname   lname   sms     email   countrycode     groupid
8       øyvind                                  0
1
1
mysqldev@win2003-x86:~/sveta/mysql-5.0.45-win32> bin/mysql -uroot -P3350 test                

drop table if exists recipients;

CREATE TABLE `recipients` (
 `rid` int(99) NOT NULL auto_increment,
 `fname` varchar(30) collate utf8_unicode_ci NOT NULL,
 `lname` varchar(30) collate utf8_unicode_ci NOT NULL,
 `sms` varchar(15) collate utf8_unicode_ci NOT NULL,
`email` varchar(40) collate utf8_unicode_ci NOT NULL,
 `countrycode` varchar(5) collate utf8_unicode_ci NOT NULL,
 `groupid` int(1) NOT NULL,
 PRIMARY KEY  (`rid`)
)
ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

insert into recipients (fname) values ('??yvind');

select * from recipients where fname like '%??yv%';

select 1;
\q
rid     fname   lname   sms     email   countrycode     groupid
8       ??yvind                                 0
1
1

Your problem looks like character set of Windows cmd.exe which doesn't support UTF8
[24 Aug 2007 8:37] Martin Gråtrud
In prodoction we run this on freeBSD, and same problem occurs ther!