Bug #8734 more fine grained control over comparisons and sorting
Submitted: 23 Feb 2005 16:20
Modified: 17 May 2005 21:08
Reporter: Matthew Lord
Status: Verified
Category: MySQL Server: Charsets
Severity: S4 (Feature request)
Version: 4.1+
OS: Any (all)
Assigned to:
CPU Architecture: Any

[23 Feb 2005 16:20] Matthew Lord
Description:
I would like finer-grained control over how searching is done, so that I can get accent-sensitive comparisons and sorting when I want them while still doing case-insensitive searches.

Some examples are Google and AltaVista:
http://144.16.72.189/is213/tutor-barker-altavista.htm

Basically if you search for "video" you get an accent insensitive 
search, but if you search for "videó" you get an accent sensitive 
search.

It appears Google is always accent sensitive in English, but follows 
language rules if you have changed languages:
http://www.googleguide.com/interpreting_queries.html
Says:
Search all [English] pages: [ Martín ] matches "Martín" but not "Martin"
Search Spanish pages: [ Martín ] matches "Martín" and "Martin"

MS SQL Server provides "*_AI" and "*_AS" collations for accent-insensitive and accent-sensitive comparisons, respectively.
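
For comparison, here is a rough sketch of the closest options MySQL appears to offer today. It assumes a utf8 connection character set and uses the built-in utf8_general_ci and utf8_bin collations. The LOWER() approach is only a possible workaround for plain comparisons, not the requested feature: it does not apply to MATCH ... AGAINST and is unlikely to use an index on the column.

```sql
-- Assumes the connection character set is utf8, so the literals are utf8 strings.

-- utf8_general_ci ignores both case and accents (roughly an "_AI" collation).
SELECT 'Martín' = 'martin' COLLATE utf8_general_ci AS ci_ai;

-- utf8_bin compares code points, so it is sensitive to both case and accents.
SELECT 'Martín' = 'martin' COLLATE utf8_bin AS cs_as;

-- Possible workaround for case-insensitive but accent-sensitive matching:
-- fold case explicitly, then compare with the binary collation.
SELECT LOWER('Martín') = LOWER('MARTÍN') COLLATE utf8_bin AS same_accent,
       LOWER('Martín') = LOWER('martin') COLLATE utf8_bin AS different_accent;
```

What is missing is a single collation that behaves like the last query (case insensitive, accent sensitive) and that can be used for sorting and full-text searches as well.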

How to repeat:
set session character_set_client = utf8;
set session character_set_connection = utf8;
set session character_set_results = utf8;
set session character_set_server = utf8;

DROP DATABASE IF EXISTS select_test;
CREATE DATABASE select_test DEFAULT CHARACTER SET utf8;
USE select_test;

CREATE TABLE terms (
id int unsigned NOT NULL auto_increment,
list_id smallint unsigned NOT NULL,
term TEXT NOT NULL,
PRIMARY KEY(id),
INDEX(list_id, term(19))
) TYPE=MyISAM CHARSET=utf8;

INSERT INTO terms set list_id = 1, term = "testétest";
INSERT INTO terms set list_id = 2, term = "testetest";
INSERT INTO terms set list_id = 3, term = "testètest";

select list_id from terms where match (term) against ('Testetest' in boolean mode);

select list_id from terms where match (term) against ('Testètest' in boolean mode);