Bug #2340 | TCASE function ie Title Case Function | ||
---|---|---|---|
Submitted: | 9 Jan 2004 15:44 | Modified: | 27 Nov 2005 12:39 |
Reporter: | Yurais Fernandez Leal | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: DML | Severity: | S4 (Feature request) |
Version: | OS: | Any | |
Assigned to: | CPU Architecture: | Any |
[9 Jan 2004 15:44]
Yurais Fernandez Leal
[27 Nov 2005 12:39]
Valeriy Kravchuk
Thank you for a reasonable feature request. Something like your TCASE (or something similar to Oracle's INITCAP) will be really useful.
[15 Feb 2006 14:50]
Corey Furman
I searched the manual for just this exact function! Let me add to it: Take a 3rd parameter of type int to represent how many times it should be repeated. Therefore: SELECT TCASE(" ", "AMERICAN BOOKING STORES", 1); would produce: American booking stores Not so useful in this particular example, but adding the third param and setting it to 1 would yield the "sentence case" function.
[20 Apr 2007 10:50]
Ondra Zizka
I think MySQL is a RDBMS and not text processing engine, thus I would do this task at application's business level... Not like LASTINDEXOF(), which would be handy for selecting and grouping data.
[30 Mar 2008 15:14]
Conchur Dickinson
The following MySQL function will convert a string of up to 10 words into title case. This is very useful when batch cleaning data (e.g. when importing names from other databases), so I disagree that this is purely an application level function. This function will Title Case the first 10 words it finds in a string. The remainder of the string is discarded (you have been warned!). <pre> CREATE FUNCTION `ToTitleCase`(name VARCHAR(200)) RETURNS varchar(200) CHARSET latin1 RETURN TRIM( CONCAT_WS(' ', CONCAT(UPPER(LEFT(SUBSTRING_INDEX(name, ' ',1),1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',1),2))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',2),LENGTH(SUBSTRING_INDEX(name, ' ',1)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',2),3 + LENGTH(SUBSTRING_INDEX(name, ' ',1))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',3),LENGTH(SUBSTRING_INDEX(name, ' ',2)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',3),3 + LENGTH(SUBSTRING_INDEX(name, ' ',2))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',4),LENGTH(SUBSTRING_INDEX(name, ' ',3)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',4),3 + LENGTH(SUBSTRING_INDEX(name, ' ',3))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',5),LENGTH(SUBSTRING_INDEX(name, ' ',4)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',5),3 + LENGTH(SUBSTRING_INDEX(name, ' ',4))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',6),LENGTH(SUBSTRING_INDEX(name, ' ',5)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',6),3 + LENGTH(SUBSTRING_INDEX(name, ' ',5))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',7),LENGTH(SUBSTRING_INDEX(name, ' ',6)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',7),3 + LENGTH(SUBSTRING_INDEX(name, ' ',6))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',8),LENGTH(SUBSTRING_INDEX(name, ' ',7)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',8),3 + LENGTH(SUBSTRING_INDEX(name, ' ',7))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',9),LENGTH(SUBSTRING_INDEX(name, ' ',8)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',9),3 + LENGTH(SUBSTRING_INDEX(name, ' ',8))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',10),LENGTH(SUBSTRING_INDEX(name, ' ',9)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',10),3 + LENGTH(SUBSTRING_INDEX(name, ' ',9))))), CONCAT(UPPER(MID(SUBSTRING_INDEX(name, ' ',11),LENGTH(SUBSTRING_INDEX(name, ' ',10)) + 2, 1)), LOWER(MID(SUBSTRING_INDEX(name, ' ',11),3 + LENGTH(SUBSTRING_INDEX(name, ' ',10))))) )); </pre>
[30 Mar 2008 16:58]
Conchur Dickinson
An improved title case method is quoted below. This one uses a helper function 'GetWord' to return the Nth word of a string. The main function accepts 3 parameters: the string to alter the case of, the number of words to title case (counting from the start) and the minimum length of word to alter (to ignore small words). Again only the first 10 words are altered, but it should be fairly obvious that you can extend this if needed. The string is not truncated this time if there are more than 10 words, the rest is just returned unchanged. CREATE FUNCTION `GetWord`(stringValue VARCHAR(200), wordNumber INT) RETURNS VARCHAR(200) CHARSET latin1 RETURN MID(SUBSTRING_INDEX(stringValue, ' ',wordNumber), IF(wordNumber=1,1,2) + LENGTH(SUBSTRING_INDEX(stringValue, ' ',wordNumber - 1))); CREATE FUNCTION `ToTitleCase`(name VARCHAR(200), maxWords INT, minLength INT) RETURNS varchar(200) CHARSET latin1 RETURN TRIM(IF(LENGTH(name)>=minLength,CONCAT_WS(' ', IF(maxWords>0 AND LENGTH(GetWord(name, 1))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 1),1)), LOWER(MID(GetWord(name, 1),2))),GetWord(name, 1)), IF(maxWords>1 AND LENGTH(GetWord(name, 2))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 2),1)), LOWER(MID(GetWord(name, 2),2))),GetWord(name, 2)), IF(maxWords>2 AND LENGTH(GetWord(name, 3))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 3),1)), LOWER(MID(GetWord(name, 3),2))),GetWord(name, 3)), IF(maxWords>3 AND LENGTH(GetWord(name, 4))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 4),1)), LOWER(MID(GetWord(name, 4),2))),GetWord(name, 4)), IF(maxWords>4 AND LENGTH(GetWord(name, 5))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 5),1)), LOWER(MID(GetWord(name, 5),2))),GetWord(name, 5)), IF(maxWords>5 AND LENGTH(GetWord(name, 6))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 6),1)), LOWER(MID(GetWord(name, 6),2))),GetWord(name, 6)), IF(maxWords>6 AND LENGTH(GetWord(name, 7))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 7),1)), LOWER(MID(GetWord(name, 7),2))),GetWord(name, 7)), IF(maxWords>7 AND LENGTH(GetWord(name, 8))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 8),1)), LOWER(MID(GetWord(name, 8),2))),GetWord(name, 8)), IF(maxWords>8 AND LENGTH(GetWord(name, 9))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 9),1)), LOWER(MID(GetWord(name, 9),2))),GetWord(name, 9)), IF(maxWords>9 AND LENGTH(GetWord(name, 10))>=minLength, CONCAT(UPPER(LEFT(GetWord(name, 10),1)), LOWER(MID(GetWord(name, 10),2))),GetWord(name, 10)), IF(LENGTH(GetWord(name, 11))>0, MID(name, LENGTH(SUBSTRING_INDEX(name, ' ', 10)) + 1), NULL) ),name));
[19 Jun 2008 20:12]
Harrison Fisk
A quick SQL function which can handle any amount of words in a string is: CREATE FUNCTION pCase (str TEXT) RETURNS text DETERMINISTIC BEGIN DECLARE result TEXT default ''; DECLARE space INT default 0; DECLARE last_space INT default 0; # handle NULL IF (str IS NULL) THEN RETURN NULL; END IF; # if 0 length string given IF (char_length(str) = 0) THEN RETURN ''; END IF; # upper case the first letter SET result = upper(left(str,1)); SET space = locate(' ', str); # loop through remaining spaces WHILE space > 0 DO # add everything up to that space SET result = CONCAT(result, SUBSTRING(str, last_space+2, space-last_space-1)); # upper case the letter after the found space SET result = CONCAT(result, UPPER(SUBSTRING(str, space+1, 1))); # find next space SET last_space = space; SET space = locate(' ', str, space+2); END WHILE; # add final section SET result = CONCAT(result, SUBSTRING(str, last_space+2)); RETURN result; END;
[19 Jun 2008 20:13]
Harrison Fisk
There is also a UDF which can do this at: http://www.mysqludf.org/lib_mysqludf_str/index.php#str_ucwords
[14 Jul 2008 8:55]
Neil Kay
Harrison's function misses out a couple of important 'lower's... # add everything up to that space SET result = CONCAT(result, LOWER(SUBSTRING(str, last_space+2, space-last_space-1))); [and] # add final section SET result = CONCAT(result, LOWER(SUBSTRING(str, last_space+2)));
[24 Oct 2008 23:34]
M Azul
create function tcase(str text) returns text begin declare result text default ''; declare i int default 1; if(str is null) then return null; end if; while(i <= length(str)) do if(i = 1) or ((i > 1) and ((lower(substring(result, i - 1, 1) < 'a')) or (lower(substring(result, i - 1, 1) > 'z')))) then set result = concat(result, upper(substr(str, i, 1))); else set result = concat(result, lower(substr(str, i, 1))); end if; set i = i + 1; end while; return result; end;
[9 Dec 2008 11:24]
Steve H
Here's a function that allows you to set a word boundary list. See the word_boudary variable definition. I will capitalise after space, parenthesis etc. Thanks go to the code snippets above for the help; DELIMITER $$ DROP FUNCTION IF EXISTS `pCase` $$ CREATE FUNCTION `pCase` (str TEXT) RETURNS text DETERMINISTIC BEGIN DECLARE result TEXT default ''; DECLARE working_char TEXT default ''; DECLARE last_space INT default 0; DECLARE word_boundary TEXT default ' (){}[]-'; #Used to decide when a new word begins. Add you boundries here!! DECLARE i INT default 1; #Loop counter DECLARE found_boundary INT default 1; #When we find a boundary set to 1 (True) else 0 (False). Set to 1 so first word is capitalised. # handle NULL IF (str IS NULL) THEN RETURN NULL; END IF; # if 0 length string given IF (char_length(str) = 0) THEN RETURN ''; END IF; # upper case the first letter SET str = lower(str); #SET result = upper(left(str,1)); #SET space = locate(' ', str); # loop through each letter looking for a word boundary WHILE(i <= (length(str)+1)) DO #Set our working charater SET working_char=SUBSTRING(str, i-1, 1); #Find a word boundary IF(locate(working_char, word_boundary)>0) THEN SET found_boundary=1; #Set the boundary flag, then ignore SET result=CONCAT(result, working_char); ELSE IF(found_boundary=1) THEN SET result = CONCAT(result, UPPER(working_char)); #After a boundary so upper case SET found_boundary=0; #Reset the word boundary flag ELSE SET result = CONCAT(result, working_char); #No word boundary so keep as lower case END IF; END IF; SET i=i+1; END WHILE; RETURN result; END $$ DELIMITER ;
[19 Oct 2010 12:43]
Valeriy Kravchuk
Bug #57553 was marked as a duplicate of this one.
[16 Nov 2010 19:45]
Peter Gulutzan
This is a bit like Oracle 11g's NLS_INITCAP http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions097.htm#i89841 and DB2's INITCAP http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.sql.... As the comments show, anybody can write his/her own routine. For example, the function suggested in "[25 Oct 2008 1:34] M Azul" works. It's flawed ... It's not "title case" it's "upper case". It finds word boundaries by asking if the previous character is "between a and z". It depends on default character set / collation of the function's database. It always returns TEXT regardless of input data type. But in many English applications those matters are insignificant.
[5 Jan 2011 15:49]
Steve Howie
Improved code allowing word boundaries and a list of forced upper case words (e.g. USA). SELECT pCase("Born in the uk not the usa with dvds for friends"); produces "Born In The UK Not The USA With DVDs For Friends". DELIMITER $$ DROP FUNCTION IF EXISTS `pCase` $$ CREATE DEFINER=`root`@`localhost` FUNCTION `pCase`(str TEXT) RETURNS text CHARSET latin1 DETERMINISTIC BEGIN DECLARE result TEXT default ''; DECLARE word TEXT default ''; DECLARE working_char TEXT default ''; DECLARE last_space INT default 0; DECLARE word_boundry TEXT default ' (){}[]-/'; #Used to decide when a new word begins. Add you boundries here!! DECLARE ucase_words TEXT default 'UK,USA,DVD,DVDs,GB'; #To make some words uppercase use ucase_words below. The text entered here is used so dvds becomes DVDs. DECLARE i INT default 1; #Loop counter DECLARE found_boundry INT default 1; #When we find a boundry set to 1 (True) else 0 (False) # handle NULL IF (str IS NULL) THEN RETURN NULL; END IF; # if 0 length string given IF (char_length(str) = 0) THEN RETURN ''; END IF; SET str = lower(str); # loop through each letter looking for a word boundry WHILE(i <= (length(str)+1)) DO #Set our working charater SET working_char=SUBSTRING(str, i-1, 1); #Find a word boundry IF(locate(working_char, word_boundry)>0) THEN #Check if last word was in our uppercase list, using the example in the list to allow dvds to become DVDs IF(locate(word, ucase_words)>0) THEN SET result=CONCAT(LEFT(result,(LENGTH(result)-LENGTH(word))),MID(ucase_words,locate(word, ucase_words),LENGTH(word))); END IF; SET found_boundry=1; #Set the boundry flag, then ignore SET result=CONCAT(result, working_char); SET word=''; #Reset word ELSE SET word=CONCAT(word, working_char); IF(found_boundry=1) THEN SET result = CONCAT(result, UPPER(working_char)); #After a boundry so upper case SET found_boundry=0; ELSE SET result = CONCAT(result, working_char); END IF; END IF; SET i=i+1; END WHILE; #Check if last word was in our uppercase list IF(locate(word, ucase_words)>0) THEN SET result=CONCAT(LEFT(result,(LENGTH(result)-LENGTH(word))),MID(ucase_words,locate(word, ucase_words),LENGTH(word))); END IF; RETURN result; END $$ DELIMITER ;