Bug #24690 | Stored functions: RETURNing UTF8 strings do not return UTF8_UNICODE_CI collation | ||
---|---|---|---|
Submitted: | 29 Nov 2006 12:54 | Modified: | 1 Jun 2010 8:05 |
Reporter: | Eric Jay | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Stored Routines | Severity: | S3 (Non-critical) |
Version: | 5.0, 5.1 | OS: | Linux (Linux, Windows) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
Tags: | bfsm_2007_01_18, bfsm_2007_02_01, collation, stored function, utf8_unicode_ci |
[29 Nov 2006 12:54]
Eric Jay
[29 Nov 2006 13:04]
Eric Jay
here is the result I get on windows, 5.1.12-beta-community-nt-log; same happens on enterprise linux version 5.0.28 -------------- select version() -------------- version() 5.1.12-beta-community-nt-log -------------- SET collation_connection=utf8_unicode_ci -------------- -------------- SHOW VARIABLES LIKE '%character%' -------------- Variable_name Value character_set_client utf8 character_set_connection utf8 character_set_database utf8 character_set_filesystem binary character_set_results utf8 character_set_server utf8 character_set_system utf8 character_sets_dir C:\\Program Files\\MySQL\\MySQL Server 5.1\\share\\charsets\\ -------------- SHOW VARIABLES LIKE '%collation%' -------------- Variable_name Value collation_connection utf8_unicode_ci collation_database utf8_unicode_ci collation_server utf8_unicode_ci -------------- DROP DATABASE IF EXISTS mysqltest1 -------------- -------------- CREATE DATABASE mysqltest1 DEFAULT CHARACTER SET utf8 COLLATE UTF8_UNICODE_CI -------------- -------------- SHOW CREATE DATABASE mysqltest1 -------------- Database Create Database mysqltest1 CREATE DATABASE `mysqltest1` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */ -------------- CREATE FUNCTION f1() RETURNS CHAR(10) CHARSET UTF8 DETERMINISTIC RETURN "abcd" collate utf8_unicode_ci -------------- -------------- SHOW CREATE FUNCTION f1 -------------- Function sql_mode Create Function f1 CREATE DEFINER=`root`@`localhost` FUNCTION `f1`() RETURNS char(10) CHARSET utf8\n DETERMINISTIC\nRETURN "abcd" collate utf8_unicode_ci -------------- SELECT dtd_identifier FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = "mysqltest1" AND ROUTINE_NAME = "f1" -------------- dtd_identifier char(10) CHARSET utf8 -------------- SELECT CHARSET(f1()), COLLATION(f1()) -------------- CHARSET(f1()) COLLATION(f1()) utf8 utf8_general_ci -------------- select collation(f1()) as test1 -------------- test1 utf8_general_ci -------------- select collation(concat("qwerty",f1())) as test2 -------------- test2 utf8_bin -------------- select collation(concat("qwerty" collate utf8_unicode_ci,f1())) as test3 -------------- test3 utf8_unicode_ci -------------- DROP DATABASE mysqltest1
[29 Nov 2006 13:09]
Eric Jay
on version 5.1.9 results where more consistent (see below); things apparently went wrong after fixing bugs 16676 and 16211 -------------- select version() -------------- version() 5.1.9-beta-log -------------- SET collation_connection=utf8_unicode_ci -------------- -------------- SHOW VARIABLES LIKE '%character%' -------------- Variable_name Value character_set_client utf8 character_set_connection utf8 character_set_database utf8 character_set_filesystem binary character_set_results utf8 character_set_server utf8 character_set_system utf8 character_sets_dir /usr/bin/mysql/share/mysql/charsets/ -------------- SHOW VARIABLES LIKE '%collation%' -------------- Variable_name Value collation_connection utf8_unicode_ci collation_database utf8_unicode_ci collation_server utf8_unicode_ci -------------- DROP DATABASE IF EXISTS mysqltest1 -------------- -------------- CREATE DATABASE mysqltest1 DEFAULT CHARACTER SET utf8 COLLATE UTF8_UNICODE_CI -------------- -------------- SHOW CREATE DATABASE mysqltest1 -------------- Database Create Database mysqltest1 CREATE DATABASE `mysqltest1` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */ -------------- CREATE FUNCTION f1() RETURNS CHAR(10) CHARSET UTF8 DETERMINISTIC RETURN "abcd" collate utf8_unicode_ci -------------- -------------- SHOW CREATE FUNCTION f1 -------------- Function sql_mode Create Function f1 CREATE DEFINER=`root`@`%` FUNCTION `f1`() RETURNS char(10)\n DETERMINISTIC\nRETURN "abcd" collate utf8_unicode_ci -------------- SELECT dtd_identifier FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = "mysqltest1" AND ROUTINE_NAME = "f1" -------------- dtd_identifier char(10) -------------- SELECT CHARSET(f1()), COLLATION(f1()) -------------- CHARSET(f1()) COLLATION(f1()) utf8 utf8_unicode_ci -------------- select collation(f1()) as test1 -------------- test1 utf8_unicode_ci -------------- select collation(concat("qwerty",f1())) as test2 -------------- test2 utf8_unicode_ci -------------- select collation(concat("qwerty" collate utf8_unicode_ci,f1())) as test3 -------------- test3 utf8_unicode_ci -------------- DROP DATABASE mysqltest1 --------------
[1 Dec 2006 20:44]
Sveta Smirnova
Thank you for the report. According to http://dev.mysql.com/doc/refman/5.0/en/create-procedure.html I mark this report as feature request. ----<START QUOTE>---- For character data types, if there is a CHARACTER SET clause in the declaration, the specified character set and its default collation are used. If there is no such clause, the database character set and collation that are in effect at the time the routine is created are used. (These are given by the values of the character_set_database and collation_database system variables.) The COLLATE attribute is not supported. (This includes use of BINARY, because in this context BINARY specifies the binary collation of the character set.) ----<END QUOTE>----
[1 Dec 2006 20:46]
Sveta Smirnova
See also Aug, 15 comment by Alexander Nozdrin to bug #20937
[4 Dec 2006 9:56]
Eric Jay
there still is a discrepency between the documentation and the reality documentation says "If there is no such clause, the database character set and collation that are in effect at the time the routine is created are used." the example below shows the documentation not being correct or the code not working as documented this example creates a function where character set is utf8 and collation is utf8_unicode_ci at the time the function is created; using this function shows the function does not return utf8_unicode_ci but utf8_general_ci: DELIMITER | select version()| SET collation_connection=utf8_unicode_ci| SHOW VARIABLES LIKE '%character%'| SHOW VARIABLES LIKE '%collation%'| DROP DATABASE IF EXISTS mysqltest2| CREATE DATABASE mysqltest2 DEFAULT CHARACTER SET utf8 COLLATE UTF8_UNICODE_CI| SHOW CREATE DATABASE mysqltest2| use mysqltest2| -- - Create one stored function -- no explicit CHARSET-clause -- for return value; CREATE FUNCTION f2() RETURNS CHAR(10) DETERMINISTIC RETURN "abcd" | -- - Check what CHARSET-clause was unterstood; SHOW CREATE FUNCTION f2| SELECT dtd_identifier FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = "mysqltest2" AND ROUTINE_NAME = "f2"| -- - See what collation is understood; if 'utf8_general_ci' is shown it is not as documented select 'if utf8_general_ci is shown it is not as documented' as comment, collation(f2()) as test1| select 'if utf8_bin is shown is it as documented?' as comment,collation(concat("qwerty",f2())) as test2| select 'if utf8_unicode_ci is shown why not in test2 above?' as comment, collation(concat("qwerty" collate utf8_unicode_ci,f2())) as test3| -- Cleanup. use test| DROP DATABASE mysqltest2| the result is as follows: -------------- select version() -------------- version() 5.1.12-beta-community-nt-log -------------- SET collation_connection=utf8_unicode_ci -------------- -------------- SHOW VARIABLES LIKE '%character%' -------------- Variable_name Value character_set_client utf8 character_set_connection utf8 character_set_database utf8 character_set_filesystem binary character_set_results utf8 character_set_server utf8 character_set_system utf8 character_sets_dir C:\\Program Files\\MySQL\\MySQL Server 5.1\\share\\charsets\\ -------------- SHOW VARIABLES LIKE '%collation%' -------------- Variable_name Value collation_connection utf8_unicode_ci collation_database utf8_unicode_ci collation_server utf8_unicode_ci -------------- DROP DATABASE IF EXISTS mysqltest2 -------------- -------------- CREATE DATABASE mysqltest2 DEFAULT CHARACTER SET utf8 COLLATE UTF8_UNICODE_CI -------------- -------------- SHOW CREATE DATABASE mysqltest2 -------------- Database Create Database mysqltest2 CREATE DATABASE `mysqltest2` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */ -------------- CREATE FUNCTION f2() RETURNS CHAR(10) DETERMINISTIC RETURN "abcd" -------------- -------------- SHOW CREATE FUNCTION f2 -------------- Function sql_mode Create Function f2 CREATE DEFINER=`root`@`localhost` FUNCTION `f2`() RETURNS char(10) CHARSET utf8\n DETERMINISTIC\nRETURN "abcd" -------------- SELECT dtd_identifier FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_SCHEMA = "mysqltest2" AND ROUTINE_NAME = "f2" -------------- dtd_identifier char(10) CHARSET utf8 -------------- select 'if utf8_general_ci is shown it is not as documented' as comment, collation(f2()) as test1 -------------- comment test1 if utf8_general_ci is shown it is not as documented utf8_general_ci -------------- select 'if utf8_bin is shown is it as documented?' as comment,collation(concat("qwerty",f2())) as test2 -------------- comment test2 if utf8_bin is shown is it as documented? utf8_bin -------------- select 'if utf8_unicode_ci is shown why not in test2 above?' as comment, collation(concat("qwerty" collate utf8_unicode_ci,f2())) as test3 -------------- comment test3 if utf8_unicode_ci is shown why not in test2 above? utf8_unicode_ci -------------- DROP DATABASE mysqltest2 -------------- are we talking about a feature request?, a bug in code?, a bug in documentation?
[4 Dec 2006 19:52]
Sveta Smirnova
Thank you for explanation. Seems to documentation be not clear about using collations. So I'll mark it as documentation bug.
[29 Dec 2008 5:40]
MySQL Verification Team
This also affects functions using latin1/latin1_general_cs.
[28 Jan 2009 15:35]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/64334 2983 Alexander Barkov 2009-01-28 # # Bug#24690 Stored functions: RETURNing UTF8 strings # do not return UTF8_UNICODE_CI collation # mysql-test/r/sp-ucs2.result: mysql-test/t/sp-ucs2.test: Adding tests sql/mysql_priv.h: Adding prototype sql/sp.cc Remember COLLATE clause for non-default collations sql/sql_parse.cc Adding new helper function sql/sql_yacc.yy - Allow "CHARACTER SET cs COLLATE cl" in SP parameters, RETURNS, DECLARE - Minor reorganization for "ASCII" and "UNICODE" related rules, to make the code more readable, also to allow these aliases: * "VARCHAR(10) ASCII BINARY" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) BINARY ASCII" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) UNICODE BINARY" -> CHARACTER SET ucs2 COLLATE ucs2_bin * "VARCHAR(10) BINARY UNICODE" -> CHARACTER SET ucs2 COLLATE ucs2_bin Previously these four aliases returned the error "This version of MySQL does not yet support return value collation". Note: This patch allows "VARCHAR(10) CHARACTER SET cs COLLATE cl" and the above four aliases. "VARCHAR(10) COLLATE cl" is still not allowed i.e. when COLLATE is given without CHARACTER SET. If we want to support this, we need an architecture decision which character set to use by default.
[6 Apr 2009 13:05]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/71450 3183 Alexander Barkov 2009-04-06 # # Bug#24690 Stored functions: RETURNing UTF8 strings # do not return UTF8_UNICODE_CI collation # mysql-test/r/sp-ucs2.result: mysql-test/t/sp-ucs2.test: Adding tests sql/mysql_priv.h: Adding prototype sql/sp.cc Remember COLLATE clause for non-default collations sql/sql_parse.cc Adding new helper function sql/sql_yacc.yy - Allow "CHARACTER SET cs COLLATE cl" in SP parameters, RETURNS, DECLARE - Minor reorganization for "ASCII" and "UNICODE" related rules, to make the code more readable, also to allow these aliases: * "VARCHAR(10) ASCII BINARY" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) BINARY ASCII" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) UNICODE BINARY" -> CHARACTER SET ucs2 COLLATE ucs2_bin * "VARCHAR(10) BINARY UNICODE" -> CHARACTER SET ucs2 COLLATE ucs2_bin Previously these four aliases returned the error "This version of MySQL does not yet support return value collation". Note: This patch allows "VARCHAR(10) CHARACTER SET cs COLLATE cl" and the above four aliases. "VARCHAR(10) COLLATE cl" is still not allowed i.e. when COLLATE is given without CHARACTER SET. If we want to support this, we need an architecture decision which character set to use by default.
[6 Apr 2009 13:06]
Alexander Barkov
Pushed into mysql-6.0-bugteam, currently marked as 6.0.11.
[8 Apr 2009 10:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/71631 3191 Alexander Barkov 2009-04-08 Post-fix for bug#24690 "Stored functions: RETURNing UTF8 strings do not return UTF8_UNICODE_CI collation" There was a mistake in the test file (wrong delimiter) which made "mysql-test-run --ps" fails. Fixing the delimiter. modified: mysql-test/r/sp-ucs2.result mysql-test/t/sp-ucs2.test
[6 May 2009 14:10]
Bugs System
Pushed into 6.0.12-alpha (revid:svoj@sun.com-20090506125450-yokcmvqf2g7jhujq) (version source revid:bar@mysql.com-20090408103406-7f9pswd9brptolzl) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 7:59]
MySQL Verification Team
Hi Bar, Do you have a plan to fix this bug on 5.1? This bug is impacting a customer. Kind regards, Mikiya
[15 May 2009 8:39]
Konstantin Osipov
Mikiya, if you beliefe the bug should be fixed in an earlier version, please request a re-triage.
[26 May 2009 19:52]
Paul DuBois
Noted in 6.0.12 changelog. Previously, the server handled character data types for a routine parameter, local routine variable created with DECLARE, or function return value as follows: If there was no CHARACTER SET attribute, the database character set and its default collation were used. If the CHARACTER SET attribute was present, the COLLATE attribute was not supported, so the character set's default collation was used. (This includes use of BINARY, because in this context BINARY specifies the binary collation of the character set.) Now for character data types, if there is a CHARACTER SET attribute in the declaration, the specified character set and its default collation is used. If the COLLATE is also present, that collation is used rather than the default collation. If there is no CHARACTER SET attribute, the database character set and collation in effect at routine creation time are used. (The database character set and collation are given by the value of the character_set_database and collation_database system variables.) Also updated http://dev.mysql.com/doc/refman/6.0/en/create-procedure.html accordingly.
[4 Jun 2009 22:10]
Carsten Brandt
Is there a workaround for using functions with utf8_unicode_ci in Verion 5.1 or 5.4? Could be something like changing the default collation of utf8 to utf8_unicode_ci for the whole server or tell him not to use utf8_general_ci at all. I did not find any information about that in the doku and also google didn't help me so far...
[5 Jun 2009 9:37]
Carsten Brandt
I found one, I tested on version 5.4 but should work on other versions too. You have to CAST the value the function returns everytime you call it: CAST(func_my_function(1, 'test') AS CHAR CHARACTER SET utf8) COLLATE utf8_unicode_ci; http://dev.mysql.com/doc/refman/5.4/en/charset-convert.html Hopefull this helps someone to easy find a solution, it took 3 full workdays for me to find the problem and solve it :-)
[24 Jul 2009 11:39]
Konstantin Osipov
This bug was fixed.
[28 Oct 2009 9:26]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/88462 2907 Alexander Barkov 2009-10-28 # # Bug#24690 Stored functions: RETURNing UTF8 strings # do not return UTF8_UNICODE_CI collation # # Bug#17903: cast to char results in binary # Regression. The character set was not being properly initialized # for CAST() with a type like CHAR(2) BINARY, which resulted in # incorrect results or even a server crash. # Backporting from mysql-6.0-codebase. mysql-test/r/sp-ucs2.result: mysql-test/t/sp-ucs2.test: Adding tests sql/mysql_priv.h: Adding prototype sql/sp.cc Remember COLLATE clause for non-default collations sql/sql_parse.cc Adding a new helper function sql/sql_yacc.yy - Allow "CHARACTER SET cs COLLATE cl" in SP parameters, RETURNS, DECLARE - Minor reorganization for "ASCII" and "UNICODE" related rules, to make the code more readable, also to allow these aliases: * "VARCHAR(10) ASCII BINARY" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) BINARY ASCII" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) UNICODE BINARY" -> CHARACTER SET ucs2 COLLATE ucs2_bin * "VARCHAR(10) BINARY UNICODE" -> CHARACTER SET ucs2 COLLATE ucs2_bin Previously these four aliases returned the error "This version of MySQL does not yet support return value collation". Note: This patch allows "VARCHAR(10) CHARACTER SET cs COLLATE cl" and the above four aliases. "VARCHAR(10) COLLATE cl" is still not allowed i.e. when COLLATE is given without CHARACTER SET. If we want to support this, we need an architecture decision which character set to use by default.
[30 Oct 2009 5:01]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/88668 2912 Alexander Barkov 2009-10-30 # # Bug#24690 Stored functions: RETURNing UTF8 strings # do not return UTF8_UNICODE_CI collation # # Bug#17903: cast to char results in binary # Regression. The character set was not being properly initialized # for CAST() with a type like CHAR(2) BINARY, which resulted in # incorrect results or even a server crash. # Backporting from mysql-6.0-codebase. mysql-test/r/sp-ucs2.result: mysql-test/t/sp-ucs2.test: Adding tests sql/mysql_priv.h: Adding prototype sql/sp.cc Remember COLLATE clause for non-default collations sql/sql_parse.cc Adding a new helper function sql/sql_yacc.yy - Allow "CHARACTER SET cs COLLATE cl" in SP parameters, RETURNS, DECLARE - Minor reorganization for "ASCII" and "UNICODE" related rules, to make the code more readable, also to allow these aliases: * "VARCHAR(10) ASCII BINARY" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) BINARY ASCII" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) UNICODE BINARY" -> CHARACTER SET ucs2 COLLATE ucs2_bin * "VARCHAR(10) BINARY UNICODE" -> CHARACTER SET ucs2 COLLATE ucs2_bin Previously these four aliases returned the error "This version of MySQL does not yet support return value collation". Note: This patch allows "VARCHAR(10) CHARACTER SET cs COLLATE cl" and the above four aliases. "VARCHAR(10) COLLATE cl" is still not allowed i.e. when COLLATE is given without CHARACTER SET. If we want to support this, we need an architecture decision which character set to use by default.
[30 Oct 2009 5:08]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/88671 3654 Alexander Barkov 2009-10-30 [merge] Null-merging Bug#24690 from mysql-next-mr-bar
[30 Oct 2009 14:40]
Paul DuBois
This is in 5.4.4.
[9 Nov 2009 11:21]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/89772 2943 Alexander Barkov 2009-11-09 # # Bug#24690 Stored functions: RETURNing UTF8 strings # do not return UTF8_UNICODE_CI collation # # Bug#17903: cast to char results in binary # Regression. The character set was not being properly initialized # for CAST() with a type like CHAR(2) BINARY, which resulted in # incorrect results or even a server crash. # Backporting from mysql-6.0-codebase. mysql-test/r/sp-ucs2.result: mysql-test/t/sp-ucs2.test: Adding tests sql/mysql_priv.h: Adding prototype sql/sp.cc Remember COLLATE clause for non-default collations sql/sql_parse.cc Adding a new helper function sql/sql_yacc.yy - Allow "CHARACTER SET cs COLLATE cl" in SP parameters, RETURNS, DECLARE - Minor reorganization for "ASCII" and "UNICODE" related rules, to make the code more readable, also to allow these aliases: * "VARCHAR(10) ASCII BINARY" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) BINARY ASCII" -> CHARACTER SET latin1 COLLATE latin1_bin * "VARCHAR(10) UNICODE BINARY" -> CHARACTER SET ucs2 COLLATE ucs2_bin * "VARCHAR(10) BINARY UNICODE" -> CHARACTER SET ucs2 COLLATE ucs2_bin Previously these four aliases returned the error "This version of MySQL does not yet support return value collation". Note: This patch allows "VARCHAR(10) CHARACTER SET cs COLLATE cl" and the above four aliases. "VARCHAR(10) COLLATE cl" is still not allowed i.e. when COLLATE is given without CHARACTER SET. If we want to support this, we need an architecture decision which character set to use by default.
[9 Nov 2009 11:34]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/89775 3702 Alexander Barkov 2009-11-09 [merge] Null-merging Bug#24690 Bug#17903
[20 Nov 2009 12:57]
Bugs System
Pushed into 6.0.14-alpha (revid:kostja@sun.com-20091120124947-yi6h2jbgw0kbciwm) (version source revid:epotemkin@mysql.com-20091109132131-ad1gk2d2tn9o5i3l) (merge vers: 6.0.14-alpha) (pib:13)
[23 Nov 2009 16:44]
Paul DuBois
Already fixed in 6.0.x. Setting report to NDI pending push to 5.6.x.
[11 Dec 2009 6:03]
Bugs System
Pushed into 5.6.0-beta (revid:alik@sun.com-20091211055628-ltr7fero363uev7r) (version source revid:alik@sun.com-20091211055453-717czhtezc74u8db) (merge vers: 5.6.0-beta) (pib:13)
[11 Dec 2009 19:30]
Paul DuBois
Noted in 5.6.0 changelog.
[6 Mar 2010 11:01]
Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20091211201717-03qf8ckwiw0np80p) (merge vers: 5.6.0-beta) (pib:16)
[7 Mar 2010 18:48]
Paul DuBois
Moved 5.6.0 changelog entry to 5.5.3.