| Bug #31107 | SUBSTR produces wrong result if it used within function with utf8 and with TEXT | ||
|---|---|---|---|
| Submitted: | 20 Sep 2007 12:42 | Modified: | 7 Nov 2008 16:19 |
| Reporter: | Victoria Reznichenko | Email Updates: | |
| Status: | Duplicate | Impact on me: | |
| Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
| Version: | 5.0.46 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[15 Oct 2007 8:59]
Alexander Barkov
Victoria,
I cannot reproduce this problem.
This is the test script:
set names utf8;
select version();
DROP FUNCTION IF EXISTS test1;
DELIMITER //
CREATE FUNCTION test1(str TEXT) RETURNS TEXT
BEGIN
DECLARE _rem TEXT CHARACTER SET utf8;
DECLARE _pos INT DEFAULT 2;
SET _rem = str;
SET _pos = 2;
SET _rem = SUBSTR( _rem, _pos + 1 );
RETURN _rem;
END//
DELIMITER ;
SELECT test1('A,B,C,X,Y,Z');
DROP FUNCTION test1;
This is its output with different versions:
version()
5.0.22
test1('A,B,C,X,Y,Z')
B,C,X,Y,Z
version()
5.0.46-debug
test1('A,B,C,X,Y,Z')
B,C,X,Y,Z
version()
5.0.50-debug
test1('A,B,C,X,Y,Z')
B,C,X,Y,Z
[15 Oct 2007 11:10]
MySQL Verification Team
server character set should be utf8 as well:
mysql> set names utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> select version();
+-----------------------+
| version() |
+-----------------------+
| 5.0.46-enterprise-gpl |
+-----------------------+
1 row in set (0.00 sec)
mysql> DROP FUNCTION IF EXISTS test1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> DELIMITER //
mysql> CREATE FUNCTION test1(str TEXT) RETURNS TEXT
-> BEGIN
-> DECLARE _rem TEXT CHARACTER SET utf8;
-> DECLARE _pos INT DEFAULT 2;
-> SET _rem = str;
-> SET _pos = 2;
-> SET _rem = SUBSTR( _rem, _pos + 1 );
-> RETURN _rem;
-> END//
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER ;
mysql> SELECT test1('A,B,C,X,Y,Z');
+----------------------+
| test1('A,B,C,X,Y,Z') |
+----------------------+
| Y,Z |
+----------------------+
1 row in set (0.00 sec)
mysql> DROP FUNCTION test1;
Query OK, 0 rows affected (0.00 sec)
[29 Jul 2008 11:01]
Dave Marshall
I can confirm the same problem. Putting the result of the substring into a different TEXT variable works.
mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT VERSION();
+------------------------+
| VERSION() |
+------------------------+
| 5.0.51a-3ubuntu5.1-log |
+------------------------+
1 row in set (0.05 sec)
mysql> DROP FUNCTION IF EXISTS test1;
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER //
mysql> CREATE FUNCTION test1(str TEXT) RETURNS TEXT NO SQL
-> BEGIN
-> DECLARE _rem TEXT CHARACTER SET utf8;
-> DECLARE _pos INT DEFAULT 2;
-> SET _rem = str;
-> SET _pos = 2;
-> SET _rem = SUBSTR(_rem, _pos + 1);
-> RETURN _rem;
-> END//
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER ;
mysql> SELECT test1('A,B,C,X,Y,Z');
+----------------------+
| test1('A,B,C,X,Y,Z') |
+----------------------+
| Y,Z |
+----------------------+
1 row in set (0.00 sec)
mysql> DROP FUNCTION test1;
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER //
mysql> CREATE FUNCTION test1(str TEXT) RETURNS TEXT NO SQL
-> BEGIN
-> DECLARE _rem TEXT CHARACTER SET utf8;
-> DECLARE _rem_temp TEXT CHARACTER SET utf8;
-> DECLARE _pos INT DEFAULT 2;
-> SET _rem = str;
-> SET _pos = 2;
-> SET _rem_temp = SUBSTR(_rem, _pos + 1);
-> RETURN _rem_temp;
-> END//
Query OK, 0 rows affected (0.00 sec)
mysql> DELIMITER ;
mysql> SELECT test1('A,B,C,X,Y,Z');
+----------------------+
| test1('A,B,C,X,Y,Z') |
+----------------------+
| B,C,X,Y,Z |
+----------------------+
1 row in set (0.00 sec)
mysql> DROP FUNCTION test1;
Query OK, 0 rows affected (0.00 sec)
mysql>
[31 Oct 2008 14:21]
Sergei Glukhov
The bugs is not repeatable on latest main 5.0,5.1,6.0 trees:
----------
+show variables like "character%";
+Variable_name Value
+character_set_client utf8
+character_set_connection utf8
+character_set_database utf8
+character_set_filesystem binary
+character_set_results utf8
+character_set_server utf8
+character_set_system utf8
+character_sets_dir /home/gluh/MySQL/mysql-6.0/sql/share/charsets/
+CREATE FUNCTION test1(str TEXT) RETURNS TEXT NO SQL
+BEGIN
+DECLARE _rem TEXT CHARACTER SET utf8;
+DECLARE _pos INT DEFAULT 2;
+SET _rem = str;
+SET _pos = 2;
+SET _rem = SUBSTR( _rem, _pos + 1 );
+RETURN _rem;
+END//
+SELECT test1('A,B,C,X,Y,Z');
+test1('A,B,C,X,Y,Z')
+B,C,X,Y,Z
+DROP FUNCTION IF EXISTS test1;
+CREATE FUNCTION test1(str TEXT) RETURNS TEXT NO SQL
+BEGIN
+DECLARE _rem TEXT CHARACTER SET utf8;
+DECLARE _rem_temp TEXT CHARACTER SET utf8;
+DECLARE _pos INT DEFAULT 2;
+SET _rem = str;
+SET _pos = 2;
+SET _rem_temp = SUBSTR(_rem, _pos + 1);
+RETURN _rem_temp;
+END//
+SELECT test1('A,B,C,X,Y,Z');
+test1('A,B,C,X,Y,Z')
+B,C,X,Y,Z
----------
[7 Nov 2008 13:30]
MySQL Verification Team
looks like bug #38469 ? need valgrind to test properly.
[7 Nov 2008 13:46]
Susanne Ebrecht
I have the same problem like Alexander Barkov here. Doesn't matter which version. I can't repeat this on my systems. I tested 5.0.45, 5.0.46, 5.0.51a and actual source trees for 5.0, 5.1 and 6.0. It seems this is OS/distribution related. I found a machine where this behaviour is repeatable with 5.0.70 and I found a hint that this is already fixed in 5.0.72. I will test this ...
[7 Nov 2008 16:18]
Susanne Ebrecht
I could reproduce this on Red Hat Enterprise Linux by using MySQL 5.0.46. On same machine I tested with bzr tree version (5.0.74 at the moment) and all worked fine. This bug got fixed by fix of bug #38469. I will set this bug here as duplicate of the other bug because there you will find more fixing informations.

Description: If SUBSTR() is used within function with TEXT column and utf8 character set it gives wrong result. Here is an example of function: mysql> create function test1(str TEXT) RETURNS TEXT BEGIN DECLARE _rem TEXT CHARACTER SET utf8; DECLARE _pos INT DEFAULT 2; SET _rem = str; SET _pos = 2; SET _rem = SUBSTR( _rem, _pos + 1 ); RETURN _rem; end// Query OK, 0 rows affected (0.00 sec) mysql> select test1('A,B,C,X,Y,Z')// +----------------------+ | test1('A,B,C,X,Y,Z') | +----------------------+ | Y,Z | +----------------------+ 1 row in set (0.00 sec) expected result is 'B,C,X,Y,Z' If I use IN parameter in SUBSTR directly, it works fine: mysql> create function test2(str TEXT) RETURNS TEXT BEGIN RETURN SUBSTR( str, 3 ); end// Query OK, 0 rows affected (0.00 sec) mysql> select test2('A,B,C,X,Y,Z')// +----------------------+ | test2('A,B,C,X,Y,Z') | +----------------------+ | B,C,X,Y,Z | +----------------------+ 1 row in set (0.00 sec) Also if I use VARCHAR(100) instead of TEXT it also works fine: mysql> create function test3(str VARCHAR(100)) RETURNS VARCHAR(100) BEGIN DECLARE _rem VARCHAR(100); SET _rem = str; SET _rem = SUBSTR( _rem, 3 ); RETURN _rem; end// Query OK, 0 rows affected (0.00 sec) mysql> select test3('A,B,C,X,Y,Z')// +----------------------+ | test3('A,B,C,X,Y,Z') | +----------------------+ | B,C,X,Y,Z | +----------------------+ 1 row in set (0.00 sec) mysql> show variables like "character%"; +--------------------------+-------------------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /data2/users/victoria/mysql5046/share/mysql/charsets/ | +--------------------------+-------------------------------------------------------+ 8 rows in set (0.00 sec) How to repeat: DELIMITER // CREATE FUNCTION test1(str TEXT) RETURNS TEXT BEGIN DECLARE _rem TEXT CHARACTER SET utf8; DECLARE _pos INT DEFAULT 2; SET _rem = str; SET _pos = 2; SET _rem = SUBSTR( _rem, _pos + 1 ); RETURN _rem; end// DELIMITER ; select test1('A,B,C,X,Y,Z');