Bug #113879 An error is reported when the user variable character set is used in the stored
Submitted: 3 Feb 2024 10:26 Modified: 5 Feb 2024 11:29
Reporter: huazai huazai Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.35 OS:Any
Assigned to: CPU Architecture:Any
Tags: charsets

[3 Feb 2024 10:26] huazai huazai
Description:
If the character set of a user variable is changed in a stored procedure, a character set error is reported. This problem also occurs in the function scenario.
This problem is caused by a new feature in 8.0.22:https://github.com/mysql/mysql-server/commit/67c3c70e4895874d43434f1df556f9f30d781b48
8.0.21 and earlier versions do not have this problem.

How to repeat:
Case1:
drop procedure if exists testproc;

delimiter //
CREATE PROCEDURE testproc(IN language varchar(20))
BEGIN
IF language = 'en' THEN
    set @val = curdate();
END IF;
IF language = 'cn' THEN
    set @val = '中文';
END IF;
select @val;
END //
delimiter ;
CALL testproc('en'); 
CALL testproc('cn');

Case2:
DROP FUNCTION IF EXISTS my_function;
DELIMITER //
CREATE FUNCTION my_function(language varchar(20))
RETURNS INT
BEGIN
DECLARE p_parameter_value VARCHAR(5000);
IF language = 'en' THEN
  set @val = curdate();
END IF;
IF language = 'cn' THEN
  set @val = '中文';
END IF;
SET p_parameter_value = @val;
return 1;
END //
DELIMITER ;
SELECT my_function('en');
SELECT my_function('cn');

Result:
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.35    |
+-----------+
1 row in set (0.00 sec)

mysql> create database test;
Query OK, 1 row affected (0.01 sec)

mysql> use test;
Database changed
mysql> drop procedure if exists testproc;
L testproc('en'); 
CALL testproc('cn');Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> 
mysql> delimiter //
mysql> CREATE PROCEDURE testproc(IN language varchar(20))
    -> BEGIN
    -> IF language = 'en' THEN
    ->     set @val = curdate();
    -> END IF;
    -> IF language = 'cn' THEN
    ->     set @val = '中文';
    -> END IF;
    -> select @val;
    -> END //
Query OK, 0 rows affected (0.01 sec)

mysql> delimiter ;
mysql> CALL testproc('en'); 
+------------+
| @val       |
+------------+
| 2024-02-03 |
+------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> DELIMITER //
mysql> CREATE FUNCTION my_function(language varchar(20))
    -> RETURNS INT
    -> BEGIN
    -> DECLARE p_parameter_value VARCHAR(5000);
    -> IF language = 'en' THEN
    ->   set @val = curdate();
    -> END IF;
    -> IF language = 'cn' THEN
    ->   set @val = '中文';
    -> END IF;
    -> SET p_parameter_value = @val;
    -> return 1;
    -> END //
Query OK, 0 rows affected (0.00 sec)

mysql> DELIMITER ;
mysql> SELECT my_function('en');
+-------------------+
| my_function('en') |
+-------------------+
|                 1 |
+-------------------+
1 row in set (0.00 sec)

mysql> SELECT my_function('cn');
ERROR 1300 (HY000): Invalid latin1 character string: '\xE4\xB8\xAD\xE6\x96\x87'
[3 Feb 2024 10:30] huazai huazai
Sorry, the execution result of Case1 is not completely displayed. The result is as follows:
mysql> CALL testproc('en'); 
+------------+
| @val       |
+------------+
| 2024-02-03 |
+------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> CALL testproc('cn');
ERROR 1300 (HY000): Invalid latin1 character string: '\xE4\xB8\xAD\xE6\x96\x87'
[4 Feb 2024 7:16] huazai huazai
This problem also exists in the prepare statement of MySQL 8.0.35. This problem does not occur in MySQL 8.0.21.
Case 1: A character set error is reported.
set @val = curdate();
prepare stmt from'select @val;';
execute stmt;
set @val = 'Chinese';
execute stmt;

Case 2: No problem
prepare stmt from'select @val;';
set @val = curdate();
execute stmt;
set @val = 'Chinese';
execute stmt;

Execution result:
Case1:
mysql> set @val = curdate();
Query OK, 0 rows affected (0.00 sec)

mysql> prepare stmt from 'select @val;';
Query OK, 0 rows affected (0.01 sec)
Statement prepared

mysql> execute stmt;
+------------+
| @val       |
+------------+
| 2024-02-04 |
+------------+
1 row in set (0.00 sec)

mysql> set @val = '中文';
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt;
ERROR 1300 (HY000): Invalid latin1 character string: '\xE4\xB8\xAD\xE6\x96\x87'
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.35    |
+-----------+
1 row in set (0.00 sec)

Case2:
mysql> prepare stmt from 'select @val;';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> set @val = curdate();
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt;
+------------+
| @val       |
+------------+
| 2024-02-04 |
+------------+
1 row in set (0.00 sec)

mysql> set @val = '中文';
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt;
+--------+
| @val   |
+--------+
| 中文   |
+--------+
1 row in set (0.00 sec)
[5 Feb 2024 11:29] MySQL Verification Team
Hi Mr. Huazai,

Thank you for your bug report.

We have managed to reproduce it in 8.0.36.

However, this is not a bug, because '\xE4\xB8\xAD\xE6\x96\x87' truly is NOT a valid latin1 character string.

This is a consequence of the bug that was fixed since 8.0.2X.

You should examine all the character sets that you have set in the server, as well as in the interface and in the client side. MySQL has 7 (seven) different character sets and collations that can be set, including the ones for the connection and client-side. Beside those, you can introduce others in your SQL statements by explicitly setting them for constants, introducers and in several other different manners.

This is all fully explained in our Reference Manual.

Not a bug.
[6 Feb 2024 3:17] Xiong Wang
Hi,

I consider there is a problem. This is because character_set and collation will be cached during Prepared statement is prepared. This is incorrect. Look at the following test:

mysql> set @val = curdate();
Query OK, 0 rows affected (0.00 sec)

mysql> prepare stmt from'select @val;';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> execute stmt;
+------------+
| @val       |
+------------+
| 2024-02-06 |
+------------+
1 row in set (0.00 sec)

mysql> set @val = '中文';
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt;
ERROR 1300 (HY000): Invalid latin1 character string: '\xE4\xB8\xAD\xE6\x96\x87'

Error is reported in the above.But if we execute the above prepared statement in a different order, we can see there is no any problem.
mysql> set @val = '中文';
Query OK, 0 rows affected (0.00 sec)

mysql> prepare stmt from'select @val;';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> execute stmt;
+--------+
| @val   |
+--------+
| 中文   |
+--------+
1 row in set (0.00 sec)

mysql> set @val = curdate();
Query OK, 0 rows affected (0.00 sec)

mysql> execute stmt;
+------------+
| @val       |
+------------+
| 2024-02-06 |
+------------+
1 row in set (0.00 sec)

The reason is character set cached is UTF8 when prepare stmt after set @val = '中文' instead of latin.
[6 Feb 2024 11:04] MySQL Verification Team
Hi Mr. Wang,

This is not about cacheing anything with prepared statements.

Simply, one of the 7 (seven) character sets and  collations that you have defined on your system is latin1. Hence, curdate() conforms to latin1 and your ideograms do not.

Nothing has been cached. 

We do not see any evidence about how your seven character sets on the MySQL system have been defined. This is all explained in our Reference Manual.

Not a bug.
[6 Feb 2024 12:09] Roy Lyseng
Unfortunately, it is difficult to combine the loose type system of user variables with the deterministic type system implemented by prepared statements and stored procedures and fulfill all user requirements. It would be much simpler if one could declare the proper type for a user variable.

However, one way to overcome this problem is to use a local variable instead of
a user variable. In other words, redefine the stored procedure as follows:

CREATE PROCEDURE testproc(IN language varchar(20))
BEGIN
DECLARE val VARCHAR(255);
IF language = 'en' THEN
    set val = curdate();
END IF;
IF language = 'cn' THEN
    set val = '中文';
END IF;
select val;
END