MySQL Bugs: #3349: Stored procedures with non-English names are displayed wrong

Bug #3349	Stored procedures with non-English names are displayed wrong
Submitted:	31 Mar 2004 14:13	Modified:	12 May 2006 19:39
Reporter:	Peter Gulutzan
Status:	Can't repeat
Category:	MySQL Server: Stored Routines	Severity:	S3 (Non-critical)
Version:	5.0.1-alpha-debug	OS:	Linux (SuSE 8.2/Win XP)
Assigned to:	Alexander Nozdrin	CPU Architecture:	Any

Description:
When I use special characters in the names of variables and parameters in stored
procedures, "SHOW CREATE PROCEDURE" and "SELECT * FROM mysql.proc"
don't display the special characters correctly.

How to repeat:
mysql> create procedure ÿ (ÿ int) begin declare ÿ int; end;// 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> show create procedure ÿ// 
+-----------+------------------------------------------------------------+ 
| Procedure | Create Procedure                                           | 
+-----------+------------------------------------------------------------+ 
| Ã¿        | CREATE PROCEDURE `db5`.`ÿ`(? int) 
begin declare ? int; end | 
+-----------+------------------------------------------------------------+ 
1 row in set (0.00 sec) 
 
mysql> create table ÿ (ÿ int)// 
Query OK, 0 rows affected (0.31 sec) 
 
mysql> show create table ÿ// 
+-------+--------------------------------------------------------------------------------------+ 
| Table | Create Table                                                                         | 
+-------+--------------------------------------------------------------------------------------+ 
| ÿ     | CREATE TABLE `ÿ` ( 
  `ÿ` int(11) default NULL 
) ENGINE=MyISAM DEFAULT CHARSET=latin1 | 
+-------+--------------------------------------------------------------------------------------+ 
1 row in set (0.00 sec)
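
The garbled "Ã¿" in the Procedure column is consistent with the two UTF-8 bytes of ÿ being read back as latin1, and the "?" with a conversion into a character set that cannot represent the character. The Python sketch below only reproduces those two effects on a plain string. It is an illustration under that assumption, not a trace of what the server does internally.

```python
# Illustration only: reproduces the two symptoms on a plain string.
name = "\u00ff"  # LATIN SMALL LETTER Y WITH DIAERESIS

# Encode as UTF-8, then misread each byte as one latin1 character.
utf8_bytes = name.encode("utf-8")
misread = utf8_bytes.decode("latin-1")

# Convert to a character set that lacks the character, with replacement.
lossy = name.encode("ascii", errors="replace").decode("ascii")

# ascii() keeps the output printable on any terminal encoding.
print(utf8_bytes.hex(), ascii(misread), lossy)
```

The first case matches the procedure name in the output above, the second matches the parameter and variable names in the procedure body.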

Special characters can cause more than just display problems. I can't call or drop a 
procedure named ß, and can't use ß as a label.

I don't know a quick solution yet. It looks more like an architecture-level task.

I believe all clients should see variables and labels as Y WITH DIAERESIS,
no matter what the client character set is: cp850, latin1 or utf8. That means
we need to store the body of a procedure in UTF8, and change the proc.body type
from BLOB to TEXT CHARACTER SET utf8.

This brings a problem with character set introducers, though.
If a procedure body embeds a character string with a
character set introducer, this string should NOT be converted
into utf8 during procedure creation. Nor should this string be
converted into the client character set during SHOW CREATE PROCEDURE.

That means we need to escape accented values somehow.

The possible ways I can see are:

1. hex notation: _latin1 0xAABBCC
   Monty does not like hex notation,
   he prefers that basic latin letters are seen as is.
   I agree with him on this point.

2. U+ notation:  _latin1 'Kaj \U+00C5rno'
3. semi-hex notation, with safe characters written
    as is and accented letters escaped: _latin1 'Kaj \xC5rno'

So, before fixing this bug we need to implement either #2 or #3,
or both.
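
The two proposed notations can be sketched in Python as follows. The helper names semi_hex and u_plus are hypothetical, and the choice of "safe" characters (printable ASCII except the single quote and the backslash) is an assumption, since the comment above does not define it. The semi-hex sketch works byte by byte, so it is only meant for single-byte character sets such as latin1.

```python
def semi_hex(text, charset):
    # Hypothetical helper: safe bytes are written as is, all others as \xHH.
    out = []
    for byte in text.encode(charset):
        # Assumption: "safe" means printable ASCII except ' and backslash.
        safe = 0x20 <= byte < 0x7F and byte not in (0x27, 0x5C)
        out.append(chr(byte) if safe else "\\x%02X" % byte)
    return "".join(out)


def u_plus(text):
    # Hypothetical helper: non-ASCII characters are written as \U+XXXX.
    return "".join(ch if ord(ch) < 0x80 else "\\U+%04X" % ord(ch) for ch in text)


sample = "Kaj \u00c5rno"  # the example string used in this report
print("_latin1 '%s'" % semi_hex(sample, "latin-1"))
print("_latin1 '%s'" % u_plus(sample))
```

Both results are pure ASCII, so they would survive conversion of the procedure body between utf8 and any client character set.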

Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Brian votes for semi-hex notation, i.e.: _latin1 'Kaj \xC5rno'
I'm going to stick to this way.

In the future we should implement Unicode notation as well,
as it is now part of Standard SQL. The disadvantage of Unicode
notation is that it needs more space compared to semi-hex
notation.
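
As a rough illustration of the space difference, the sketch below counts characters in the two escaped forms of the example string from this report. The strings are typed by hand in the notations proposed above; nothing here is produced by the server.

```python
# Escaped forms of the same latin1 string, as written in this report.
semi_hex_form = "Kaj \\xC5rno"     # one accented letter costs 4 characters
unicode_form = "Kaj \\U+00C5rno"   # the same letter costs 7 characters

print(len(semi_hex_form), len(unicode_form))
```

The gap is three characters per escaped letter in this example, so it grows with the number of accented letters in the string.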

Sorry, I closed it by mistake.

See also bug#11888

Re-tested with 5.1.10-beta and cannot repeat. Looks
like this problem was fixed by another patch (I'm
speculating that it was "table name to filename
encoding").