Bug #29499 Converting 'del' from ascii to Unicode results in 'question mark'
Submitted: 2 Jul 2007 20:53 Modified: 27 Jul 2007 4:47
Reporter: Peter Gulutzan
Status: Closed
Category: MySQL Server: Charsets  Severity: S3 (Non-critical)
Version: 5.0.46-debug, 5.1, 4.1  OS: Linux (SUSE 10 64-bit)
Assigned to: Alexander Barkov  CPU Architecture: Any

[2 Jul 2007 20:53] Peter Gulutzan
Description:
In the ASCII character set, 0x7f is DEL (delete); see
http://en.wikipedia.org/wiki/ASCII
ASCII is also a subset of many other character sets.

So when I convert from ascii to something else,
I should get 0x7f DEL (delete). Instead, I get
0x3f QUESTION MARK.

Apparently only the ASCII-to-Unicode conversion is at
fault, but that affects conversions to other character sets as well.
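
To illustrate why a single faulty to-Unicode mapping could surface in every target character set, here is a toy Python model. It assumes, purely for illustration, that conversion goes through Unicode as an intermediate step and that the ASCII-to-Unicode table lacks an entry for 0x7F. The table names and the `convert` helper are hypothetical and are not MySQL's actual code or data structures.

```python
# Toy model of charset conversion through a Unicode pivot.
# Hypothetical tables and helper: NOT MySQL's actual implementation.

# Assumed buggy ASCII -> Unicode table: 0x00..0x7E present, 0x7F missing.
BUGGY_ASCII_TO_UNICODE = {b: b for b in range(0x7F)}
# Assumed fixed table: all 128 ASCII code points, including DEL (0x7F).
FIXED_ASCII_TO_UNICODE = {b: b for b in range(0x80)}


def convert(data: bytes, to_unicode: dict, target: str) -> bytes:
    """Convert bytes to `target` via Unicode, using '?' for unmapped bytes."""
    chars = []
    for byte in data:
        code_point = to_unicode.get(byte)
        chars.append("?" if code_point is None else chr(code_point))
    return "".join(chars).encode(target)


# Python codec names standing in for koi8r, ucs2 and latin1.
for target in ("koi8_r", "utf_16_be", "latin_1"):
    buggy = convert(b"\x7f", BUGGY_ASCII_TO_UNICODE, target)
    fixed = convert(b"\x7f", FIXED_ASCII_TO_UNICODE, target)
    print(target, buggy.hex().upper(), fixed.hex().upper())
```

In this model the missing entry alone is enough to produce 3F (or 003F for the two-byte encoding) in each target, while the complete table yields 7F (or 007F).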

How to repeat:
mysql> select hex(cast(_ascii 0x7f as char(1) character set koi8r));
+-------------------------------------------------------+
| hex(cast(_ascii 0x7f as char(1) character set koi8r)) |
+-------------------------------------------------------+
| 3F                                                    |
+-------------------------------------------------------+
1 row in set (0.00 sec)

mysql> create table t5 (s1 char(1) character set ascii, s2 char(1) character set ucs2);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t5 (s1) values (0x7f);
Query OK, 1 row affected (0.00 sec)

mysql> update t5 set s2 = s1;
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select hex(s2) from t5;
+---------+
| hex(s2) |
+---------+
| 003F    |
+---------+
1 row in set (0.00 sec)

mysql> select convert(s1 using latin1) from t5;
+--------------------------+
| convert(s1 using latin1) |
+--------------------------+
| ?                        |
+--------------------------+
1 row in set (0.00 sec)
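
For comparison, the expected values for the three statements above can be cross-checked outside MySQL. The sketch below uses Python's built-in codecs as an independent reference; treating `koi8_r`, `utf_16_be` and `latin_1` as stand-ins for MySQL's koi8r, ucs2 and latin1 is an assumption that holds for this single code point.

```python
# Cross-check of the expected results with Python's built-in codecs.
# This is an independent reference, not MySQL's conversion code.
del_char = b"\x7f".decode("ascii")  # U+007F DELETE

expected = {
    "koi8r": del_char.encode("koi8_r").hex().upper(),
    "ucs2": del_char.encode("utf_16_be").hex().upper(),
    "latin1": del_char.encode("latin_1").hex().upper(),
}
for charset, hex_value in expected.items():
    print(f"{charset}: {hex_value}")

# Every 7-bit ASCII byte keeps its value in these single-byte charsets.
ascii_bytes = bytes(range(0x80))
for codec in ("koi8_r", "latin_1"):
    assert ascii_bytes.decode("ascii").encode(codec) == ascii_bytes
```

This prints 7F for koi8r, 007F for ucs2 and 7F for latin1, which is what the reporter expects in place of 3F, 003F and "?".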
[2 Jul 2007 21:02] Sveta Smirnova
Thank you for the report.

Verified as described.
[4 Jul 2007 7:50] Alexander Barkov
The patch is available here:

http://lists.mysql.com/commits/30264
[4 Jul 2007 11:38] Alexander Barkov
Pushed into 5.0.46-rpl
Pushed into 5.1.21-rpl
[20 Jul 2007 23:45] Bugs System
Pushed into 5.1.21-beta
[20 Jul 2007 23:49] Bugs System
Pushed into 5.0.48
[27 Jul 2007 4:47] Paul Dubois
Noted in 5.0.48, 5.1.21 changelogs.

Conversion of ASCII DEL (0x7F) to Unicode incorrectly resulted in
QUESTION MARK (0x3F) rather than DEL.