Bug #78276 MySQL accepts non-ascii in ascii columns
Submitted: 30 Aug 2015 14:47 Modified: 6 Oct 2015 18:15
Reporter: Daniël van Eeden (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.7.8, 5.7.9, 5.6.26, 5.5.46 OS:Any
Assigned to: CPU Architecture:Any
Tags: ASCII, character set, charset

[30 Aug 2015 14:47] Daniël van Eeden
Description:
It should not be possible to store bytes in an ASCII column which have a value of >127 (hex 7f, bin 01111111)

This is not properly enforced. 

A direct insert (utf8 client) won't work: 
ERROR 1366 (HY000): Incorrect string value: '\xEF\xBF\xBD' for column 'name' at row 1

How to repeat:
mysql-5.7.8-rc> SHOW CREATE TABLE a1\G
*************************** 1. row ***************************
       Table: a1
Create Table: CREATE TABLE `a1` (
  `name` varchar(100) CHARACTER SET ascii NOT NULL,
  PRIMARY KEY (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql-5.7.8-rc> INSERT INTO a1 VALUES('a');
Query OK, 1 row affected (0.01 sec)

mysql-5.7.8-rc> INSERT INTO a1 VALUES(X'62');
Query OK, 1 row affected (0.00 sec)

mysql-5.7.8-rc> INSERT INTO a1 VALUES(X'81');
Query OK, 1 row affected (0.00 sec)

mysql-5.7.8-rc> SELECT name, HEX(name) FROM a1;
+------+-----------+
| name | HEX(name) |
+------+-----------+
| a    | 61        |
| b    | 62        |
| ?    | 81        |
+------+-----------+
3 rows in set (0.00 sec)

Suggested fix:
IF char_value > 127 THEN
  RAISE ERROR "Illegal character value %s for character set %s", char_value, charset
END IF
[30 Aug 2015 17:01] Daniël van Eeden
Note that incorrect UTF-8 does result in an error.

mysql-5.7.8-rc> CREATE TABLE u1 (name char(100) primary key) default character set utf8mb4;
Query OK, 0 rows affected (0.04 sec)

mysql-5.7.8-rc> INSERT INTO u1 VALUES(X'61'); -- valid 1-byte
Query OK, 1 row affected (0.01 sec)

mysql-5.7.8-rc> INSERT INTO u1 VALUES(X'C3AB'); -- valid 2-byte 
Query OK, 1 row affected (0.00 sec)

mysql-5.7.8-rc> INSERT INTO u1 VALUES(X'C3EB'); -- first byte valid (2-byte start), second byte invalid (not a continuation)
ERROR 1366 (HY000): Incorrect string value: '\xC3\xEB' for column 'name' at row 1
mysql-5.7.8-rc> INSERT INTO u1 VALUES(X'43AB'); -- first byte is single byte, second byte continuation
ERROR 1366 (HY000): Incorrect string value: '\xAB' for column 'name' at row 1
[2 Sep 2015 7:21] MySQL Verification Team
Hello Daniel,

Thank you for the report and test case.
Observed this with 5.7.8/9 and 5.6.26 builds.

Thanks,
Umesh
[4 Sep 2015 7:32] MySQL Verification Team
// 5.5.46 - with sql_mode strict and default

[umshastr@hod03]/export/umesh/server/binaries/mysql-5.5.46:  bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.46 MySQL Community Server (GPL)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test
Database changed
mysql>  CREATE TABLE `a1` (
    ->   `name` varchar(100) CHARACTER SET ascii NOT NULL,
    ->   PRIMARY KEY (`name`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO a1 VALUES('a');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO a1 VALUES(X'62');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO a1 VALUES(X'81');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT name, HEX(name) FROM a1;
+------+-----------+
| name | HEX(name) |
+------+-----------+
| a    | 61        |
| b    | 62        |
| ?    | 81        |
+------+-----------+
3 rows in set (0.00 sec)

mysql>
[4 Sep 2015 7:35] MySQL Verification Team
// Same behavior with 5.1

[umshastr@hod03]/export/umesh/server/binaries/mysql-5.1.77:  bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.77 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test
Database changed
mysql> drop table if exists a1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE `a1` (
    ->   `name` varchar(100) CHARACTER SET ascii NOT NULL,
    ->   PRIMARY KEY (`name`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO a1 VALUES('a');
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO a1 VALUES(X'62');
Query OK, 1 row affected (0.00 sec)

mysql>  INSERT INTO a1 VALUES(X'81');
Query OK, 1 row affected (0.01 sec)

mysql> SELECT name, HEX(name) FROM a1;
+------+-----------+
| name | HEX(name) |
+------+-----------+
| a    | 61        |
| b    | 62        |
| ?    | 81        |
+------+-----------+
3 rows in set (0.01 sec)
[6 Oct 2015 18:15] Paul DuBois
Noted in 5.7.10, 5.8.0 changelogs.

It was possible to store non-ASCII data in columns intended to store
data of character set ascii.