Bug #88980 Binary data getting corrupt from utf8 table.
Submitted: 20 Dec 2017 9:24 Modified: 20 Jan 2018 10:10
Reporter: Vibin Varghese Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:5.6.10 OS:Red Hat
Assigned to: CPU Architecture:Any
Tags: BINARY, charset, corrupt, driver, MySQL

[20 Dec 2017 9:24] Vibin Varghese
Description:
Hi There,

This is related to an issue possibly with the MySQL OBDC Driver handling data in binary from utf8 tables.

Environment Details
OS - RHEL 7, 64-bit
Database - Amazon Aurora (Aurora 1.11, InnoDB 1.2.10)
MySQL 5.6.10
ODBC - mysql-connector-odbc-5.3.8-1.el7.x86_64

Here is the scenario.
Table with BINARY(2) column with utf8 charset.

CREATE TABLE `VB_TEST_EMP` (
  `EMP_ID` int(11) NOT NULL,
  `EMP_BIN` binary(2) DEFAULT NULL,
  PRIMARY KEY (`EMP_ID`) COMMENT 'PK on VB_TEST_EMP Table on EMP_ID column'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 

mysql> select EMP_ID, HEX(EMP_BIN) from VB_TEST_EMP  where EMP_ID = 1 ;
+--------+--------------+
| EMP_ID | HEX(EMP_BIN) |
+--------+--------------+
|      1 | 8080         |
+--------+--------------+
1 row in set (0.00 sec)

I'm running a COBOL application pulling data from MySQL over ODBC Driver.

Settings of interest are as follows:
Database:
mysql> show variables like 'char%'  ;
+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | utf8                                          |
| character_set_connection | utf8                                          |
| character_set_database   | utf8                                          |
| character_set_filesystem | binary                                        |
| character_set_results    | utf8                                          |
| character_set_server     | utf8                                          |
| character_set_system     | utf8                                          |
| character_sets_dir       | /rdsdbbin/oscar-5.6.10a.31.53/share/charsets/ |
+--------------------------+-----------------------------------------------+
8 rows in set (0.01 sec)

ODBC Connection Parms
[DBVV]
#DSN put in place for test purposes - VV.
Driver =  /usr/lib64/libmyodbc5a.so
Description = ODBC driver for Aurora
SERVER = XXXXXX
PORT = 3306
DATABASE = XXXX
User = XXXX
Password = XXXXX
prefetch = 10
no_ssps = 1
CharSet = utf8
NO_BINARY_RESULT = 1
FOUND_ROWS = 1
LOG_QUERY = 1

Value Displayed in COBOL:
 EMP-BIN       :Â<80>
 EMP-BIN(hex)  :c280

Seems like there is conversion happening, for single byte binary values that aren't legitimate utf8 values. 

Please note, this works just fine for valid single byte utf8 values.

The variable character_set_server is latin1 by default, which I change in the program explicitly. 

What Has been Tried:
- Tried both ANSI and Unicode MySQL ODBC Connectors
- Various combination of ODBC Parms(CharSet ,NO_BINARY_RESULT) were tried.

Workaround Currently in Place:
- Explicit CAST of the column as BINARY in the SQL

Earliest possible response would be much appreciated. 

Thanks,
Vibin

How to repeat:
ODBC Driver and MySQL DB Setting as above. 
Pulling BINARY data from utf8 table.
[20 Dec 2017 10:10] MySQL Verification Team
Your server version is quite older, please try version 5.6.20 and comment the results to re-open this bug report after it done. Thanks.
[21 Jan 2018 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".