Bug #119284 Unicode characters encoding problem on Connector/ODBC >= 9.0
Submitted: 31 Oct 17:43
Modified: 31 Oct 20:16
Reporter: Joao Vitor Assmann
Status: Open
Category: Connector / ODBC
Severity: S3 (Non-critical)
Version: 9.0
OS: Windows
Assigned to:
CPU Architecture: Any
Tags: ADO, connector, ODBC, Unicode, windows

[31 Oct 17:43] Joao Vitor Assmann
Description:
When the Connector/ODBC Unicode driver, version 9.0 or later, is used to connect to a MySQL Server through ADO on Windows, identifiers that contain non-ASCII characters (table names, for example) come back garbled: their UTF-8 bytes appear to be decoded as Windows-1252.

The behavior is correct up to and including Connector/ODBC 8.4. The problem starts with version 9.0 and persists through the current version, 9.5.

How to repeat:
Create a new schema with default settings (utf8mb4) in MySQL, here named `mysqlbug`, then create a table whose name contains a non-ASCII character:

CREATE TABLE `mysqlbug`.`tabelaço` (
  `column` INT NOT NULL,
  PRIMARY KEY (`column`));

Now run this VBScript to connect to the database and list its tables through the OpenSchema method:

'=====================================
' list_tables.vbs
' Connect to MySQL via ODBC (MSDASQL) and list tables

Dim conn, rs, strUser, strPass
Set conn = CreateObject("ADODB.Connection")
Set rs   = CreateObject("ADODB.Recordset")

' === UPDATE THESE VALUES ===
strUser = "xxx"
strPass = "xxx"
' ===========================

' Connection string using MSDASQL (OLE DB for ODBC)
conn.Open "Provider=MSDASQL;Driver={MySQL ODBC 9.5 Unicode Driver};" & _
          "Server=localhost;Database=mysqlbug;UID=" & strUser & ";PWD=" & strPass & ";"

' Use OpenSchema to get table metadata
Set rs = conn.OpenSchema(20)  ' adSchemaTables = 20

Do Until rs.EOF
    ' Filter out system tables/views if desired
    If rs.Fields("TABLE_TYPE").Value = "TABLE" Then
        WScript.Echo rs.Fields("TABLE_NAME").Value
    End If
    rs.MoveNext
Loop

rs.Close
conn.Close

Set rs = Nothing
Set conn = Nothing

'=====================================

In the output, the table name is printed as 'tabelaÃ§o' instead of 'tabelaço': 'Ã§' is what the UTF-8 bytes of 'ç' (0xC3 0xA7) look like when they are decoded as Windows-1252.

With the 8.4 driver, the same script prints the name correctly.
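
The garbling described above can be reproduced outside the driver. The following is a minimal Python sketch, assuming the cause is a UTF-8 byte sequence being decoded as Windows-1252. It does not call Connector/ODBC or ADO, and the function names are hypothetical helpers introduced only for illustration.

```python
# Illustrative sketch only: simulates the suspected mis-decoding,
# it does not exercise Connector/ODBC or ADO.

def as_cp1252_mojibake(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def undo_cp1252_mojibake(garbled: str) -> str:
    """Reverse the mistake: re-encode as Windows-1252, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


name = "tabelaço"
garbled = as_cp1252_mojibake(name)

print(name.encode("utf-8"))           # b'tabela\xc3\xa7o'
print(garbled)                        # tabelaÃ§o
print(undo_cp1252_mojibake(garbled))  # tabelaço
```

If a name returned by the driver survives the reverse round trip like this, that is consistent with a UTF-8 versus Windows-1252 mix-up, although it does not show where in the driver or in MSDASQL the conversion goes wrong.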
[31 Oct 20:07] Joao Vitor Assmann
This was reproduced against an 8.4.4 server, by the way.
[31 Oct 20:16] Joao Vitor Assmann
Bug also reproduces against a 9.5 server.