Bug #104346 MySQL ODBC 8.0 Unicode Driver Fails to Return 4-Byte Emoji from Database
Submitted: 17 Jul 2021 7:26 Modified: 13 Sep 2021 4:15
Reporter: Stanislav Ogryzkov Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / ODBC Severity:S2 (Serious)
Version:8.0 OS:Windows
Assigned to: CPU Architecture:Any
Tags: 4-byte, 8.0, connector, driver, emoji, MySQL, ODBC, Unicode

[17 Jul 2021 7:26] Stanislav Ogryzkov
Description:
My classic ASP website uses MySQL ODBC 8.0 Unicode Driver to work with MySQL 8.0 and its utf8mb4 tables and experiences the same problems with 4-byte emoji as descibed at https://stackoverflow.com/questions/59010774/windows-odbc-driver-retrieved-emoji-data-rend... :-(

Particularly, I can WRITE such emoji into the database (using INSERT SQL statements) but canNOT read them back – an ADODB.RecordSet object returns question marks (?) instead of 4-byte emoji.

Can you somehow fix the driver so that it could properly give me such emoji via an ADODB.RecordSet object? Thanks.

How to repeat:
1. Connect to MySQL using the following connection string:

Driver={MySQL ODBC 8.0 Unicode Driver};Server=mysql.server.url;UId=user_id;Password=**********;Database=database_name;stmt=SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;

2. Create a table like this:

CREATE TABLE `test` (
  `Name` varchar(30) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

using ASP code like this:

Set Connection = Server.CreateObject("ADODB.Connection")
Connection.Open "Driver={MySQL ODBC 8.0 Unicode Driver}…"
Connection.Execute("CREATE TABLE…")
Connection.Close
Set Connection = Nothing

3. Write some emoji into it:

INSERT INTO `test` VALUES('😊');

using ASP code like this:

Set Connection = Server.CreateObject("ADODB.Connection")
Connection.Open "Driver={MySQL ODBC 8.0 Unicode Driver}…"
Connection.Execute("INSERT INTO…")
Connection.Close
Set Connection = Nothing

4. Check the written emoji in phpMyAdmin using the following SQL query:

SELECT HEX(`Name`) FROM `test`;

and see that it's correctly written as 4 bytes:

F09F988A

5. Try to read written emoji and show it using ASP code like this:

Set RecordSet = Server.CreateObject("ADODB.RecordSet")
SQLQuery = "SELECT `Name` FROM `test`;"
RecordSet.Open SQLQuery, "Driver={MySQL ODBC 8.0 Unicode Driver}…"
Response.Write(RecordSet("Name"))
RecordSet.Close
Set RecordSet = Nothing

6. See the output "?" instead of the expected "😊".

Suggested fix:
MySQL ODBC 8.0 Unicode Driver should return correct string values to ADODB.RecordSet even if the reponse contains 4-byte emoji.
[17 Aug 2021 12:37] MySQL Verification Team
Hello Stanislav Ogryzkov,

Thank you for the bug report and feedback.
Internally discussed with the Developer, we observed that you are using `STMT=...` which is not a valid option. INITSTMT is the valid but again SET NAMES is not allowed for INITSTMT.
Could you please try with CHARSET=UTF8MB4 to change the character set? Thank you.

Regards,
Ashwini Patil
[17 Aug 2021 13:53] Stanislav Ogryzkov
Ashwini,

> Could you please try with CHARSET=UTF8MB4 to change the character set?

Where should I try this? in the connection string? Then how will it look like?

Driver={MySQL ODBC 8.0 Unicode Driver};Server=mysql.server.url;UId=user_id;Password=**********;Database=database_name;stmt=SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;CHARSET=UTF8MB4;

?
[18 Aug 2021 4:18] Bogdan Degtyariov
Hi Stanislav,

The CHARSET option needs to be added to the connection string as you indicated.
Please note that STMT=SET NAMES... is not a correct option for two reasons:

1. The name of the option should be INITSTMT instead of STMT
2. The driver does not allow SET NAMES ... in the initial statement.

However, in this present case CHARSET and INITSTMT will not help because there is a problem with the internal character set conversion of the string data inside the driver.

I am marking the bug as verified.

Also, this is not S1 but because it does not cause complete loss of service or system hanging indefinitely. Though, incorrect conversion of some (not all) UTF8MB4 characters is a serious problem. For that reason the severity is set to S2.
[1 Sep 2021 15:58] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Connector/ODBC 8.0.27 release, and here's the proposed changelog entry from the documentation team:

Fixed the internal character set conversions of string data inside the
driver; some UTF8MB4 characters were not properly converted.

Thank you for the bug report.
[13 Sep 2021 4:15] Stanislav Ogryzkov
Philip,

> Fixed as of the upcoming MySQL Connector/ODBC 8.0.27 release…

Thank you! When will this release happen? I am still seeing 8.0.26 at https://dev.mysql.com/downloads/connector/odbc/