Bug #98136 TEXT err with utf8mb4_unicode_520_nopad_ci starting mysql-connector-java 5.1.41
Submitted: 7 Jan 2020 0:01
Modified: 3 Sep 2021 11:04
Reporter: Livio Cavallo
Status: Not a Bug
Category: Connector / J
Severity: S2 (Serious)
Version: 5.1.41, 8.0.18
OS: Windows (Win7/8/10)
Assigned to:
CPU Architecture: Other (x64)
Tags: jdbc, mysql-connector-java, nopad, text, Unicode, utf8mb4_unicode_520_nopad_ci

[7 Jan 2020 0:01] Livio Cavallo
Description:
If you store TEXT data containing any accented vowel, for instance à, in a table with collation utf8mb4_unicode_520_nopad_ci, the text is stored correctly (you can see it correctly with phpMyAdmin), but when you read the same data back through the connector, that vowel is read as Ã.
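The symptom looks like UTF-8 bytes being decoded with a single-byte charset. The thread does not confirm that this is what the connector does; the sketch below only shows that decoding the UTF-8 encoding of à as ISO-8859-1 (an assumed fallback charset) gives Ã followed by a non-breaking space. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode as UTF-8, then (wrongly) decode the same bytes as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "\u00E0" is 'a' with grave accent; its UTF-8 bytes are C3 A0.
        String garbled = misdecode("\u00E0");
        // Print each resulting character as a code point:
        // U+00C3 (capital A with tilde), then U+00A0 (non-breaking space).
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The second character is invisible in most consoles, which would explain why only Ã appears to be shown.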

Tested with mysql-connector-java 5.1.41, 5.1.42, 5.1.43, 5.1.48, 6.0.6, 8.0.7-dmr, 8.0.17, 8.0.18.

The problem is present in every JDK and JRE tested: Oracle JDK 13, JDK 1.8 (Zulu Community), and Java(TM) SE Runtime Environment (build 1.8.0_231-b11) with Java HotSpot(TM) 64-Bit Server VM (build 25.231-b11, mixed mode).

I am connecting to 10.2.30-MariaDB - MariaDB Server. I think the same problem will arise with other MySQL versions.

How to repeat:
- Create a table in a MySQL DB (10.2.30-MariaDB - MariaDB Server) with a TEXT column using charset utf8mb4 and collation utf8mb4_unicode_520_nopad_ci.

- Insert a record containing 'à' in that column.

- Connect to the DB via Java JDBC, using mysql-connector-java version 5.1.41.

- Read that record.

- That column will show in Java as a wrong 'Ã' instead of the correct 'à'.
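The steps above can be sketched as a small JDBC program. This is only an illustration, not the reporter's actual test code: the table name nopad_test, the column name txt, and the command-line arguments are hypothetical. The row is inserted from its UTF-8 hex bytes so that the write does not depend on the client-side encoding, and the result is printed as code points so that the console encoding does not hide the problem.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class NopadRepro {
    // Hypothetical table/column names; the collation is the one from this report.
    static final String CREATE_TABLE =
        "CREATE TABLE nopad_test ("
      + " txt TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_nopad_ci)";

    // Render a string as space-separated code points, e.g. "U+00E0".
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Create the table, insert one row, and read it back.
    static String roundTrip(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate("DROP TABLE IF EXISTS nopad_test"); // destructive: test table only
            st.executeUpdate(CREATE_TABLE);
            // C3A0 is the UTF-8 encoding of 'a' with grave accent.
            st.executeUpdate("INSERT INTO nopad_test (txt) "
                + "VALUES (CONVERT(UNHEX('C3A0') USING utf8mb4))");
            try (ResultSet rs = st.executeQuery("SELECT txt FROM nopad_test")) {
                rs.next();
                return rs.getString(1);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: NopadRepro <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            System.out.println(codePoints(roundTrip(conn)));
        }
    }
}
```

A correct read should print a single code point for à; the symptom described in this report would instead show up as two code points, the first being Ã.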

Suggested fix:
The problem can be avoided in two ways:
- using a PAD collation instead of a NOPAD one (for example utf8mb4_unicode_520_ci)
or
- using a mysql-connector-java version earlier than 5.1.41. I tested successfully with versions 5.1.40 and 5.1.38.

These are not fixes; these are workarounds.

I do not know how to really fix this problem.
[7 Jan 2020 8:41] MySQL Verification Team
Hello Livio Cavallo,

Thank you for the report.

regards,
Umesh
[24 Feb 2020 8:57] Livio Cavallo
I detected the same problem on Windows 7, 8 and 10.
[7 May 2021 11:37] Livio Cavallo
Any progress on this problem?
[22 Jul 2021 14:26] Alexander Soklakov
Hi Livio,

C/J has no static mapping for the utf8mb4_unicode_520_nopad_ci collation because it is not a MySQL-supported one. Did you try the C/J connection property detectCustomCollations=true?
[23 Aug 2021 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[3 Sep 2021 11:04] Alexander Soklakov
Posted by developer:
 
Closing as not a bug.
Please set detectCustomCollations=true if using non-MySQL collations (see https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-connp-props-connection.html#cj-co...).
Also, with cacheServerConfiguration=true detection results will be preserved between connections (see https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-connp-props-performance-extension...).
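
A minimal sketch of passing the two properties mentioned above through java.util.Properties, assuming Connector/J is on the classpath. The class name, method names, and command-line arguments are hypothetical; only the property names detectCustomCollations and cacheServerConfiguration come from this thread.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class CustomCollationConnect {
    // Build connection properties with the two settings suggested in this thread.
    static Properties connectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Ask the driver to look up collations it has no static mapping for.
        props.setProperty("detectCustomCollations", "true");
        // Keep the detection results between connections.
        props.setProperty("cacheServerConfiguration", "true");
        return props;
    }

    static Connection open(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, connectionProperties(user, password));
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: CustomCollationConnect <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = open(args[0], args[1], args[2])) {
            System.out.println("connected: " + !conn.isClosed());
        }
    }
}
```

The same settings can also be appended to the JDBC URL as query parameters, for example ?detectCustomCollations=true&cacheServerConfiguration=true.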