MySQL Bugs: #97226: Incorrect charset comparison to 'UTF-8' when reading a longtext field as NClob

Bug #97226	Incorrect charset comparison to 'UTF-8' when reading a longtext field as NClob
Submitted:	15 Oct 2019 9:36	Modified:	16 Nov 2019 11:36
Reporter:	Oleksandr Shkurat
Status:	No Feedback	Impact on me:	None
Category:	Connector / J	Severity:	S2 (Serious)
Version:	8.0.17	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	charset, NClob, UTF-8

Description:
My application accesses a MySQL database through Spring Data JPA 2.1.11 (Spring Boot 2.19), which uses Hibernate 5.3.13 and MySQL Connector/J 8.0.17.
I've declared an entity with a field annotated with
    @Type(type = "materialized_nclob")
As a result, a table was created with a column of type `longtext`. The database, the table, and this column all use character set utf8 and collation utf8_general_ci.
Reading this column from the database now fails.
The failure occurs in the class `com.mysql.cj.jdbc.result.ResultSetImpl`, in its method `public NClob getNClob(int columnIndex)`.
The method throws an exception with the message `Can not call getNClob() when field's charset isn't UTF-8`.

How to repeat:
Save a string value to a `longtext` column, then read it back with `getNClob()`.

Suggested fix:
It looks like `UTF-8` is not the only identifier used for the UTF-8 encoding.
That is why the code:
`            if (fieldEncoding != null && fieldEncoding.equals("UTF-8")) {
                String asString = this.getStringForNClob(columnIndex);
                return asString == null ? null : new com.mysql.cj.jdbc.NClob(asString, this.getExceptionInterceptor());
            } else {
                throw new SQLException("Can not call getNClob() when field's charset isn't UTF-8");
            }`
does not work correctly: in my case fieldEncoding has the value 'utf8', so the exact string comparison fails.
One possible fix is to compare against a set of valid identifiers for UTF-8, for example:
Sets.newHashSet("UTF-8", "utf8")
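
As an alternative to a hard-coded set of names, the comparison could rely on the JVM's own charset alias resolution. The following is only a minimal sketch, not Connector/J code: the class `Utf8Names` and the method `isUtf8` are hypothetical names introduced here for illustration. It assumes the field encoding is a name that `java.nio.charset.Charset.forName` can resolve. MySQL-only names that the JVM does not know (such as `utf8mb4`) are treated as "not UTF-8" by this sketch and would need separate mapping.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, not part of Connector/J.
public final class Utf8Names {
    private Utf8Names() {}

    /**
     * Returns true if the given encoding name resolves to the JVM's UTF-8 charset.
     * Charset lookup is case-insensitive and honors registered aliases,
     * so both "UTF-8" and "utf8" resolve to the same Charset instance.
     */
    public static boolean isUtf8(String encodingName) {
        if (encodingName == null) {
            return false;
        }
        try {
            return StandardCharsets.UTF_8.equals(Charset.forName(encodingName));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Names the JVM cannot resolve are not treated as UTF-8 here.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "utf8", "UTF8", "latin1"}) {
            System.out.println(name + " -> " + isUtf8(name));
        }
    }
}
```

With such a helper, the check quoted above would read `if (Utf8Names.isUtf8(fieldEncoding))` instead of `fieldEncoding.equals("UTF-8")`.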

Screenshot from IDEA on breakpoint

Attachment: NClob_Issue.png (image/png, text), 159.30 KiB.

Hi Oleksandr,

I cannot reproduce this error. Please provide the exact table definition, the Connector/J connection string, and the code snippet showing how you create the ResultSet. A reproducible test case would be ideal.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".