Bug #108271 | String gets wrong encoding while writing to DB | ||
---|---|---|---|
Submitted: | 24 Aug 2022 16:12 | Modified: | 9 Mar 2023 15:37 |
Reporter: | Jan-Peter Wilhelms | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | Connector / J | Severity: | S1 (Critical) |
Version: | 8.0.30 | OS: | Any |
Assigned to: | Filipe Silva | CPU Architecture: | Any |
[24 Aug 2022 16:12]
Jan-Peter Wilhelms
[18 Oct 2022 17:01]
Filipe Silva
Thank you for your interest in MySQL Connector/J and for reporting this issue. Can you please provide a test case showing up the issue?
[19 Nov 2022 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[21 Feb 2023 14:38]
Jan-Peter Wilhelms
Simple main demonstrating the issue
Attachment: MyMain.java (application/octet-stream, text), 1.63 KiB.
[27 Feb 2023 17:26]
Filipe Silva
Thank you Jan-Peter.
[4 Mar 2023 0:03]
Filipe Silva
Hi Jean-Peter, Note that the result of your test in Connector/J 8.0.28 wasn't correct either. You can verify that by switching to server-side prepared statements by adding `useServerPrepStmts=true` to the connection string. The result would be an error and that's the expected correct value in both cases. This is so because by doing `setCharacterStream()` you are not really sending a true string value to the server but instead a sequence of bytes that represents the string characters encoded in the character set of the client/JVM or in the character set specified in `clobCharacterEncoding`. The server would then take those bytes and decode them in the target column charset (Latin1, in your case), ignoring that they may not have been encoded in that charset to begin with. In order to make `setCharacterStream()` right you'd have to play with `clobCharacterEncoding` or `caracterEncoding` and make them align with the charset of the target column. There is a bug in this specific part though. I'll set this report as "verified" because of this. Thanks
[9 Mar 2023 15:37]
Daniel So
Added the following entry to the Connector/J 8.0.33 changelog: "For Connector/J 8.0.29 or later, client-side prepared statements failed with the error "Incorrect string value" when setCharacterStream() was used, if the JVM's character encoding did not match the target column's character set. It was because, in that case, the character encoding of the JVM was used when sending character streams to the server, which was unable to decode them as they were expected to be encoded with the column's character set. With this fix, the character set specified by either the connection property clobCharacterEncoding or characterEncoding is used instead, for both client-side and server-side prepared statements, which is the expected behavior." Also, Section 6.7, Using Character Sets and Unicode, of the MySQL Connector/J 8.0 Developer Guide (https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-charsets.html) has been updated for more clarifications on the subject. The updates will be visible with the Connector/J 8.0.33 release.