Bug #76889 | Setting utf8mb4 character encoding | ||
---|---|---|---|
Submitted: | 29 Apr 2015 14:57 | Modified: | 24 Jan 2022 14:49 |
Reporter: | Vyacheslav Gerasymenko | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | Connector / J | Severity: | S4 (Feature request) |
Version: | 5.1.34 | OS: | Any |
Assigned to: | Filipe Silva | CPU Architecture: | Any |
Tags: | character set, utf8mb4 |
[29 Apr 2015 14:57]
Vyacheslav Gerasymenko
[29 Apr 2015 15:01]
Vyacheslav Gerasymenko
The "How to repeat" block was truncated because this engine failed to process the sample emoji characters (support for the full UTF-8 character set is required to process them). I copied them here: http://pastebin.com/pmrFPR2X
[5 May 2015 18:53]
Filipe Silva
Hi Vyacheslav, Thank you for this feature request. We are analyzing its viability.
[26 Jun 2015 16:14]
Filipe Silva
Hi, Your request is perfectly acceptable. You can, however, achieve the same results by issuing a "SET NAMES utf8mb4" right after establishing the connection. Please let us know if this works for you at the moment. Thank you,
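A minimal sketch of the suggested workaround, using only standard JDBC interfaces. The class name `Utf8mb4Session` and the method `apply` are hypothetical names chosen for illustration; they are not part of Connector/J. Note that a later comment in this thread quotes a Connector/J documentation warning against issuing SET NAMES, so this should be read as an illustration of the suggestion rather than a recommended practice.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper (not part of Connector/J): runs the workaround
// statement once on a freshly opened connection.
final class Utf8mb4Session {
    static final String SET_NAMES_SQL = "SET NAMES utf8mb4";

    private Utf8mb4Session() {}

    // Executes "SET NAMES utf8mb4" on the given connection.
    // The statement is closed automatically by try-with-resources.
    static void apply(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(SET_NAMES_SQL);
        }
    }
}
```

The helper would be called once, immediately after obtaining the connection (for example from `DriverManager.getConnection(...)`) and before any statement that carries 4-byte characters.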
[2 Jul 2015 16:57]
Vyacheslav Gerasymenko
Hi! Yes, the same result can be achieved by executing "SET NAMES utf8mb4" separately after the connection is established, and it works for me that way. However, this approach has two drawbacks:
1. Some performance degradation, since every connection to the MySQL server executes two initialization statements: "SET NAMES utf8" (in the configureClientCharacterSet method of the ConnectionImpl class) and "SET NAMES utf8mb4" (executed manually after the connection is established).
2. Manually executing the "SET NAMES utf8mb4" statement after opening the connection, and before every regular statement (select, update, insert) that works with UTF-8 strings, is not a very usable approach.
So it would be handier and more efficient to set the utf8mb4 character set globally, via an additional connection property that a programmer or administrator can configure just once in code, in config files, or in the JDBC resource of a Java EE application server.
[1 Dec 2015 11:35]
Bora Erbas
Hi Filipe, I am not sure that "SET NAMES UTF8MB4" would work. The MySQL Connector/J documentation (https://dev.mysql.com/doc/connector-j/en/connector-j-reference-charsets.html) explicitly states:
Warning: Do not issue the query set names with Connector/J, as the driver will not detect that the character set has changed, and will continue to use the character set detected during the initial connection setup.
So if the SET NAMES call works as you suggested, is the above documentation inaccurate? Or am I missing something?
I tested this, and here is what I get. I need to be able to insert emoticons into a MySQL table, so I need utf8mb4. I am using EclipseLink with JPA, by the way. When I run the "SET NAMES UTF8MB4" query after getting the connection, it still does not work if character_set_server is currently set to latin1 on the server. But if character_set_server is currently set to utf8 (not utf8mb4), it works.
Any pointers are appreciated. Regards, Bora.
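For context on why emoticons require utf8mb4: MySQL's utf8 character set stores at most three bytes per character, while emoji lie outside the Basic Multilingual Plane and take four bytes in UTF-8. The sketch below checks whether a Java string contains such characters. The class name `Utf8Width` and its methods are hypothetical, introduced only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: detects text that will not fit in MySQL's
// 3-byte utf8 character set.
final class Utf8Width {
    private Utf8Width() {}

    // True if any code point is supplementary (above U+FFFF),
    // which is exactly the set of code points needing 4 UTF-8 bytes.
    static boolean needsUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    // Number of bytes the whole string occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String accented = "caf\u00e9";      // "café": BMP characters only
        String emoji = "\uD83D\uDE00";      // U+1F600, one code point, two Java chars
        System.out.println(needsUtf8mb4(accented)); // false
        System.out.println(needsUtf8mb4(emoji));    // true
        System.out.println(utf8Length(emoji));      // 4
    }
}
```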
[4 May 2016 13:34]
Chiranjeevi Battula
Marking as duplicate of Bug#81196
[24 Jan 2022 14:49]
Alexander Soklakov
Posted by developer: The Connector/J 5.1 series reached EOL on Feb 9th, 2021 (see https://www.mysql.com/support/eol-notice.html), so this bug will not be fixed there. Character set support was significantly reworked in Connector/J 8.0; please check the documentation at https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-charsets.html. You can use "connectionCollation" instead of "characterEncoding" to choose between the utf8 and utf8mb4 connection charsets.
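A sketch of how the "connectionCollation" property might be supplied, assuming a utf8mb4 collation such as utf8mb4_unicode_ci is the desired choice. The class `Utf8mb4ConnectionConfig`, its method names, and the host, port, database, and credential values are placeholders for illustration. The exact behavior depends on the Connector/J and server versions, so the linked documentation remains the authority.

```java
import java.util.Properties;

// Hypothetical helper: builds the inputs for DriverManager.getConnection
// with a utf8mb4 connection collation.
final class Utf8mb4ConnectionConfig {
    private Utf8mb4ConnectionConfig() {}

    // Builds the driver properties. "connectionCollation" is the property
    // named in the comment above; the collation value is an assumption.
    static Properties connectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("connectionCollation", "utf8mb4_unicode_ci");
        return props;
    }

    // Builds a plain jdbc:mysql URL without query parameters.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder values; no connection is opened here.
        System.out.println(jdbcUrl("localhost", 3306, "test"));
        System.out.println(connectionProperties("app", "secret")
                .getProperty("connectionCollation"));
    }
}
```

With the driver on the classpath, the two results would be passed to `DriverManager.getConnection(url, props)`. Because the property is set once in the connection configuration, no extra statement has to be issued per connection.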