Bug #73747 | Add a config option to ignore non-BMP characters | ||
---|---|---|---|
Submitted: | 28 Aug 2014 3:26 | Modified: | 29 Aug 2014 15:00 |
Reporter: | Eric Hartford | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | Connector / J | Severity: | S3 (Non-critical) |
Version: | OS: | Any | |
Assigned to: | CPU Architecture: | Any |
[28 Aug 2014 3:26]
Eric Hartford
[28 Aug 2014 14:58]
Todd Farmer
Hi Eric, Thanks for the bug report. Do you have a simple repeatable test case you can provide? It would also be useful to provide the full stack trace for the Exception you are getting. Have you tried setting jdbcCompliantTruncation=false? By default, Connector/J sets SQL_MODE=STRICT_TRANS_TABLES, which causes data truncation to generate errors on the server. If that is the trigger of the Exception, jdbcCompliantTruncation=false should suppress it (although the server-side truncation behavior may not do what you are wanting).
[28 Aug 2014 18:39]
Eric Hartford
ok here is a test case. Please change the connection string to have the appropriate credentials. Please ensure Connector/J is in the classpath. test.sql create database test; use test; create table t (c char(20) character set utf8); test.java import java.sql.Connection; import java.sql.DriverManager; import java.sql.SQLException; import java.sql.Statement; public class test { public static void main(String[] args) { Connection conn = null; Statement stmt = null; try { conn = DriverManager.getConnection("jdbc:mysql://localhost/test?" + "user=x&password=x&useUnicode=true&characterEncoding=UTF-8"); stmt = conn.createStatement(); stmt.executeUpdate("insert into t (c) values ('Hello
[28 Aug 2014 18:40]
Eric Hartford
looks like this very comment field cannot support 4-byte characters, it truncated my message.
[28 Aug 2014 18:41]
Eric Hartford
ok here is a test case. Please change the connection string to have the appropriate credentials. Please ensure Connector/J is in the classpath. test.sql create database test; use test; create table t (c char(20) character set utf8); test.java import java.sql.Connection; import java.sql.DriverManager; import java.sql.SQLException; import java.sql.Statement; public class test { public static void main(String[] args) { Connection conn = null; Statement stmt = null; try { conn = DriverManager.getConnection("jdbc:mysql://localhost/test?" + "user=x&password=x&useUnicode=true&characterEncoding=UTF-8"); stmt = conn.createStatement(); stmt.executeUpdate("insert into t (c) values ('Hello {insert emoji here} dear')"); } catch (SQLException ex) { System.out.println("SQLException: " + ex.getMessage()); System.out.println("SQLState: " + ex.getSQLState()); System.out.println("VendorError: " + ex.getErrorCode()); } } } This will throw an exception something like: SQLException: Incorrect string value: '\xF0\x9F\x98\x8A\x0A[...' for column 'c' at row 1 I would like to have a setting that will strip the 4-byte characters. So the insert would succeed, but the row would contain "Hello dear" instead of "Hello {insert emoji here} dear"
[28 Aug 2014 18:45]
Eric Hartford
If you use mac, type an emoji with ctrl-command-space. if you use windows 8.1, use the soft keyboard to type a smily character.
[28 Aug 2014 18:47]
Todd Farmer
Hi Eric, That would be the exact truncating behavior of running without STRICT_TRANS_TABLES enabled I was mentioning. :) What you are observing is a server-related behavior (choose between truncation or error, depending on SQL_MODE - but the option to ignore specific bytes is not supported). It's possible to do some pre-processing on the Connector/J side (you could do something using statement interceptors, for example). That would probably be the better approach than to have Connector/J inspect and replace, character-by-character, out-of-range characters for every command.
[28 Aug 2014 19:00]
Eric Hartford
ok i will look into the statement interceptors. thank you for the information.