Bug #12789 Unicode characters show up as ?
Submitted: 24 Aug 2005 18:10 Modified: 25 Aug 2005 0:36
Reporter: Saikat Kanjilal
Status: Not a Bug
Category:Connector/J Severity:S2 (Serious)
Version:MySQL 4.0.17 OS:Sun Solaris (Solaris)
Assigned to: Target Version:

[24 Aug 2005 18:10] Saikat Kanjilal
Description:
We are running into a serious issue in our Java application when migrating data from a
MySQL database into our Oracle database: bullets, registered trademark signs, and other
special symbols all show up as question marks when the Java application looks the data up
in Oracle. The source MySQL database has its encoding set to latin1, and when connecting I
add &useUnicode=true&characterEncoding=UTF-8 to our JDBC connection parameters. However,
the special characters still show up as question marks when viewed from our Java
application, which retrieves this data from the Oracle database as UTF8. Please let me
know whether this is a known issue or whether there are additional MySQL parameters that
we need to change. I even tried setting characterEncoding to latin1 so that it matches
the source database encoding, but that didn't make any difference.
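
One way to narrow down where the characters get lost is to print the code points of a
string at each stage (after reading from MySQL, after reading back from Oracle) instead
of relying on how it is rendered. The following is only a sketch: CodePointDump and
toCodePoints are hypothetical names, not part of Connector/J or any library.

```java
final class CodePointDump {
    /** Renders each code point of s as U+XXXX, separated by single spaces. */
    static String toCodePoints(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
            // Advance by one or two chars depending on whether cp is a surrogate pair.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A bullet is U+2022, a registered sign is U+00AE, a literal question mark is U+003F.
        System.out.println(toCodePoints("\u2022\u00AE?"));
    }
}
```

If the dump already shows U+003F right after the read from MySQL, the character was
replaced before it reached Oracle; if it shows U+2022 or U+00AE there, the loss happens
later in the pipeline.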

How to repeat:
1) Set the encoding of the MySQL database to latin1.
2) Log into MySQL version 4.0.17 and create a sample table with one varchar column.
3) Insert a string containing bullets, registered trademark symbols, or other special
characters.
4) Write a small Java program that retrieves this information over JDBC, with
&useUnicode=true&characterEncoding=UTF-8 added to the JDBC connection URL, and displays
it in a GUI.
5) Check what the characters look like in the display.
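
The retrieval in steps 4 and 5 might look roughly like the sketch below. It is not the
reporter's actual program: the table name encoding_test and column txt are hypothetical,
the URL assumes the two parameters are the only ones on the connection string, and it
prints to the console with a hex dump rather than a GUI so that the terminal's own
encoding cannot hide the result. A MySQL Connector/J driver on the classpath is assumed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class EncodingRepro {
    /** Builds a Connector/J URL carrying the two parameters named in the report. */
    static String buildUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.err.println("usage: EncodingRepro <host> <database> <user> <password>");
            return;
        }
        String url = buildUrl(args[0], args[1]);
        // encoding_test and txt are hypothetical names for the sample table from step 2.
        try (Connection conn = DriverManager.getConnection(url, args[2], args[3]);
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT txt FROM encoding_test")) {
            while (rs.next()) {
                String value = rs.getString(1);
                if (value == null) {
                    continue;
                }
                // Print each UTF-16 unit in hex; 003F is a literal question mark.
                StringBuilder hex = new StringBuilder();
                for (char c : value.toCharArray()) {
                    hex.append(String.format("%04X ", (int) c));
                }
                System.out.println(value + "  [" + hex.toString().trim() + "]");
            }
        }
    }
}
```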

Suggested fix:
All characters need to come across correctly as bullets, registered trademark signs, etc.

[25 Aug 2005 0:36] Mark Matthews
MySQL-4.0 doesn't natively support UTF-8 as a character set. It just happens to "work" for
most cases, since UTF-8 and latin1 encode the ASCII range identically, but we make no
guarantees that it works for all situations.
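
As a general illustration of how a charset mismatch produces question marks in Java
(not a diagnosis of this particular setup), the sketch below uses Java's ISO_8859_1 as
a stand-in for the server's latin1; the two are not guaranteed to agree on every byte.
CharsetMismatchDemo and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    /** Encodes text as ISO-8859-1 and decodes it again with the same charset. */
    static String roundTripLatin1(String text) {
        // Characters with no ISO-8859-1 mapping are replaced by the byte for '?'.
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Encodes text as ISO-8859-1 but decodes the bytes as if they were UTF-8. */
    static String latin1BytesReadAsUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Registered sign (U+00AE) exists in ISO-8859-1, so it survives the round trip.
        System.out.println(roundTripLatin1("\u00AE").equals("\u00AE"));
        // Bullet (U+2022) has no ISO-8859-1 mapping, so it comes back as "?".
        System.out.println(roundTripLatin1("\u2022").equals("?"));
        // The single byte 0xAE is not valid UTF-8, so it decodes to U+FFFD.
        System.out.println(latin1BytesReadAsUtf8("\u00AE").equals("\uFFFD"));
        // Plain ASCII is identical in both encodings.
        System.out.println(latin1BytesReadAsUtf8("abc").equals("abc"));
    }
}
```

Only the ASCII case is safe under both interpretations, which is consistent with text
appearing to work until a bullet or trademark sign is involved.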

If you need to use UTF-8, you should upgrade to MySQL-4.1.