Bug #2395 utf8 chars are unsupported
Submitted: 15 Jan 2004 0:53
Modified: 22 Jan 2004 11:24
Reporter: Jakub Strychowski
Status: Can't repeat
Category: Connector / J
Severity: S2 (Serious)
Version: 3.0.6, 3.0.8, 3.0.9, 3.0.10, 3.1.0
OS: Microsoft Windows (Windows 2000)
Assigned to: Mark Matthews
CPU Architecture: Any

[15 Jan 2004 0:53] Jakub Strychowski
Description:
Hi!

I can't send or receive text in utf8 format through MySQL Connector/J. All non-latin1 characters are converted to the ? character (or to other wrong characters). I can insert and query utf8 varchars in the MySQL Control Center. My Java application also runs properly in a Linux environment, where I can store Arabic, English, Greek, Polish, and Russian characters in a single table cell. In a Windows environment I can't. I tried many combinations of MySQL 4.1.0 and 4.1.1a with Connector/J 3.0.6, 3.0.8, 3.0.9, 3.0.10, and 3.1.x. I'm using j2sdk 1.4.2.
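For reference, the ? substitution is what Java itself produces when a string is encoded into a charset that cannot represent its characters. The sketch below only illustrates that symptom. It assumes, without confirmation from this report, that a latin1 conversion happens somewhere between the application and the server. The class name and the sample characters are hypothetical, and the code uses APIs newer than j2sdk 1.4.2.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin1LossDemo {
    // Encode the text with the given charset, then decode the bytes again.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Sample characters: Polish z-with-dot, Greek alpha, Russian zhe.
        String sample = "\u017C\u03B1\u0436";

        // latin1 cannot represent any of them, so each one becomes '?'.
        String viaLatin1 = roundTrip(sample, StandardCharsets.ISO_8859_1);
        System.out.println("latin1 gives ???: " + viaLatin1.equals("???"));

        // UTF-8 can represent all of them, so the text survives unchanged.
        String viaUtf8 = roundTrip(sample, StandardCharsets.UTF_8);
        System.out.println("utf8 preserved:   " + viaUtf8.equals(sample));
    }
}
```

The results are printed as booleans rather than as raw text so that the console's own encoding cannot distort what is shown.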
I added the following parameters to the connection URL:
"?useUnicode=true&characterEncoding=UTF8", but this doesn't help.

The MySQL server is started with --default-character-set=utf8. My server configuration variables are attached below:

# Server Variables
# Connection: localhost
# Host: localhost
# Saved: 2004-01-15 08:11:34
# 
'Property','Value'
'back_log','50'
'basedir','C:\mysql\'
'bdb_cache_size','8388600'
'bdb_home','C:\mysql\data\'
'bdb_log_buffer_size','32768'
'bdb_max_lock','10000'
'bdb_shared_data','OFF'
'bdb_tmpdir','C:\DOCUME~1\ADMINI~1.000\LOCALS~1\Temp\'
'binlog_cache_size','32768'
'bulk_insert_buffer_size','8388608'
'character_set_client','utf8'
'character_set_connection','utf8'
'character_set_database','utf8'
'character_set_results','utf8'
'character_set_server','utf8'
'character_set_system','utf8'
'character-sets-dir','C:\mysql\share\charsets/'
'collation_connection','utf8_general_ci'
'collation_database','utf8_general_ci'
'collation_server','utf8_general_ci'
'concurrent_insert','ON'
'connect_timeout','5'
'datadir','C:\mysql\data\'
'date_format','%Y-%m-%d'
'datetime_format','%Y-%m-%d %H:%i:%s'
'default_week_format','0'
'delay_key_write','ON'
'delayed_insert_limit','100'
'delayed_insert_timeout','300'
'delayed_queue_size','1000'
'expire_logs_days','0'
'flush','OFF'
'flush_time','1800'
'ft_boolean_syntax','+ -><()~*:""&|'
'ft_max_word_len','84'
'ft_min_word_len','4'
'ft_query_expansion_limit','20'
'ft_stopword_file','(built-in)'
'have_bdb','YES'
'have_compress','YES'
'have_crypt','NO'
'have_innodb','YES'
'have_isam','NO'
'have_openssl','NO'
'have_query_cache','YES'
'have_raid','NO'
'have_symlink','YES'
'innodb_additional_mem_pool_size','1048576'
'innodb_buffer_pool_awe_mem_mb','0'
'innodb_buffer_pool_size','8388608'
'innodb_data_file_path','ibdata1:10M:autoextend'
'innodb_fast_shutdown','ON'
'innodb_file_io_threads','4'
'innodb_file_per_table','OFF'
'innodb_flush_log_at_trx_commit','1'
'innodb_force_recovery','0'
'innodb_lock_wait_timeout','50'
'innodb_log_arch_dir','.\'
'innodb_log_archive','OFF'
'innodb_log_buffer_size','1048576'
'innodb_log_file_size','5242880'
'innodb_log_files_in_group','2'
'innodb_log_group_home_dir','.\'
'innodb_max_dirty_pages_pct','90'
'innodb_mirrored_log_groups','1'
'innodb_open_files','300'
'innodb_thread_concurrency','8'
'interactive_timeout','28800'
'join_buffer_size','131072'
'key_buffer_size','8388600'
'key_cache_age_threshold','300'
'key_cache_block_size','1024'
'key_cache_division_limit','100'
'language','C:\mysql\share\english\'
'large_files_support','ON'
'local_infile','ON'
'log','OFF'
'log_bin','OFF'
'log_error','.\pcs32.err'
'log_slave_updates','OFF'
'log_slow_queries','OFF'
'log_update','OFF'
'log_warnings','OFF'
'long_query_time','10'
'low_priority_updates','OFF'
'lower_case_table_names','ON'
'max_allowed_packet','1048576'
'max_binlog_cache_size','4294967295'
'max_binlog_size','1073741824'
'max_connect_errors','10'
'max_connections','100'
'max_delayed_threads','20'
'max_error_count','64'
'max_heap_table_size','16777216'
'max_join_size','4294967295'
'max_length_for_sort_data','1024'
'max_relay_log_size','0'
'max_seeks_for_key','4294967295'
'max_sort_length','1024'
'max_tmp_tables','32'
'max_user_connections','0'
'max_write_lock_count','4294967295'
'myisam_max_extra_sort_file_size','268435456'
'myisam_max_sort_file_size','2147483647'
'myisam_recover_options','OFF'
'myisam_repair_threads','1'
'myisam_sort_buffer_size','8388608'
'named_pipe','OFF'
'net_buffer_length','16384'
'net_read_timeout','30'
'net_retry_count','10'
'net_write_timeout','60'
'new','OFF'
'old_passwords','OFF'
'open_files_limit','0'
'pid_file','C:\mysql\data\pcs32.pid'
'port','3306'
'preload_buffer_size','32768'
'protocol_version','10'
'pseudo_thread_id','0'
'query_alloc_block_size','8192'
'query_cache_limit','1048576'
'query_cache_min_res_unit','4096'
'query_cache_size','0'
'query_cache_type','ON'
'query_prealloc_size','8192'
'range_alloc_block_size','2048'
'read_buffer_size','131072'
'read_only','OFF'
'read_rnd_buffer_size','262144'
'relay_log_purge','ON'
'rpl_recovery_rank','0'
'secure_auth','OFF'
'server_id','0'
'shared_memory','OFF'
'shared_memory_base_name','MYSQL'
'skip_external_locking','ON'
'skip_networking','OFF'
'skip_show_database','OFF'
'slave_net_timeout','3600'
'slow_launch_time','2'
'sort_buffer_size','2097144'
'storage_engine','MyISAM'
'table_cache','64'
'table_type','MyISAM'
'thread_cache_size','0'
'thread_stack','196608'
'time_format','%H:%i:%s'
'timezone','Pacific Standard Time'
'tmp_table_size','33554432'
'transaction_alloc_block_size','8192'
'transaction_prealloc_size','4096'
'tx_isolation','REPEATABLE-READ'
'version','4.1.1a-alpha-max-nt'
'version_bdb','Sleepycat Software: Berkeley DB 4.1.24: (December 19, 2003)'
'version_comment','Source distribution'
'version_compile_machine','i32'
'version_compile_os','NT'
'wait_timeout','28800'

How to repeat:
Write any Java program that sends or receives utf8 varchars to or from a MySQL database, using MySQL Connector/J as the JDBC driver. Run this program on a Windows operating system and check the returned strings.
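A minimal sketch of such a program follows. It is not the reporter's actual code. The class name, the table name utf8_probe, and the sample characters are hypothetical, and the JDBC URL, user, and password are passed on the command line because they are not given in this report. It assumes a JDK with try-with-resources and a Connector/J jar on the classpath that registers itself with DriverManager; older setups would need an explicit Class.forName call for the driver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class Utf8RoundTrip {
    // Render each UTF-16 code unit in backslash-u hex form, so that the
    // console encoding cannot hide or introduce damage in the output.
    static String toEscapes(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: java Utf8RoundTrip <jdbc-url> <user> <password>");
            return;
        }
        // Sample characters: Polish, Greek, Russian, Arabic (hypothetical choice).
        String sent = "\u017C\u03B1\u0436\u0639";

        try (Connection con = DriverManager.getConnection(args[0], args[1], args[2]);
             Statement st = con.createStatement()) {
            // Temporary table, dropped automatically when the connection closes.
            st.executeUpdate(
                "CREATE TEMPORARY TABLE utf8_probe (txt VARCHAR(100)) CHARACTER SET utf8");

            try (PreparedStatement ins =
                     con.prepareStatement("INSERT INTO utf8_probe (txt) VALUES (?)")) {
                ins.setString(1, sent);
                ins.executeUpdate();
            }

            try (ResultSet rs = st.executeQuery("SELECT txt FROM utf8_probe")) {
                if (!rs.next()) {
                    System.out.println("no row returned");
                    return;
                }
                String got = rs.getString(1);
                System.out.println("sent: " + toEscapes(sent));
                System.out.println("got:  " + toEscapes(got));
                System.out.println(sent.equals(got) ? "OK" : "MISMATCH");
            }
        }
    }
}
```

A possible invocation, with placeholder host, database, and credentials, would pass a URL such as jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF8 as the first argument.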

Suggested fix:
The JDBC driver doesn't recognize the character set on Windows.
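One way to gather evidence for this guess is to compare the JVM's default encoding on the Linux and Windows machines. The sketch below is only a diagnostic, not a fix, and nothing in this report confirms that the platform default is involved. The class name is hypothetical, and Charset.defaultCharset() requires Java 5 or later, so on j2sdk 1.4.2 only the file.encoding property would be available.

```java
import java.nio.charset.Charset;

public class PlatformCharsetProbe {
    // Summarize the encoding settings the JVM picked up from the platform.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
             + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the two machines report different values, that difference would be worth including in the bug report.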
[22 Jan 2004 11:24] Mark Matthews
The exact test you speak of is in the test suite, and this issue is not repeatable there.

Please provide a Java testcase that uses the _exact_ characters you are losing.