Bug #30647 Error retrieving utf8 encoding text fields containing croatian characters
Submitted: 27 Aug 2007 17:38 Modified: 30 Aug 2007 19:00
Reporter: horvoje bob
Status: Duplicate
Category: Connector / ODBC Severity: S3 (Non-critical)
Version: 3.51.19/5.00.11 OS: Windows (XP Professional - Build 2600.xpsp_sp2_gdr.070227-2254 (SP2))
Assigned to:
CPU Architecture: Any
Tags: connector, encoding, error, MySQL, ODBC, TEXT FIELD, utf8, varchar field

[27 Aug 2007 17:38] horvoje bob
Description:
I created a UTF-8 MySQL table with two fields, one varchar and one text, both containing identical Croatian characters.
Then I linked this table into MS Access using the MySQL ODBC connector.
I expected to see all characters as I typed them, but I got garbled results:

- Using connector 3.51.19 I got question marks instead of almost all Croatian characters in both fields, varchar and text.

- Using connector 5.00.11 I got correct characters in the varchar field, but funny characters in the text field.
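
The two symptoms above resemble two common failure modes, sketched here in Python purely as an illustration. The sample string, the helper names, and the code pages chosen (cp1252 for the lossy conversion, cp1250 for the wrong decode) are assumptions for the sketch; nothing in this report confirms which conversion either connector version actually performs.

```python
# Hypothetical sample text; the real test data is in the attached hr.sql.
SAMPLE = "čćžšđ ČĆŽŠĐ"


def lossy_convert(text: str, codec: str = "cp1252") -> str:
    """Convert to a narrower character set; unmappable characters become '?'."""
    return text.encode(codec, errors="replace").decode(codec)


def misdecode(text: str, wrong_codec: str = "cp1250") -> str:
    """Encode as UTF-8, then read the bytes back as a single-byte code page."""
    return text.encode("utf-8").decode(wrong_codec, errors="replace")


if __name__ == "__main__":
    # Question marks for every character the narrow code page lacks.
    print(lossy_convert(SAMPLE))
    # Each non-ASCII character turns into two unrelated characters.
    print(misdecode(SAMPLE))
```

Under these assumptions, the first function yields question marks for most (not all) of the Croatian letters, and the second yields two unrelated characters per accented letter, which is the usual look of UTF-8 bytes read with the wrong character set.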

How to repeat:
Create a single MySQL table with two fields, one varchar and one text.
Put some Croatian characters in those fields.
Link this table into an MS Access database.
When you open the linked table in MS Access, you should see the garbled characters.

Suggested fix:
Connector version 5.00.11 works correctly with varchar fields, so text fields should be handled the same way.
[27 Aug 2007 17:41] horvoje bob
The SQL script to create test tables.

Attachment: hr.sql (text/x-sql), 1.18 KiB.

[27 Aug 2007 17:42] horvoje bob
The database created in MySQL Front - Croatian characters are displayed correctly.

Attachment: img_1.png (image/png, text), 10.36 KiB.

[27 Aug 2007 17:43] horvoje bob
Tables linked in MS Access - you can see the Croatian characters transformed into something unreadable.

Attachment: img_2.png (image/png, text), 15.20 KiB.

[28 Aug 2007 8:55] Tonci Grgin
Hi Hrvoje and thanks for your report.

- Using connector 3.51.19 I got question-marks instead of almost all Croatian characters in both fields, varchar and text.

This is a known limitation in the 3.51 branch, and we are actively working on fixing it.

- Using connector 5.00.11 I got correct characters in varchar field, but funny characters in text field.

This is a known problem in the 5.0 branch that has already been reported (Bug#28617). As 5.0 is still in beta, please concentrate on 3.51.

- MS Access: Access uses UTF-16 by default (as Windows does), so you should expect problems there regardless of which MySQL connector is used.
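
To make the encoding difference concrete, here is a small Python sketch showing that the same Croatian character has different byte sequences in UTF-8 and UTF-16, so some layer between the server and Access has to transcode. The function names are hypothetical, and this only illustrates the encodings themselves, not how Access or the connector performs the conversion.

```python
def byte_views(text: str) -> dict:
    """Return the hex bytes of the text in UTF-8 and in UTF-16 (little-endian)."""
    return {
        "utf-8": text.encode("utf-8").hex(" "),
        "utf-16-le": text.encode("utf-16-le").hex(" "),
    }


def utf8_to_utf16(raw: bytes) -> bytes:
    """Transcode UTF-8 bytes to UTF-16 little-endian bytes."""
    return raw.decode("utf-8").encode("utf-16-le")


if __name__ == "__main__":
    for ch in "čćžšđ":
        print(ch, byte_views(ch))
```

If the transcoding step is skipped or applied with the wrong source character set, the receiving side interprets the bytes incorrectly.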

That leaves Croatian characters and MySQL itself. In my experience of about 8 years using MySQL in business systems, this just works. The main problem is an illegal or incorrect mixture of character sets: you have a cp1250 database with a UTF8 table in it. Personally, I always used latin1 / latin1_swedish_ci in my projects, as all characters are mapped correctly and most Windows clients can represent our characters. Back in the days of MySQL server 3.23 there was no Croatian support anyway. Do you really need UTF8?
Using MyODBC 3.51.19 with your script (I only removed "default charset=utf8" from the ") ENGINE=InnoDB..." clause), I get correct results in the MS generic ODBC client (see attached image).

As for the report in general, it brings nothing new, just known problems, so I'll set it to "Duplicate". If you have any more questions, please feel free to ask.
[28 Aug 2007 8:56] Tonci Grgin
Generic MS ODBC client odbcte32.exe producing correct results

Attachment: bug30647.jpg (image/jpeg, text), 111.95 KiB.