Bug #29367 Encoding issue with ODBC Connector 3.51.16 on IIS/ASP pages
Submitted: 26 Jun 2007 17:27 Modified: 28 Sep 2007 12:07
Reporter: Alexandre Madurell Email Updates:
Status: Duplicate Impact on me:
None 
Category:Connector / ODBC Severity:S2 (Serious)
Version:3.51.16 OS:Windows (XP Professional IIS5.0)
Assigned to: CPU Architecture:Any
Tags: ASP, connector, IIS, utf8

[26 Jun 2007 17:27] Alexandre Madurell
Description:
The exact same ASP code running on two computers differing only in the ODBC connector version result in different encoding when retrieving data with a "SELECT" instruction in ASP.

The computer running ODBC Connector 3.51.12 retrieves utf8-encoded data correctly.
The computer running ODBC Connector 3.51.16 retrieves utf8-encoded data incorrectly (it attempts to display it as latin characters).

The data origin is the same in both cases (this is not a my.ini issue or a MySQL Community Server version at all).

How to repeat:
On Windows XP with IIS and MySQL ODBC Connector 3.51.16 installed:

1. Create an ASP page like:

<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="es" lang="es">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Encoding test</title>
</head>
<body>
<%
    Session.CodePage=65001
    Response.Charset="utf-8"

    Connection_str= "Driver={mySQL ODBC 3.51 Driver}; Server=aMySQLServer; Port=3306; Database=aDatabase; Uid=aUser; Pwd=aPassword; Option=3;"

    query_str="SELECT aVarCharField FROM aTable"
    set RS = Server.CreateObject("ADODB.Recordset")
	RS.ActiveConnection = Connection_str
	RS.Source = query_str
	RS.CursorType = 3
	RS.CursorLocation = 2
	RS.LockType = 3
	RS.Open()
    Response.Write RS("aVarCharField")
%>
</body>
</html>

Make sure the table has been created with utf8 encoding and that the VARCHAR field contains some text with accents (á, é...) or tilde (ã, õ...).

When accessing the ASP page from the browser, the "odd" characters will be shown as little square boxes.

Now, uninstall ODBC Connector 3.51.16. Reboot the computer. Install ODBC Connector 3.51.12.

Browse for the same page again. Now the characters are properly displayed.

Suggested fix:
I tried a gazillion things before this (opening the connection and executing a "SET NAMES utf8" before opening the recordset, modifying the my.ini so as to have everything in utf8 by default, etc). 

The only thing that worked was switching back to the 3.51.12 Connector.
[5 Jul 2007 10:01] Alexandre Madurell
Update: I was still having problems with russian text on 3.51.12, so I tried once again the 3.51.16. All other text seemed to work fine this time, except the russian one.

Another curious thing: With either one version, if I set a LongText field with a utf8_bin collation, the utf8 data stored in the table (containing german or french characters which I can see and edit perfectly in MySQL Query Browser) is displayed as chinese characters in the asp page (even the regular characters!).

Switching collation to utf8_unicode_ci allows the text to be displayed properly.

Still, the russian characters don't appear as chinese ones, but as question marks instead, no matter what collation I use.

Any advise is very welcome.
[26 Jul 2007 10:13] Tonci Grgin
Hi Alexandre and thanks for your report.

Can you please do the following:
 - Attach my.ini file
 - Attach general query log at the time update / insert of data occurs
 - If possible, attach table files with data that is displayed wrongly (frm, myi and myd)
 - If not, please attach DDL script with table structure and some data
 - ODBC DM trace might help too

I suspect there is a mixup with charsets and collations somewhere. You can also try new MyODBC 3.51.17.
[26 Aug 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[29 Aug 2007 10:37] Sveta Smirnova
Russian Cyrillic in UTF8 table and MyODBC work fine for me.
[29 Aug 2007 10:51] Sveta Smirnova
test case

Attachment: bug29367.c (text/plain), 1.99 KiB.

[28 Sep 2007 10:05] Tonci Grgin
This is probably a duplicate of Bug#13465. Will consult Bogdan.
[28 Sep 2007 11:56] Bogdan Degtyariov
It looks as the problem is in ADO library. The problem appeared since SQLDescribeCol is fixed (now it returns the column size in characters instead of bytes).
Here are the details:

CREATE TABLE `companies` (
`Rdid` varchar(40) NOT NULL default '',
`MktgContactTitle` varchar(40) default NULL,
PRIMARY KEY (`Rdid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

both varchar(40) columns are multibyte columns, SELECT LENGTH(MktgContactTitle) returns 80 bytes.
When doing SELECT from the table ADO checks the size of the column in characters (not bytes) for SQLDescribeCol.
Everything is ok so far.
Next call is confusing: the application has to pass the size of the buffer in bytes for SQLBindCol()!
But the buffer size is 40 (maybe application considers that it is in chars) instead of 80 bytes.

Here you can see that the buffer length is in bytes:
http://msdn2.microsoft.com/en-us/library//ms711010.aspx
[28 Sep 2007 12:07] Tonci Grgin
After much consideration, I've decided to mark this bug a duplicate of Bug#13465.