Bug #31279 SELECT DISTINCT CAST AS CHAR truncates results
Submitted: 28 Sep 2007 13:59 Modified: 12 Jun 2008 13:07
Reporter: Daniel Gemeinert Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:5.0.51a OS:Windows (XP SP3, Server 2003 WE & EE)
Assigned to: CPU Architecture:Any
Tags: bug, cast, char, distinct, error, SQL

[28 Sep 2007 13:59] Daniel Gemeinert
Description:
Assume you have a table with an integer column f.e. table "cds" from "cdcol" sample database.

Executing the following statement truncates result to 3 chars:

SELECT DISTINCT CAST('jahr' as CHAR) FROM 'cds'

--> replace 'jahr' with 'year' in english version

This problem does not appear on unix based mysql database server.

It did not occur just 1 Month ago. There might have been a windows update in the last 4 weeks that causes this problem.

The problem also does not occur when executing "SELECT CAST('jahr' as CHAR) FROM 'cds'" or "SELECT DISTINCT CAST('jahr' as CHAR(4)) FROM 'cds'"

How to repeat:
Execute "SELECT DISTINCT CAST('jahr' as CHAR) FROM 'cds'" on "cdcol" - sample database on windows based mysql database server.
[28 Sep 2007 14:06] Rene MT
I'm able to reproduce this behavior.

Regards,
ReneMT
[28 Sep 2007 14:07] MySQL Verification Team
Thank you for the bug report. Could you please upgrade for the latest released
version, the version reported is pretty older. If the behavior reported is still
present in the newest version please provide a complete test script. Thanks in
advance.
[28 Sep 2007 14:28] Rene MT
Even with the latest version included in XAMPP(5.0.45-community-nt) I'm still able to reproduce this.
[28 Sep 2007 15:09] Daniel Gemeinert
I want to describe our scenario in more detail:

There are 3 mysql database servers in use:

Server#1: Server Version: 5.0.15 running on Linux
Server#2: Server Version: 5.0.41-community-nt running on MS Windows Server 2003 Web Edition sp2
Server#3: Server Version: 5.0.45-community-nt running on MS Windows XP sp2

Connecting to server#1 using PhpMyadmin (remote) or MySQL Connector.Net 5.1.2 (remote) does not cause any problems.

Connecting to Server#2 or Server#3 using PhpMyadmin (local & remote) or MySql Connector.Net 5.1.2 (local [Windows app] & remote [Windows app & ASP.NET app]) causes the problem, EXCEPT:

It is an ASP.NET website hosted on IIS on Server#2 connecting to Server#2´s own MySQL database using MySQL Connector.Net 5.1.2. In this case, the problem does not occur.
[18 Oct 2007 10:23] Sveta Smirnova
Thank you for the feedback.

Please run same query for server #3 using mysql command line client and if problem still exists provide output of SHOW VARIABLES LIKE 'char%' and SHOW VARIABLES LIKE 'collation%'
[18 Oct 2007 10:51] Rene MT
As I'm working with Daniel on the same project I tested it with the command line client as proposed. The problem does not occur in this case, the year is not truncated.
[18 Oct 2007 11:03] Sveta Smirnova
Rene,

thank you for the feedback.

In this case problem looks like not MySQL, but PHPMyAdmin or Connector/NET application issue.

> It is an ASP.NET website hosted on IIS on Server#2 connecting to Server#2´s own MySQL
> database using MySQL Connector.Net 5.1.2. In this case, the problem does not occur.

This shows problem is not Connector/NET, but application which uses it.

So I'll close the report as "Not a Bug".
[18 Oct 2007 11:19] Rene MT
To me, the fact that phpMyAdmin as well as applications using Connector/NET are affected seems strange.

However - I also did report this error at the phpMyAdmin project at http://sourceforge.net/tracker/index.php?func=detail&aid=1815699&group_id=23067&atid=37740...
[9 Jun 2008 8:20] Daniel Gemeinert
The problem described above still exists and should have something to do with utf8 character settings.

Here are the results of SHOW VARIABLES LIKE 'char%':
character_set_client         utf8
character_set_connection     utf8
character_set_database       utf8
character_set_filesystem     binary
character_set_results        utf8
character_set_server         utf8
character_set_system         utf8
character_sets_dir           C:\xampp\mysql\share\charsets\

SHOW VARIABLES LIKE 'collation%':
collation_connection   utf8_general_ci
collation_database     utf8_unicode_ci
collation_server       utf8_general_ci

We are now using xampp 1.6.6a (MySQL 5.0.51a). MySQL character setting is UTF-8 Unicode (utf8). Character setting / collation of the connection is utf8_unicode_ci. We have created a new database including one table with collation utf8_unicode_ci. There is an integer field in this table having values like 12345, but "SELECT DISTINCT CAST(intfield as CHAR) FROM utf8testtable" returns 123 (only 3 characters).
[12 Jun 2008 13:07] Susanne Ebrecht
Daniel,

many thanks for your feedback.

This is no bug.

When you use Windows, you have to figure out first, which encoding your Windows is using.

Usually a German Windows is using code page 850.

You can figure it out by typing: CHCP in the command line.

When your Windows is using cp850 then you have to use "set names cp850" first, before you type something.

SET NAMES will set character_set_client, character_set_result, character_set_connection and some more necessary values.

For inserting data you have to set character_set_client (and eventually character_set_connection) to the encoding that your client is using. If the client is Windows and Windows is using CP850 then its CP850.

Only when you set character_set_client to the encoding that you are using for inserting data then the system can automatically and transparent convert it to UTF8 and store right in your database.

The same for output data. Here you need character_set_result too. Only when the system knows to which character set it has to decode the values it can display it right.

If your data are stored right in the database you can check with a very easy test:

Your column of the table is UTF8:
Now insert a single 'ä' or 'ü' or 'ß' or 'ö' into it.
then SELECT LENGTH(columnname) from yourtable;
If the result is 2 then its stored right. Because in UTF8 the German Umlauts need 2 bytes.
By using latin1 is just needs 1 bytes.

So if the result is 1 you have stored the data wrong in your database.

One big advantage from PHP is that it ignores character sets but this also means that you have to know what you are doing here. You have to fill in your data by using the right encoding. Usually this is the encoding that your browser/system is using.
[12 Jun 2008 13:42] Sebastian Mendel
this is a very common error, people often do not know that they should execute SET NAMES first

http://bugs.mysql.com/37367