Bug #13233 select distinct char(column) fails with utf8
Submitted: 15 Sep 2005 18:51 Modified: 23 Oct 2005 2:46
Reporter: Richard Wesley
Status: Closed
Category:Server Severity:S2 (Serious)
Version:4.1.14-nt, 4.1.10-standard OS:Microsoft Windows (Windows/Linux)
Assigned to: Alexander Barkov Target Version:

[15 Sep 2005 18:51] Richard Wesley
Description:
The results of a select distinct query involving a CHAR() function on a column depends on
the character set of the connection.  Connecting from a mysql client under Linux gives
correct answers (as determined by comparing the results to similar queries under SQL
Server, Jet and Oracle), but if you SET NAMES to utf8, only two distinct values are
returned, and they are NULL and some sort of blank string.  This happens both from the
mysql command line tool under Linux and using the Connector/ODBC 3.51.11 tool.

How to repeat:
1. Create a table called 'Calcs' with a single column:

+--------+
| num4   |
+--------+
|   NULL |
|  10.85 |
| -13.47 |
|  -6.05 |
|   8.32 |
|  10.71 |
|   NULL |
| -10.24 |
|   4.77 |
|   NULL |
|  19.39 |
|   3.82 |
|   3.38 |
|   NULL |
| -14.21 |
|   6.75 |
|   NULL |
+--------+

2.  Issue the query

mysql> select distinct if((num4*7 >= 0) AND (num4*7 < 256), CHAR(num4*7), NULL) as
z_char_num from Calcs;

This should produce the results:

+------------+
| z_char_num |
+------------+
| NULL       |
| K          |
| :          |
| J          |
| !          |
| ?          |
|           |
|           |
| /          |
+------------+
9 rows in set (0.06 sec)

3. Now change to utf8 for communication:

mysql>  SET NAMES utf8;

4. Issue the same query as in Step 2.

Expected:  The same results as in Step 2.
Actual:

+------------+
| z_char_num |
+------------+
| NULL       |
|            |
+------------+
2 rows in set (0.00 sec)
[15 Sep 2005 19:02] Richard Wesley
The problem does not occur with constants:

mysql> select distinct char(10.85*7) from Calcs;

Produces 'L', as expected.

The same problem also happens with a Linux 4.1.10-standard server.
[16 Sep 2005 10:21] Vasily Kishkin
Thanks for bug report. I was able to reproduce the bug. On 4.1.14 I found the follow
strange results:

mysql> select distinct char(75.95) from Calcs;
+-------------+
| char(75.95) |
+-------------+
| L           |
+-------------+
1 row in set (0.00 sec)

mysql> select distinct char(10.85*7) from Calcs;
+---------------+
| char(10.85*7) |
+---------------+
| K             |
+---------------+
1 row in set (0.00 sec)

I checked on Windows 2003
[23 Sep 2005 9:56] Vasily Kishkin
I've checked on Linux SUSE 9.3. There is same result.
[10 Oct 2005 10:48] Alexander Barkov
An easier test demonstrating the same problem:

mysql> create table t1 (a int);
mysql> insert into t1 values (48),(49),(50);
mysql> set names utf8;
mysql> select distinct char(a) from t1;
+---------+
| char(a) |
+---------+
|         |
+---------+
1 row in set (0.00 sec)

The same work fine with latin1:

mysql> create table t1 (a int);
mysql> insert into t1 values (48),(49),(50);
mysql> set names latin1;
mysql> select distinct char(a) from t1;
+---------+
| char(a) |
+---------+
| 0       |
| 1       |
| 2       |
+---------+
3 rows in set (0.01 sec)
[10 Oct 2005 10:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30852
[21 Oct 2005 14:10] Alexander Barkov
Fixed in 4.1.16 and 5.0.16
[23 Oct 2005 2:46] Paul DuBois
Noted in 4.1.16, 5.0.16 changelogs.