Bug #17939 Wrong table format when using UTF8 strings
Submitted: 5 Mar 2006 14:57 Modified: 14 Sep 2006 3:48
Reporter: Christian Hammers (Silver Quality Contributor) (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0 OS:Linux (Debian GNU/Linux Sid)
Assigned to: Alexander Barkov CPU Architecture:Any

[5 Mar 2006 14:57] Christian Hammers
Description:
As reported in http://bugs.debian.org/355302 by Daniel van Eeden <daniel_e@dds.nl>:
------------------------

> > mysql> SELECT 'John Doe' as '__tañgè Ñãmé';
> > +-------------------+
> > | __tañgè Ñãmé |
> > +-------------------+
> > | John Doe          |
> > +-------------------+
> > 1 row in set (0.00 sec)

The column name is wrongly formatted. Most probably because in Unicode UTF-8 the site is not
equal to the number of bytes the string.

Further info from Daniel:

# grep character-set-server /etc/mysql/my.cnf
character-set-server    = utf8
# logout
$ grep default-character-set .my.cnf
default-character-set=utf8
$ echo ${LC_ALL}
en_US.utf8
$ locale -a | grep "en_US.utf8"
en_US.utf8
$ mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9 to server version: 5.0.18-Debian_8-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SHOW VARIABLES like 'character_set_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
7 rows in set (0.01 sec)

How to repeat:
see above

Suggested fix:
In case we did not overlook any config option and this is really a bug maybe use wcwidth() etc instead of strlen().
[6 Mar 2006 11:46] Hartmut Holzgraefe
Verified on 5.0 and 5.1, (i think it is a duplicate but i can't find the original report)

SET NAMES utf8;
CREATE TABLE foo (`foobär` int);
INSERT INTO foo VALUES(1);
SELECT * FROM foo;

+---------+
| foobär |
+---------+
|       1 |
+---------+
1 row in set (0.01 sec)
[17 Apr 2006 7:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5004
[5 Jul 2006 21:45] Timothy Smith
Hmm, I looked at this more, and I think the patch is incomplete.  It does fix the problem w/ column name, but the analogous problem exists with the column values as well.  Use a test case like:

drop table t1;
set names utf8;
create table t1 (fööbâr varchar(32));
insert into t1 values ('áéíóú');
select * from t1;

I get these results:

mysql> select * from t1;
+------------+
| fööbâr  |
+------------+
| áéíóú |
+------------+
1 row in set (0.00 sec)

I.e., field->maxlength is not reliable, and needs a ->numcells call to generate correct output.

Timothy
[19 Jul 2006 16:26] Timothy Smith
Alexander,

Thanks for clarifying that the client should be started with --default-character-set=utf8; with that, then the patch does work as described.

Regards,

Timothy
[2 Sep 2006 9:28] Timothy Smith
Merged to 5.0 (will be in 5.0.25)

TODO: add test case, and merge to 5.1
[12 Sep 2006 1:34] Paul DuBois
Noted in 4.1.12, 5.0.25 changelogs.

For table-format output, mysql did not always calculate columns
widths correctly for columns containing multi-byte characters in the
column name or contents.
[14 Sep 2006 3:48] Paul DuBois
Noted in 5.1.12 changelog.
[8 Sep 2009 13:04] Jean-Denis Muys
mysql> select * from test;
+-----------+---------+------------+------+-------------+
| nom       | prénom | télephone | âge | RKTimeStamp |
+-----------+---------+------------+------+-------------+
| Héllène | Davôs  | 1234567890 |   45 |           0 | 
+-----------+---------+------------+------+-------------+
1 row in set (0.00 sec)

mysql> describe test;
+-------------+--------------+------+-----+---------+-------+
| Field       | Type         | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| nom         | varchar(255) | NO   | PRI |         |       | 
| prénom     | varchar(255) | NO   |     |         |       | 
| télephone  | varchar(255) | NO   |     |         |       | 
| âge        | smallint(6)  | NO   |     | 0       |       | 
| RKTimeStamp | int(11)      | YES  |     | 0       |       | 
+-------------+--------------+------+-----+---------+-------+
5 rows in set (0.01 sec)

mysql>
[8 Sep 2009 13:05] Jean-Denis Muys
Oops sorry for the comment just above. I meant to mention that this bug still occurs for me with mySQL version 5.1.37 (on MacOS X)
[21 Sep 2009 9:23] Alexander Barkov
Please run mysql as follows:

mysql --default-character-set=utf8

then try these two queries.