| Bug #314 | GROUP BY forces functions to treat multy-byte charset values as single-byte | ||
|---|---|---|---|
| Submitted: | 22 Apr 2003 17:01 | Modified: | 26 Apr 2003 4:22 |
| Reporter: | Alexander Keremidarski | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server | Severity: | S3 (Non-critical) |
| Version: | 4.0 | OS: | |
| Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[25 Apr 2003 9:09]
Michael Widenius
This is a display problem in the mysql command line client that is probably not likely to happen in a client that support ujis. The test case is also not repeatable as it doesn't contain all information to repeat it.
[25 Apr 2003 9:24]
Alexander Keremidarski
It is not display problem only. Here is variation of above test with using HEX() values: mysql> select HEX(c), length(c) as lc1, HEX(left(c,1)) as c1 from x order by c; +----------------------+------+------+ | HEX(c) | lc1 | c1 | +----------------------+------+------+ | 61 | 1 | 61 | | 61626364 | 4 | 61 | | 61757434686766 | 7 | 61 | | 62 | 1 | 62 | | 63 | 1 | 63 | | 64 | 1 | 64 | | A4A2 | 2 | A4A2 | | A4A2A4A4A4A6A4A8A4AA | 10 | A4A2 | | A4A2A4ABA4B5A4BFA4CA | 10 | A4A2 | | A4A4 | 2 | A4A4 | | A4A6 | 2 | A4A6 | | A4A8 | 2 | A4A8 | | A4AA | 2 | A4AA | +----------------------+------+------+ 13 rows in set (0.01 sec) mysql> select HEX(c), length(c) as lc1, HEX(left(c,1)) as c1 from x group by c order by c; +----------------------+------+------+ | HEX(c) | lc1 | c1 | +----------------------+------+------+ | 61 | 1 | 61 | | 61626364 | 4 | 61 | | 61757434686766 | 7 | 61 | | 62 | 1 | 62 | | 63 | 1 | 63 | | 64 | 1 | 64 | | A4A2 | 2 | A4 | | A4A2A4A4A4A6A4A8A4AA | 10 | A4 | | A4A2A4ABA4B5A4BFA4CA | 10 | A4 | | A4A4 | 2 | A4 | | A4A6 | 2 | A4 | | A4A8 | 2 | A4 | | A4AA | 2 | A4 | +----------------------+------+------+
[26 Apr 2003 4:22]
Michael Widenius
Thank you for your bug report. This issue has been fixed in the latest
development tree for that product. You can find more information about
accessing our development trees at
http://www.mysql.com/doc/en/Installing_source_tree.html
Fixed a bug concerning RIGHT, LEFT & MID when used with multi-byte-character-sets and GROUP BY

Description: Looks like when GROUP BY is used for some reason values are Displayed wrong assuming they are single byte while they contain correct multibyte content. How to repeat: Start mysqld with --default-character-set=ujis insert into x values ('あ'),('い'),('う'),('え'),('お'),('あいうえお'),('あかさたな'); /* Same as above insert into x values ('\xa4\xa2'),('\xa4\xa4'),('\xa4\xa6'),('\xa4\xa8'), ('\xa4\xaa'), ('\xa4\xa2\xa4\xa4\xa4\xa6\xa4\xa8\xa4\xaa'), ('\xa4\xa2\xa4\xab\xa4\xb5\xa4\xbf\xa4\xca'); */ insert into x values ('a'),('b'),('c'),('d'),('abcd'),('aut4hgf'); /* Observe result of following queries. Might be unreadable in HTML */ mysql> select c, length(c) as lc1, left(c,1) as c1, length(left(c,1)) as lc1, ord(left(c,1)), char(ord(left(c,1))) oc1 from x order by c; +------------+------+------+------+----------------+-----+ | c | lc1 | c1 | lc1 | ord(left(c,1)) | oc1 | +------------+------+------+------+----------------+-----+ | a | 1 | a | 1 | 97 | a | | abcd | 4 | a | 1 | 97 | a | | aut4hgf | 7 | a | 1 | 97 | a | | b | 1 | b | 1 | 98 | b | | c | 1 | c | 1 | 99 | c | | d | 1 | d | 1 | 100 | d | | あ | 2 | あ | 2 | 42146 | あ | | あいうえお | 10 | あ | 2 | 42146 | あ | | あかさたな | 10 | あ | 2 | 42146 | あ | | い | 2 | い | 2 | 42148 | い | | う | 2 | う | 2 | 42150 | う | | え | 2 | え | 2 | 42152 | え | | お | 2 | お | 2 | 42154 | お | +------------+------+------+------+----------------+-----+ 13 rows in set (0.00 sec) mysql> select c, length(c) as lc1, left(c,1) as c1, length(left(c,1)) as lc1, ord(left(c,1)), char(ord(left(c,1))) oc1 from x group by c order by c; +------------+------+------+------+----------------+-----+ | c | lc1 | c1 | lc1 | ord(left(c,1)) | oc1 | +------------+------+------+------+----------------+-----+ | a | 1 | a | 1 | 97 | a | | abcd | 4 | a | 1 | 97 | a | | aut4hgf | 7 | a | 1 | 97 | a | | b | 1 | b | 1 | 98 | b | | c | 1 | c | 1 | 99 | c | | d | 1 | d | 1 | 100 | d | | あ | 2 | | 2 | 42146 | | | あいうえお | 10 | | 2 | 42146 | | | あかさたな | 10 | | 2 | 42146 | | | い | 2 | | 2 | 42148 | | | う | 2 | | 2 | 42150 | | | え | 2 | | 2 | 42152 | | | お | 2 | | 2 | 42154 | | +------------+------+------+------+----------------+-----+ 13 rows in set (0.00 sec) mysql> select c, length(c) as lc, left(c,2) as c2, length(left(c,2)) as lc2, ord(left(c,2)), char(ord(left(c,2))) oc2 from x group by c order by c; +------------+------+------+------+----------------+-----+ | c | lc | c2 | lc2 | ord(left(c,2)) | oc2 | +------------+------+------+------+----------------+-----+ | a | 1 | a | 1 | 97 | a | | abcd | 4 | ab | 2 | 97 | a | | aut4hgf | 7 | au | 2 | 97 | a | | b | 1 | b | 1 | 98 | b | | c | 1 | c | 1 | 99 | c | | d | 1 | d | 1 | 100 | d | | あ | 2 | あ | 2 | 42146 | | | あいうえお | 10 | あ | 4 | 42146 | | | あかさたな | 10 | あ | 4 | 42146 | | | い | 2 | い | 2 | 42148 | | | う | 2 | う | 2 | 42150 | | | え | 2 | え | 2 | 42152 | | | お | 2 | お | 2 | 42154 | | +------------+------+------+------+----------------+-----+