Bug #14131 | Report PAD attribute in SHOW COLLATION and INFORMATION_SCHEMA.COLLATIONS | ||
---|---|---|---|
Submitted: | 19 Oct 2005 6:06 | Modified: | 9 Oct 2008 12:14 |
Reporter: | Jan Lindström | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S4 (Feature request) |
Version: | 5.0, 5.1, 6.0 | OS: | Linux (any) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[19 Oct 2005 6:06]
Jan Lindström
[19 Oct 2005 9:22]
Jan Lindström
Changed synopsis because column prefix index on normal varchar fields causes also 'Duplicate key error': mysql> create table t1 (a int, b varchar(10), c varchar(10), filler varchar(10), primary key(a, b(2)),unique key (a, c(2))) engine = innodb; Query OK, 0 rows affected (0.01 sec) mysql> create table t2 (a int, b varchar(10), c varchar(10), filler varchar(10), primary key(a, b(2)),unique key (a, c(2))) engine = myisam; Query OK, 0 rows affected (0.00 sec) mysql> insert into t1 values(1,'c','c','c'); Query OK, 1 row affected (0.00 sec) mysql> insert into t2 values(1,'c','c','c'); Query OK, 1 row affected (0.00 sec) mysql> insert into t2 values(1,'c ','c ','c '); ERROR 1062 (23000): Duplicate entry '1-c ' for key 1 mysql> insert into t1 values(1,'c ','c ','c '); ERROR 1062 (23000): Duplicate entry '1-c ' for key 1 SQL 2003 standard says: The comparison of two character string expressions depends on collation used for the comparison. When values of unequal size are compared, if the collation of the comparison has the NOPAD characteristic and the sorter value is equal to some prefix of the longer value, then the sorter value is considered less than the longer value. If the collation for the comparison has the PADSPACE characteristic,for the purpose of comparison, the shorter values is effectively extended to the length of the longer by concatenation of <spcace>s on the right. Manual could state which character set collations have NOPAD or PADSPACE characteristics. In addition to this SHOW FULL COLUMNS FROM x could also print this information. Regards, Jan
[19 Oct 2005 13:24]
Valeriy Kravchuk
Thank you for a problem report. Verified on latest 5.0.15 (ChangeSet@1.2007.4.1, 2005-10-18 18:51:07-07:00, patg@krsna.patg.net) just as described in the last comment.
[21 Oct 2005 8:43]
Alexander Barkov
Jan, I'm sorry I didn't understand what's the problem? Everything seems to work as expected. All MySQL collations are "PAD SPACE collations". I think it is a good idea to add "PAD SPACE" reporting into "SHOW" output. Thanks for the idea! "SHOW COLLATION" seems to be the best place. We also can add missing PAD_ATTRIBUTE column into INFORMATION_SCHEMA.COLLATIONS, defined by the standard. I'm going to put this bug into "Documenting" status to let the doc team check whether something should be added into the manual about "PAD SPACE", and ask them to move this report into "Feature request" afterwards, for extending SHOW and INFORMATION_SCHEMA. What do you think? Thanks!
[21 Oct 2005 8:52]
Jan Lindström
IMHO if you have varchar field then 'c' != 'c ' because their length is different. This should be true also when comparing keys. i.e. create table t1(a varchar(2)) engine=innodb; insert into t1 values('c'),('c '); select * from t1 where a = 'c'; should return one row not two similarly select * from t1 where a = 'c '; and select * from t1 where a = 'c '; // this should not return any rows, now returns 2 Regards, Jan
[21 Oct 2005 9:30]
Alexander Barkov
According to the standard, comparison of CHAR/VARCHAR fields behaviour depends solely on collation PADSPACE/NOPAD attribute. Data type doesn't matter. All MySQL collations are PADSPACE. This is not a bug. Everything works as expected.
[21 Oct 2005 9:37]
Alexander Barkov
Dear documentation team: Please check if the manial mentions that MySQL collations are "PAD SPACE" collations, and thus trailing spaces in CHAR/VARCHAR do not matter during comparison. This is currently true for all collations. Note, for BINARY/VARBINARY datatypes trailing spaces do matter. In the future we'll possibly also have NOPAD collations for CHAR/VARCHAR. Afterwards, please change status of this bug to feature request. It is a good idea to report PAD attribute in SHOW COLLATION and INFORMATION_SCHEMA. Thanks!
[27 Oct 2005 8:49]
Jon Stephens
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant product(s). Additional info: Updated the CHAR/VARCHAR section of the Column Types chapter in the 4.1/5.0/5.1 versions of the Manual with this info after I verified the behaviour in 4.0.25, 5.0.15, and 5.1.2-bk. Changed status to S3/Fix Later. Please advise if this behaviour is not true for any version prior to 4.0 and I'll amend the 4.1 Manual accordingly. Thanks!