Bug #69869 | Spaces aren't spaces | ||
---|---|---|---|
Submitted: | 30 Jul 2013 7:32 | Modified: | 31 Jul 2013 8:55 |
Reporter: | Owen Miller | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Parser | Severity: | S4 (Feature request) |
Version: | 5.7.2 | OS: | Linux (Ubuntu 12.04) |
Assigned to: | CPU Architecture: | Any | |
Tags: | , 1064, space, Unicode |
[30 Jul 2013 7:32]
Owen Miller
[30 Jul 2013 15:08]
MySQL Verification Team
I think this bug report is about the parser. Previously I opened similar one internally about actual data. Bug 14271638 - CHARACTER SET DUPLICATE HANDLING SEEMS BROKEN The crux of the matter was some character codes are different, but appear the same so the only sure way to represent all varchar is in hex notation otherwise dump/reload with mysqldump can corrupt or give duplicate errors, e.g: mysql> select hex(a),a from t1; +--------+------+ | hex(a) | a | +--------+------+ | 29 | ) | | A4 | ) | +--------+------+ 2 rows in set (0.00 sec)
[31 Jul 2013 6:17]
Owen Miller
yeah sorry, I should've said that this was in the parser, not the data tables. It was where I was writing my SELECT statements. I'm not quite sure what you mean about the hex representation, but these are indeed different characters. It's only from a human point of view that they're the same.
[31 Jul 2013 8:55]
MySQL Verification Team
I believe currently the function my_isspace determines whether given char is whitespace or not. That is defined as: #define my_isspace(s, c) (((s)->ctype+1)[(uchar) (c)] & _MY_SPC) #define _MY_SPC 010 /* Spacing character */ In unicode there are more whitespace characters for example: http://en.wikipedia.org/wiki/Whitespace_character