Bug #65727 Unicode minus sign should be accepted as minus-operator
Submitted: 25 Jun 2012 12:24 Modified: 28 Jun 2012 19:38
Reporter: Peter Laursen (Basic Quality Contributor)
Status: Won't fix
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: Paul Dubois CPU Architecture:Any
Tags: qc
Triage: Needs Triage: D4 (Minor)

[25 Jun 2012 12:24] Peter Laursen
Description:
(not sure about the category here!) 

references: 
http://www.fileformat.info/info/unicode/char/2212/index.htm
http://en.wikipedia.org/wiki/Plus_and_minus_signs

The latter says (but this is of course not law!): "The hyphen-minus sign (-) is the ASCII version of the minus sign, and doubles as a hyphen. It is usually shorter in length than the plus sign and sometimes at a different height. It can be used as a substitute for the true minus sign when the character set is limited to ASCII."

(also inspired by http://datacharmer.blogspot.dk/2012/06/hidden-mistake.html).

How to repeat:
SET NAMES utf8;
SELECT (7 − 2);  -- 'unicode minus' ; returns 1064 syntax error
SELECT (7 - 2);  -- 'hyphen-minus'  ; returns "5"
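
For illustration only (not part of the original report): the two characters in the statements above look alike but are different code points, which can be checked outside MySQL. A minimal Python sketch; the helper name `describe` is made up for this example.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


unicode_minus = "\u2212"  # the character used in the failing statement
hyphen_minus = "-"        # the ASCII character used in the working statement

print(describe(unicode_minus))         # U+2212 MINUS SIGN
print(describe(hyphen_minus))          # U+002D HYPHEN-MINUS
print(unicode_minus.encode("utf-8"))   # b'\xe2\x88\x92' (three bytes in utf8)
```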

Suggested fix:
As all charsets involved here are Unicode, I think the Unicode minus sign should be a valid minus operator? If not, I think docs should clearly state that it is not so.

(but don't change the behavior of hyphen, please!)
[25 Jun 2012 13:02] Peter Laursen
OK - understood that the Unicode minus is primarily intended for typesetting etc. Programming languages (still mostly) use ASCII. So this is probably just a consequence of MySQL being a C derivative.

So changing category to docs where 
http://dev.mysql.com/doc/refman/5.5/en/arithmetic-functions.html#operator_minus
and 
http://dev.mysql.com/doc/refman/5.5/en/arithmetic-functions.html#operator_unary-minus
.. could have a clarification. As I understand Shlomi's comment on Giuseppe's blog, CSV files with negative numbers will sometimes (when generated by recent Excel versions?) use the Unicode minus, and LOAD DATA would then fail.
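
A possible workaround for the CSV case, sketched in Python under stated assumptions: the file is preprocessed before LOAD DATA, and only fields that consist of a Unicode minus followed by a plain decimal number are rewritten, so ordinary text fields are left alone. The function names and the numeric pattern are hypothetical choices for this sketch, not anything MySQL provides.

```python
import csv
import io
import re

UNICODE_MINUS = "\u2212"  # U+2212 MINUS SIGN
HYPHEN_MINUS = "-"        # U+002D HYPHEN-MINUS

# Assumption: a "negative number" is a Unicode minus followed by digits,
# optionally with a decimal point. Adjust for other locales or formats.
NEGATIVE_NUMBER = re.compile("^" + UNICODE_MINUS + r"\d+(?:\.\d+)?$")


def normalize_field(field: str) -> str:
    """Swap a leading Unicode minus for hyphen-minus in numeric fields only."""
    if NEGATIVE_NUMBER.match(field):
        return HYPHEN_MINUS + field[1:]
    return field


def normalize_csv(src, dst) -> None:
    """Copy CSV rows from src to dst, normalizing negative numbers."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow([normalize_field(f) for f in row])


def normalize_csv_text(text: str) -> str:
    """Convenience wrapper that works on an in-memory string."""
    out = io.StringIO()
    normalize_csv(io.StringIO(text), out)
    return out.getvalue()
```

For real files, `normalize_csv` could be called with two file objects opened with `encoding="utf-8"` and `newline=""`, and the output file then passed to LOAD DATA.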
[28 Jun 2012 19:38] Paul Dubois
PostgreSQL and GCC don't accept Unicode minus, either. MySQL is not unusual in this regard.