Bug #68557 sjis changes in 5.5 break date parsing
Submitted: 4 Mar 2013 8:12 Modified: 12 Nov 2019 5:21
Reporter: Hartmut Holzgraefe Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:mysql 5.5.30 OS:Any
Assigned to: CPU Architecture:Any

[4 Mar 2013 8:12] Hartmut Holzgraefe
Description:
sjis has changed quite a bit between 5.1 and 5.5 (implementation changed from ~4K lines of code to ~35K)

One of the changes seems to be that '/' is no longer considered to be a punctuation character. As String to DATE conversion allows for any punctuation character to be used as separator between year, month and day instead of the default '-'.

This leads to code using date strings like '2013/02/28' which worked fine with sjis in 5.1 to stop working in 5.5

 The manual does just say 

  A “relaxed” syntax is permitted: Any punctuation character 
  may be used as the delimiter between date parts. 
  For example, '2012-12-31', '2012/12/31', '2012^12^31', 
  and '2012@12@31' are equivalent. 

  < http://dev.mysql.com/doc/refman/5.5/en/date-and-time-literals.html#date-and-time-string-nu... >

but does not mention that what qualifies as punctuation 
character actually seems to be charset/collation specific
so that code relying on one of the "relaxed" forms may
break when changing to a different charset/collation

How to repeat:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (d DATE);
INSERT INTO t1 VALUES('2013-02-28');

SET NAMES utf8;
SELECT COUNT(*) FROM t1 WHERE d = '2013/02/28';

SET NAMES sjis;
SELECT COUNT(*) FROM t1 WHERE d = '2013/02/28';

COUNT() result is 1 in both cases on 5.1, but 0 for the 2nd query on 5.5

Suggested fix:
Either

* check for default US ASCII (or latin1) punctuation characters only instead of checking against the current connection collation/charset

=> maximum portability of the "relaxed" format

or

* check against both the default ASCII/latin set of punctuation characters and those that are defined by the current charset/collation

or

* check for charset/collation punctuation chars plus a few hard coded ones like '/' that are in common use with date strings

or

* make '/' a punctuation character in sjis even though at least http://en.wikipedia.org/wiki/Japanese_punctuation says it isn't
[4 Mar 2013 9:53] MySQL Verification Team
Hello Hartmut,

Thank you for the report.
Verified as described.

Thanks,
Umesh
[12 Nov 2019 5:21] Chandan Kunal
Posted by developer:
 
The main source of the problem is allowing sloppy syntax in the first place. If we had stuck to strict SQL (and ISO8601) compliance, this bug would never have occurred. This is also why we consider deprecating sloppy syntax in e.g date/time literals. 
Fixing this bug will take us in the wrong direction also sjis and ujis are broken in various ways..

CAST function can be used as workaround for this bug. For example query mentioned in the bug page can be reworked as follows:
          SELECT * FROM t1 WHERE d = CAST('2013/02/28' AS datetime);