Bug #68557 sjis changes in 5.5 break date parsing
Submitted: 4 Mar 2013 8:12 Modified: 9 Jan 2015 22:04
Reporter: Hartmut Holzgraefe Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:mysql 5.5.30 OS:Any
Assigned to: CPU Architecture:Any
Triage: Needs Triage: D3 (Medium)

[4 Mar 2013 8:12] Hartmut Holzgraefe
Description:
sjis has changed quite a bit between 5.1 and 5.5 (implementation changed from ~4K lines of code to ~35K)

One of the changes seems to be that '/' is no longer considered to be a punctuation character. As String to DATE conversion allows for any punctuation character to be used as separator between year, month and day instead of the default '-'.

This leads to code using date strings like '2013/02/28' which worked fine with sjis in 5.1 to stop working in 5.5

 The manual does just say 

  A “relaxed” syntax is permitted: Any punctuation character 
  may be used as the delimiter between date parts. 
  For example, '2012-12-31', '2012/12/31', '2012^12^31', 
  and '2012@12@31' are equivalent. 

  < http://dev.mysql.com/doc/refman/5.5/en/date-and-time-literals.html#date-and-time-string-nu... >

but does not mention that what qualifies as punctuation 
character actually seems to be charset/collation specific
so that code relying on one of the "relaxed" forms may
break when changing to a different charset/collation

How to repeat:
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (d DATE);
INSERT INTO t1 VALUES('2013-02-28');

SET NAMES utf8;
SELECT COUNT(*) FROM t1 WHERE d = '2013/02/28';

SET NAMES sjis;
SELECT COUNT(*) FROM t1 WHERE d = '2013/02/28';

COUNT() result is 1 in both cases on 5.1, but 0 for the 2nd query on 5.5

Suggested fix:
Either

* check for default US ASCII (or latin1) punctuation characters only instead of checking against the current connection collation/charset

=> maximum portability of the "relaxed" format

or

* check against both the default ASCII/latin set of punctuation characters and those that are defined by the current charset/collation

or

* check for charset/collation punctuation chars plus a few hard coded ones like '/' that are in common use with date strings

or

* make '/' a punctuation character in sjis even though at least http://en.wikipedia.org/wiki/Japanese_punctuation says it isn't
[4 Mar 2013 9:53] Umesh Shastry
Hello Hartmut,

Thank you for the report.
Verified as described.

Thanks,
Umesh