Bug #68142 UCA / LDML parser does not complain about invalid/unsupported backslash sequence
Submitted: 22 Jan 2013 14:45 Modified: 22 Jan 2013 18:31
Reporter: Hartmut Holzgraefe
Status: Verified
Category: MySQL Server: Charsets Severity: S3 (Non-critical)
Version: mysql-5.6.9 OS: Any
Assigned to: CPU Architecture:Any

[22 Jan 2013 14:45] Hartmut Holzgraefe
Description:
According to the documentation (and, as far as I can tell, the code), the only backslash sequences supported in LDML rules are \u... sequences, which specify a Unicode code point by its hexadecimal number after the \u.

However, forgetting the 'u' after the '\' does not raise any error ...

How to repeat:

  <collation name="utf8_test" id="253">
    <rules>
     <reset>A</reset>
     <p>\00c4</p><!-- should be Ä. missing 'u' here -->
     <t>\u00e4</t><!-- ä -->
    </rules>
  </collation>

Suggested fix:
Verify that a \ is followed by a valid/supported sequence ...
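
As a rough illustration of the kind of check meant here, the following Python sketch scans a rule string and reports every backslash that does not start a well-formed \u escape. It is not the server's parser: the function name find_invalid_escapes is hypothetical, and the accepted form (\u followed by 4 to 6 hexadecimal digits) is an assumption based on the \unnnn notation in the documentation.

```python
import re

# Assumption: a valid escape is "\u" followed by 4 to 6 hexadecimal digits.
# The documentation describes the form \unnnn; the exact digit count the
# server accepts is not stated in this report.
_VALID_ESCAPE = re.compile(r"\\u[0-9A-Fa-f]{4,6}")


def find_invalid_escapes(rule_text):
    """Return (offset, snippet) pairs for backslashes that do not start a valid escape."""
    problems = []
    pos = 0
    while True:
        pos = rule_text.find("\\", pos)
        if pos == -1:
            break
        match = _VALID_ESCAPE.match(rule_text, pos)
        if match:
            pos = match.end()  # skip past the well-formed escape
        else:
            # Keep a short snippet so an error message can show the offending text.
            problems.append((pos, rule_text[pos:pos + 6]))
            pos += 1
    return problems
```

Applied to the two values from the test case above, the sketch flags \00c4 at offset 0 and accepts \u00e4; a real fix would raise a parse error at that point instead of collecting the problems in a list.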

[22 Jan 2013 18:31] Sveta Smirnova
Thank you for the report.

Verified as described.

Quote of the documentation at http://dev.mysql.com/doc/refman/5.6/en/ldml-rules.html: "Characters named in LDML rules can be written literally or in \unnnn format, where nnnn is the hexadecimal Unicode code point value."