Bug #45012 my_like_range_cp932 generates invalid string
Submitted: 21 May 19:06 Modified: 22 Jun 14:01
Reporter: Sergei Golubchik
Status: In progress
Category:Server: Charsets Severity:S2 (Serious)
Version:4.1, 5.0, 5.1, 6.0 bzr OS:Any
Assigned to: Ramil Kalimullin Target Version:5.1+
Triage: Triaged: D2 (Serious)

[21 May 19:06] Sergei Golubchik
Description:
my_like_range_cp932() uses 0xFF as max_sort_char, but 0xFF is not a valid cp932 character.
As a result, my_like_range_cp932() generates an invalid cp932 string, which is then passed
down to a storage engine in records_in_range().
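
To illustrate why a run of 0xFF bytes is rejected, here is a minimal sketch of a cp932 well-formedness check. The byte ranges are the commonly documented lead, trail and half-width katakana ranges for cp932, and they are an assumption here, not a copy of the server's code. The function names are hypothetical. The check is structural only: it does not test whether a two-byte code is actually assigned.

```python
# Assumed cp932 byte classes (commonly documented ranges, not taken from the server source).
def is_cp932_head(b: int) -> bool:
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def is_cp932_tail(b: int) -> bool:
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def is_cp932_kata(b: int) -> bool:
    return 0xA1 <= b <= 0xDF


def cp932_well_formed_length(data: bytes) -> int:
    """Return the length of the longest structurally well-formed prefix."""
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80 or is_cp932_kata(b):
            i += 1  # single-byte character
        elif is_cp932_head(b) and i + 1 < n and is_cp932_tail(data[i + 1]):
            i += 2  # two-byte character
        else:
            break  # 0xFF (among others) ends up here
    return i
```

Under these assumptions 0xFF is neither a single-byte character nor a lead byte, so a key such as "x" followed by 0xFF padding is well formed only up to the "x".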

In particular, the DB2 engine doesn't like invalid strings.
See http://lists.mysql.com/internals/36709

How to repeat:
.
[22 May 10:53] Sveta Smirnova
Thank you for the report.

Verified as described.
[22 May 16:00] Tim Clark
Additionally, the following character sets appear to have a similar problem:
big5
euckr
gb2312
gbk
sjis
tis620
ujis
[22 May 22:19] Tim Clark
I wanted to note that this problem is visible with MyISAM also, via a warning:

>CREATE TABLE t1 (c char(10), v varchar(20), index(c), index(v)) collate big5_chinese_ci engine=myisam;
>insert into t1 values ("abc","def"),("abcd", "def"),("abcde","defg"),("aaaa","bbbb");
>select * from t1 force index(v) where v like "x%";
Empty set, 1 warning (0.02 sec)
>show warnings;
+---------+------+-------------------------------------------------------------------------------+
| Level   | Code | Message                                                                       |
+---------+------+-------------------------------------------------------------------------------+
| Warning | 1366 | Incorrect string value: '\xFF\xFF\xFF\xFF\xFF\xFF...' for column 'v' at row 1 |
+---------+------+-------------------------------------------------------------------------------+
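
For readers wondering where the '\xFF\xFF\xFF...' in the warning comes from, the following is a heavily simplified sketch of how a LIKE prefix can be turned into a pair of range keys. It is not the server's implementation: the function name like_prefix_range, the 0x00 minimum padding byte and the single-byte treatment are all assumptions made for illustration, and '_' wildcards and escape characters are ignored.

```python
def like_prefix_range(pattern: bytes, max_sort_char: int, key_len: int):
    """Hypothetical sketch: build (min_key, max_key) for a 'prefix%' pattern."""
    prefix = bytearray()
    for b in pattern:
        if b == ord("%"):
            break  # everything after the first '%' is covered by padding
        prefix.append(b)
    prefix = prefix[:key_len]
    pad = key_len - len(prefix)
    min_key = bytes(prefix) + b"\x00" * pad                 # assumed minimum byte
    max_key = bytes(prefix) + bytes([max_sort_char]) * pad  # padded with max_sort_char
    return min_key, max_key


lo, hi = like_prefix_range(b"x%", 0xFF, 6)
print(lo, hi)
```

With max_sort_char set to 0xFF, the upper key is the literal prefix followed by 0xFF bytes, which is not a valid string in a multi-byte character set where 0xFF is not a legal byte.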
[1 Jun 23:10] Tim Clark
Not sure whether this should be put in a different bug report, but I believe that the
max_sort_char (0xFFFF) for the ucs2_* collations is also invalid according to the Unicode
specification (max valid is 0xFFFD).
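
A small sketch to back up the 0xFFFD figure: Unicode designates U+FFFE and U+FFFF (along with U+FDD0..U+FDEF) as noncharacters, and U+D800..U+DFFF as surrogates. The helper name below is hypothetical, and whether a given storage engine rejects noncharacters is an assumption, not something verified here.

```python
def is_bmp_interchange_character(cp: int) -> bool:
    """True if cp is a BMP code point that is neither a surrogate nor a noncharacter."""
    if not 0 <= cp <= 0xFFFF:
        return False
    if 0xD800 <= cp <= 0xDFFF:    # surrogate code points
        return False
    if 0xFDD0 <= cp <= 0xFDEF:    # noncharacter block
        return False
    if cp in (0xFFFE, 0xFFFF):    # last two code points of the plane
        return False
    return True


# Highest BMP code point that passes the check above.
highest = max(cp for cp in range(0x10000) if is_bmp_interchange_character(cp))
print(hex(highest))
```

By this check 0xFFFF fails and 0xFFFD is the highest value that passes, which matches the limit quoted above.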