Description:
Here is the case:
create table t1 (ch varchar(1),name varchar(64))character set latin2 collate latin2_czech_cs;
insert into t1 values (0x4F,'LATIN CAPITAL LETTER O');
select hex(weight_string(ch)), name from t1;
output:
+------------------------+------------------------+
| hex(weight_string(ch)) | name                   |
+------------------------+------------------------+
| 140127014D014D00       | LATIN CAPITAL LETTER O |
+------------------------+------------------------+
But when you add distinct around the hex function:
select distinct(hex(weight_string(ch))) w, name from t1;
output:
+------+------------------------+
| w    | name                   |
+------+------------------------+
| 14   | LATIN CAPITAL LETTER O |
+------+------------------------+
Field w is only a prefix of the real result.
That is because adding distinct causes a tmp table to be created. The function's result is saved into result_field, but result_field's length is shorter than the weight string, so the value is truncated.
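The truncation can be modelled outside the server. This is only a minimal sketch: store_in_field() is a hypothetical stand-in for writing into a fixed-width result_field, not MySQL code, and the declared length of 1 byte is an assumption (one character times mbmaxlen 1 for latin2). The weight bytes are the ones from the first output above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Upper-case hex encoding of a byte sequence, comparable to SQL HEX().
std::string to_hex(const std::vector<std::uint8_t>& bytes) {
  static const char kDigits[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (std::uint8_t b : bytes) {
    out.push_back(kDigits[b >> 4]);
    out.push_back(kDigits[b & 0x0F]);
  }
  return out;
}

// Hypothetical model of a fixed-width result field: it keeps at most
// declared_len bytes of the value and silently drops the rest.
std::vector<std::uint8_t> store_in_field(const std::vector<std::uint8_t>& value,
                                         std::size_t declared_len) {
  const std::size_t n = std::min(value.size(), declared_len);
  return std::vector<std::uint8_t>(
      value.begin(), value.begin() + static_cast<std::ptrdiff_t>(n));
}

// The 8 weight bytes reported in this bug for 'O' under latin2_czech_cs.
std::vector<std::uint8_t> weights_for_O() {
  return {0x14, 0x01, 0x27, 0x01, 0x4D, 0x01, 0x4D, 0x00};
}
```

With a declared length of 1, to_hex(store_in_field(weights_for_O(), 1)) gives "14", which matches the value of w in the distinct query. With a declared length of 8 the full 140127014D014D00 survives.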
How to repeat:
see Description
Suggested fix:
In item_strfunc.cc, in the function
bool Item_func_weight_string::resolve_type(THD *);
change this call:
    set_data_type_string(
        field ? field->pack_length()
              : result_length ? result_length
                              : cs->mbmaxlen * max(args[0]->max_char_length(),
                                                   num_codepoints));
to:
    set_data_type_string(
        field ? field->pack_length()
              : result_length
                    ? result_length
                    : (uint32)cs->coll->strnxfrmlen(
                          cs, cs->mbmaxlen *
                                  max<size_t>(args[0]->max_char_length(), num_codepoints)));
Did you forget to wrap the length in the strnxfrmlen function?
In the last branch, max_length is just cs->mbmaxlen * max(args[0]->max_char_length(), num_codepoints), which is shorter than the result produced by weight_string.
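The difference between the two length computations can be sketched as follows. This is an illustration under stated assumptions, not the server code: model_strnxfrmlen() is a hypothetical stand-in for cs->coll->strnxfrmlen(), and the expansion factor is a made-up parameter. The value 8 used in the note below is chosen only because the example in this report produced 8 weight bytes for one character; the real bound comes from the collation itself.

```cpp
#include <algorithm>
#include <cstddef>

// Length computed by the current code when there is no field and no explicit
// result_length: enough bytes for the source characters, nothing more.
std::size_t current_max_length(std::size_t mbmaxlen,
                               std::size_t max_char_length,
                               std::size_t num_codepoints) {
  return mbmaxlen * std::max(max_char_length, num_codepoints);
}

// Hypothetical stand-in for cs->coll->strnxfrmlen(): scales a source byte
// length by an assumed per-byte expansion factor of the collation.
std::size_t model_strnxfrmlen(std::size_t byte_len, std::size_t expansion) {
  return byte_len * expansion;
}

// Length computed by the suggested fix: the same source byte length, passed
// through the (modelled) strnxfrmlen so it covers the weight string.
std::size_t proposed_max_length(std::size_t mbmaxlen,
                                std::size_t max_char_length,
                                std::size_t num_codepoints,
                                std::size_t expansion) {
  return model_strnxfrmlen(
      current_max_length(mbmaxlen, max_char_length, num_codepoints), expansion);
}
```

For the varchar(1) column in this report, current_max_length(1, 1, 0) is 1 byte, while proposed_max_length(1, 1, 0, 8) is 8 bytes under the assumed factor, which is enough to hold the full weight string shown in the first query.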