Description:
Here is the case:
create table t1 (ch varchar(1),name varchar(64))character set latin2 collate latin2_czech_cs;
insert into t1 values (0x4F,'LATIN CAPITAL LETTER O');
select hex(weight_string(ch)), name from t1;
output:
+------------------------+------------------------+
| hex(weight_string(ch)) | name |
+------------------------+------------------------+
| 140127014D014D00 | LATIN CAPITAL LETTER O |
+------------------------+------------------------+
but when you add distinct around hex fucntion:
select distinct(hex(weight_string(ch))) w, name from t1;
output:
+------+------------------------+
| w | name |
+------+------------------------+
| 14 | LATIN CAPITAL LETTER O |
+------+------------------------+
field w is just prefix of the real result.
That is because when you add distinct, a tmp table will create, this function's result will save into result_field, but the result_field's length is shorter than expected.
How to repeat:
see Description
Suggested fix:
in item_strfunc.cc
function bool Item_func_weight_string::resolve_type(THD *);
change this line:
set_data_type_string(
field ? field->pack_length()
: result_length ? result_length
: cs->mbmaxlen * max(args[0]->max_char_length(),
num_codepoints));
to:
set_data_type_string(
field ? field->pack_length()
: result_length
? result_length
: (uint32)cs->coll->strnxfrmlen(
cs, cs->mbmaxlen *
max<size_t>(args[0]->max_char_length(), num_codepoints)));
Do you forget to warp strnxfrmlen function around?
this max_length will always equals to args[0] 's max_char_length * cs->mbmaxlen, which is shorter than result converted by weight_string
Description: Here is the case: create table t1 (ch varchar(1),name varchar(64))character set latin2 collate latin2_czech_cs; insert into t1 values (0x4F,'LATIN CAPITAL LETTER O'); select hex(weight_string(ch)), name from t1; output: +------------------------+------------------------+ | hex(weight_string(ch)) | name | +------------------------+------------------------+ | 140127014D014D00 | LATIN CAPITAL LETTER O | +------------------------+------------------------+ but when you add distinct around hex fucntion: select distinct(hex(weight_string(ch))) w, name from t1; output: +------+------------------------+ | w | name | +------+------------------------+ | 14 | LATIN CAPITAL LETTER O | +------+------------------------+ field w is just prefix of the real result. That is because when you add distinct, a tmp table will create, this function's result will save into result_field, but the result_field's length is shorter than expected. How to repeat: see Description Suggested fix: in item_strfunc.cc function bool Item_func_weight_string::resolve_type(THD *); change this line: set_data_type_string( field ? field->pack_length() : result_length ? result_length : cs->mbmaxlen * max(args[0]->max_char_length(), num_codepoints)); to: set_data_type_string( field ? field->pack_length() : result_length ? result_length : (uint32)cs->coll->strnxfrmlen( cs, cs->mbmaxlen * max<size_t>(args[0]->max_char_length(), num_codepoints))); Do you forget to warp strnxfrmlen function around? this max_length will always equals to args[0] 's max_char_length * cs->mbmaxlen, which is shorter than result converted by weight_string