Description:
Before Field_string::cmp actually compares two strings, it computes the length in bytes of both strings using my_charpos, which can be relatively expensive for variable-length encodings like UTF8. However, if the two strings differ near their heads, decoding their tails is wasted work.
For instance, if we have two strings like 'aaaaaaa..a' (100 characters in length) and 'bbbb..b' (100 characters in length), the comparison could stop immediately after comparing the first 'a' and the first 'b', without decoding the remaining 99*2 characters.
My proposal is to virtually split each string into small 8-character chunks and compare them chunk by chunk until the first difference is found. According to my benchmarking of a query like `select count(distinct c) from sbtest1;` on the standard sysbench dataset, this gives up to a 4x speedup.
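The idea can be illustrated with a minimal standalone sketch. This is not the attached patch: `utf8_charpos` and `chunked_cmp` are hypothetical names, `utf8_charpos` is only a simplified stand-in for my_charpos, and the sketch assumes well-formed UTF-8 and a plain byte-wise comparison (no collation rules, no trailing-space padding).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical, simplified stand-in for my_charpos: returns the byte offset
// just past the first `nchars` UTF-8 characters in [s, s + len), or `len`
// if the buffer holds fewer characters. Assumes well-formed UTF-8.
static std::size_t utf8_charpos(const unsigned char *s, std::size_t len,
                                std::size_t nchars) {
  std::size_t pos = 0;
  while (nchars > 0 && pos < len) {
    ++pos;                                               // lead byte
    while (pos < len && (s[pos] & 0xC0) == 0x80) ++pos;  // continuation bytes
    --nchars;
  }
  return pos;
}

// Compares at most `max_chars` characters of `a` and `b` byte-wise, decoding
// only `chunk_chars` characters at a time and stopping at the first chunk
// that differs. Returns -1, 0 or 1.
static int chunked_cmp(const std::string &a, const std::string &b,
                       std::size_t max_chars, std::size_t chunk_chars = 8) {
  assert(chunk_chars > 0);
  const unsigned char *pa = reinterpret_cast<const unsigned char *>(a.data());
  const unsigned char *pb = reinterpret_cast<const unsigned char *>(b.data());
  std::size_t la = a.size(), lb = b.size();

  while (max_chars > 0 && (la > 0 || lb > 0)) {
    const std::size_t n = std::min(chunk_chars, max_chars);
    // Decode lengths for this chunk only, not for the whole string.
    const std::size_t ca = utf8_charpos(pa, la, n);
    const std::size_t cb = utf8_charpos(pb, lb, n);

    const int r = std::memcmp(pa, pb, std::min(ca, cb));
    if (r != 0) return r < 0 ? -1 : 1;      // difference found: tails untouched
    if (ca != cb) return ca < cb ? -1 : 1;  // one string ended inside the chunk

    pa += ca; la -= ca;
    pb += cb; lb -= cb;
    max_chars -= n;
  }
  return 0;
}
```

For the 'aaaa..a' versus 'bbbb..b' case, `chunked_cmp(std::string(100, 'a'), std::string(100, 'b'), 100)` decodes only the first 8 characters of each string before returning. The chunk size of 8 follows the proposal above; a real implementation would compare each chunk with the column's collation rather than `memcmp`.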
How to repeat:
Initialize MySQL with the standard sysbench dataset, then execute queries like
`select count(distinct c) from sbtest1;` and compare the results for the patched and unpatched versions.
Suggested fix:
Contribution is attached