Bug #98305 hp_rec_key_cmp suboptimal comparison
Submitted: 21 Jan 2020 10:57  Modified: 29 Jan 2020 13:03
Reporter: Georgiy Kirichenko
Status: Verified  Impact on me: None
Category: MySQL Server: Data Types  Severity: S5 (Performance)
Version: 5.7, 8.0  OS: Linux
Assigned to:  CPU Architecture: Any

[21 Jan 2020 10:57] Georgiy Kirichenko
Description:
hp_rec_key_cmp compares two strings in two phases: during the first it computes
the octet lengths of the strings being compared, and during the second it performs the actual comparison using strnncollsp.
The main issue is that hp_rec_key_cmp decodes both strings completely, even though they may differ in their very first characters. While this has no performance impact for fixed-length encodings, UTF-8 performance suffers a lot.

How to repeat:
The easiest way is to run mysqld under the perf utility and run a sysbench OLTP RO test. Then look at the profiling results and find my_charpos_mb there.

Suggested fix:
My approach is to split both strings into chunks of 4 characters and then compare the chunks one by one until a difference is found, as sketched below.
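
For illustration, here is a minimal standalone sketch of the idea (not MySQL code; it assumes valid UTF-8 input and a binary collation, where a plain memcmp of the chunks is correct -- a real implementation would compare chunks through the collation handler instead):

/* Minimal sketch of chunked comparison for a binary UTF-8 collation. */
#include <algorithm>
#include <cstdio>
#include <cstring>
#include <string>

/* Octet length of the UTF-8 sequence starting with byte b (valid input
   assumed). */
static size_t utf8_seq_len(unsigned char b) {
  if (b < 0x80) return 1;
  if ((b & 0xE0) == 0xC0) return 2;
  if ((b & 0xF0) == 0xE0) return 3;
  return 4;
}

/* Octet length of the next (at most) max_chars characters starting at p. */
static size_t chunk_octets(const char *p, const char *end, size_t max_chars) {
  const char *q = p;
  while (max_chars-- > 0 && q < end)
    q += utf8_seq_len(static_cast<unsigned char>(*q));
  return std::min<size_t>(q - p, end - p);
}

/* Compare in 4-character chunks: only octets up to and including the first
   differing chunk are ever decoded. */
static int chunked_cmp(const std::string &a, const std::string &b) {
  const char *pa = a.data(), *ea = pa + a.size();
  const char *pb = b.data(), *eb = pb + b.size();
  while (pa < ea && pb < eb) {
    const size_t la = chunk_octets(pa, ea, 4);
    const size_t lb = chunk_octets(pb, eb, 4);
    if (const int r = memcmp(pa, pb, std::min(la, lb))) return r;
    if (la != lb) return la < lb ? -1 : 1; /* equal prefix, one string ends */
    pa += la;
    pb += lb;
  }
  return (pa < ea) - (pb < eb);
}

int main() {
  /* The strings differ in the first character, so only one 4-character
     chunk per string is decoded instead of all 50 characters. */
  const std::string s1(50, 'a'), s2(50, 'b');
  std::printf("%d\n", chunked_cmp(s1, s2)); /* prints a negative value */
}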
[21 Jan 2020 13:50] MySQL Verification Team
Hi Mr. Kirichenko,

Thank you for your performance enhancement request.

5.7 is in a very late phase, and all performance enhancements would go into 8.0. However, the function that you are writing about has already been changed significantly in 8.0, and we are happy with its performance.

Last, but not least, the MEMORY storage engine is being replaced by a more efficient one. So far this applies only to intrinsic tables, but any further development will go in that direction.
[28 Jan 2020 8:33] Georgiy Kirichenko
Hi Mr. Milivojevic,

Thank you for your response.

Let me provide a more detailed explanation of the issue. I understand
that MySQL 5.7 is in a very late phase and is unlikely to receive any
performance fixes.

However, MySQL 8.0 is also affected, even though not with default
settings. As soon as the internal_tmp_mem_storage_engine variable is
changed from its default value 'TempTable' to 'MEMORY' (which is a
permitted and supported configuration change at the time of this
writing), MySQL 8.0 will have the same performance issue I described
originally.

Additionally, a user may want to create an explicit MEMORY table with
a hash index for whatever purpose in MySQL 8.0, which is still
supported functionality that might benefit from the optimization
proposed here.

On top of that, even the TempTable storage engine has exactly the
same performance issue. If you look at the
Cell_calculator::compare() function in MySQL 8.0, you'll note that the
entire strings to be compared are first fully scanned before they are
actually compared, which is clearly suboptimal, especially if the
strings differ in the first few characters.

Finally, let me explain the optimization with an example. Suppose we
need to compare two UTF-8 strings, "a" repeated 50 times and "b"
repeated 50 times. The current inefficient way to do it (regardless of
the server version or the storage engine), dictated by the collation
API, is to evaluate the octet lengths of both strings using
my_charpos() and then call strnncollsp(), passing the previously
calculated octet lengths. Obviously, in this specific case, decoding
both strings to their very ends is wasteful, because the comparison
will stop at the first character.

One way to optimize this would be to decode and compare multi-byte
strings in fixed-length chunks of one or more characters.

A better approach would be to extend the collation API with a function
that implements the comparison logic for both the MEMORY and TempTable
storage engines, avoiding code duplication.
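
For illustration only, such an entry point might look something like
this sketch (the name and signature are hypothetical, not taken from
any actual patch):

#include <cstddef>

struct CHARSET_INFO; /* opaque here; the real definition lives in m_ctype.h */

/* A comparison entry point that takes octet lengths plus a limit in
   characters, so a collation can decode lazily and stop at the first
   difference instead of pre-scanning both strings with my_charpos(). */
using strnncollsp_nchars_fn = int (*)(const CHARSET_INFO *cs,
                                      const unsigned char *s1,
                                      std::size_t s1_octets,
                                      const unsigned char *s2,
                                      std::size_t s2_octets,
                                      std::size_t nchars);

Both hp_rec_key_cmp() and Cell_calculator::compare() could then pass
their octet-bounded buffers together with the character limit and let
the collation stop at the first difference.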

With best regards, 
Georgy.
[28 Jan 2020 9:02] Georgiy Kirichenko
Changing the status
[28 Jan 2020 13:46] MySQL Verification Team
Hi Mr. Kirichenko,

Thank you for your response ....

Our string comparison code stops at the first character that differs between the compared strings. This is the same logic everywhere.

Also, scanning is necessary in order to determine whether the string conforms fully to the character set standard that it belongs to.

If you have a better idea on how to write all those string comparison functions, for each character set, you are welcome to send us a patch that will perform significantly better.
[28 Jan 2020 19:14] Alexey Kopytov
Sinisa, my dear friend!!!11

Let me explain it in even more detail than my colleague Georgy.

The snippet below is from the hp_rec_key_cmp() function in MySQL 8.0.19,
storage/heap/hp_hash.cc:

375     if (seg->type == HA_KEYTYPE_TEXT) {
...
381      if (cs->mbmaxlen > 1 && (seg->flag & HA_PART_KEY_SEG)) {
...
383        char_length1 = my_charpos(cs, pos1, pos1 + seg->length, char_length); // <-- scans the entire first argument
384        char_length1 = std::min(char_length1, size_t(seg->length));
385        char_length2 = my_charpos(cs, pos2, pos2 + seg->length, char_length); // <-- scans the entire second argument
386        char_length2 = std::min(char_length2, size_t(seg->length));
...
403      if (cs->coll->strnncollsp(cs, pos1, char_length1, pos2, char_length2)) // <-- compares arguments character-by-character
        return 1;

In fact, the above code path and comments hold for any multi-byte
encoding.

The only reason we do a full scan of both arguments before comparing
them is that the strnncollsp() API expects argument lengths in octets,
while the key defines its limit in characters. Converting a character
count into an octet length is an expensive operation for multi-byte
strings, and it is also a high price to pay if the arguments differ in
the first few characters!

TempTable code has exactly the same problem in
Cell_calculator::compare(), storage/temptable/include/cell_calculator.h:

259  } else if (m_mode == Mode::CHARSET_AND_CHAR_LENGTH) {
260    lhs_length = std::min(
261        static_cast<size_t>(lhs_data_length),
262        my_charpos(m_cs, lhs_data, lhs_data + lhs_data_length, m_char_length));
263    rhs_length = std::min(
264        static_cast<size_t>(rhs_data_length),
265        my_charpos(m_cs, rhs_data, rhs_data + rhs_data_length, m_char_length));
...
280  return m_cs->coll->strnncollsp(m_cs, lhs_data, lhs_length, rhs_data,
281                                 rhs_length);

There's certainly some room for optimization here, right? That's what
this bug report is about.

There may be multiple ways to optimize this. Georgy has outlined
possible approaches in his previous comment. Even with the most
straightforward approach we see about a 15% speedup in CPU-bound
sysbench OLTP_RO.
[29 Jan 2020 13:03] MySQL Verification Team
My dear friend Kaamos,

There is no code in this world that could not be optimised further.

Next, we can't just do a byte comparison, because some characters are equal even though their byte sequences are different. Do I have to mention case- and accent-insensitive (and other) comparisons ???? We added code to those functions in order to avoid some pitfalls and bugs. We cannot optimise the code and let some old bugs creep back in.

However, as this is a very low priority request, I will verify it.

This report would carry more weight if there were a patch that increases performance without introducing any regression bugs.
[29 Jan 2020 13:46] Alexey Kopytov
Dear Sinisa,

Thank you for verifying this optimization request. We would love to
contribute the patch, but code contributions are more complicated than
reporting a problem. I've been told that even on the Oracle side it is
sometimes easier to deal with just a problem description than with a
contributed patch.

But yes, the patch passes MTR of course, so hopefully we are not
reintroducing any old bugs.
[27 Jul 2020 11:31] Georgy Kirichenko
Despite the fact that the MEMORY engine is now deprecated in 8.0, there is still the Field_string::cmp function, which uses the expensive my_charpos() scan.

So I would like to propose a patch that introduces two new functions in the collation API and changes the Field_string::cmp implementation as well as hp_hash.cc.

I compared the performance of two queries, select count(distinct c) from sbtest1 and select count(distinct pad) from sbtest1, using default sysbench data, and got from a 6x performance gain (for the shorter pad column) up to 9.5x (for the larger c column) on my server:

After:
mysql> select count(distinct c) from sbtest1;  
+-------------------+                          
| count(distinct c) |                          
+-------------------+                          
|           1000000 |                          
+-------------------+                          
1 row in set (2.24 sec)

Before:
mysql> select count(distinct c) from sbtest1;  
+-------------------+                          
| count(distinct c) |                          
+-------------------+                          
|           1000000 |                          
+-------------------+                          
1 row in set (21.44 sec)
[27 Jul 2020 11:31] Georgy Kirichenko
Collation API change proposal

Attachment: 131.patch (application/octet-stream, text), 39.47 KiB.

[27 Jul 2020 12:48] MySQL Verification Team
Hi Mr. Kirichenko,

Thank you for your patch.

Let me just ask one question. As far as I can see, this is a patch for 8.0. Please, confirm.

Also, do all the tests from our MTR suite pass without problems ???

Last, but not least, we cannot accept the patch until you sign the OCA agreement with us. Further information follows below.
[27 Jul 2020 12:49] MySQL Verification Team
Thank you very much for your patch contribution, we appreciate it!

In order for us to continue the process of reviewing your contribution to MySQL, please send us a signed copy of the Oracle Contributor Agreement (OCA) as outlined in http://www.oracle.com/technetwork/community/oca-486395.html

Signing an OCA needs to be done only once and it's valid for all other Oracle governed Open Source projects as well.

Getting a signed/approved OCA on file will help us facilitate your contribution - this one, and others in the future.  

Please let me know, if you have any questions.

Thank you for your interest in MySQL.
[30 Jul 2020 16:06] Georgy Kirichenko
Thanks for your answer!

I rebased the patch against the latest 8.0 master, and the MTR suite passed fine.

Also, I thought that I already had a signed OCA and permission to submit code to Oracle. If you don't have my signed OCA on file, I can sign it again in 2-3 days.

Please find the new version attached below.
[30 Jul 2020 16:07] Georgy Kirichenko
Rebased patch

Attachment: 131.patch (application/octet-stream, text), 38.76 KiB.

[31 Jul 2020 12:18] MySQL Verification Team
Hi Mr. Kirichenko,

We would be grateful if you would sign the OCA agreement, while we check whether there is already such an agreement with you or with the company that you work for.

Thanks again !!!
[3 Aug 2020 14:01] MySQL Verification Team
Hi Georgiy,

We would very much like to use your patch. However, we are not able to do so, so far .....

You either have to submit your patch using your company account, or you can sign the OCA agreement using your GMail account.

Please choose one of the two options.

Thanks in advance.
[3 Aug 2020 14:12] Georgy Kirichenko
Reuploading the patch using company account

Attachment: 131.patch (application/octet-stream, text), 38.76 KiB.

[3 Aug 2020 14:18] Georgy Kirichenko
I logged out, logged back in using my Huawei company account `georgy.kirichenko@huawei.com`, and uploaded the patch one more time. This instance of the patch is signed by `Georgy Kirichenko` instead of my `GeorgIy Kirichenko` gmail account, which I used when I created this issue.

I hope this will be OK; if not, I will ask someone on my team to assist me with submitting the patch.
[3 Aug 2020 14:27] MySQL Verification Team
Thank you, Mr. Kirichenko,

We shall enquire whether this will do the trick and we will let you know.
[3 Aug 2020 15:43] MySQL Verification Team
Hi Mr. Kirichenko,

These legal matters are complex for everybody.

Anyway, we found out that you should submit it in the contribution tab, where you will confirm that it is under the OCA. A simple attachment, unfortunately, does not do the job.

Thank you !!!!
[4 Aug 2020 12:41] MySQL Verification Team
Thank you, Mr. Kirichenko.