| Bug #72993 | my_hash_sort_utf8 can be optimized if table has many fields | ||
|---|---|---|---|
| Submitted: | 13 Jun 2014 1:03 | Modified: | 11 Aug 2014 14:37 |
| Reporter: | Hao Liu | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server | Severity: | S3 (Non-critical) |
| Version: | 5.6 5.5 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[13 Jun 2014 1:12]
Hao Liu
attached patch. I think it can be optimized further.
Attachment: bug#72993.diff (application/octet-stream, text), 4.22 KiB.
[11 Aug 2014 14:37]
MySQL Verification Team
murmu's function is basically a hash function that's operating on a block of bytes. And that is all of its functionality. On the other hand, my_hash_sort_utf8() is doing a collation aware string hashing. So it's comparing apples to apple seeds. We strongly rely on collation-aware hashing as we have to support hundreds of different collations. But, thank you for trying to improve a performance of our server.

Description: For many OLTP workload, tables has many fields. MySQL uses my_hash_sort_utf8 as the hash function in the hash search of find_field_in_table function. We have a benchmark that the table has many fields( 54 fields one table), and have found the my_hash_sort_utf8 can be the hotspot and can be optimized. The pref report's output is below: + 6.67% mysqld_old mysqld_old [.] MYSQLparse(void*) ` - 2.98% mysqld_old mysqld_old [.] my_hash_sort_utf8 a - my_hash_sort_utf8 a - 97.93% cset_hash_sort_adapter a - 62.04% my_hash_first a - my_hash_search a - 93.22% find_field_in_table(THD*, TABLE*, char const*, unsigned int, bool, unsigned int*) a find_field_in_table_ref(THD*, TABLE_LIST*, char const*, unsigned int, char const*, char consta find_field_in_tables(THD*, Item_ident*, TABLE_LIST*, TABLE_LIST*, Item**, find_item_error_repa + Item_field::fix_fields(THD*, Item**) a + 6.78% find_native_function_builder(THD*, st_mysql_lex_string) a + 37.96% my_hash_first_from_hash_value a + 1.74% my_hash_first a + 2.45% mysqld_old mysqld_old [.] lex_one_token(void*, void*) a + 1.59% mysqld_old libc-2.12.so [.] __strlen_sse42 a + 1.58% mysqld_old mysqld_old [.] build_template_field(row_prebuilt_t*, dict_index_t*, dict_index_a + 1.49% mysqld_old mysqld_old [.] rec_get_offsets_func(unsigned char const*, dict_index_t const*, a + 1.46% mysqld_old libc-2.12.so [.] memcpy a I think we can use murmur3_32 hash function to replace my_hash_sort_utf8. I did a simple test: The test is read-only test and cpu-bound( using 4 cores). After use murmur32_32: QPS : 19600 => 20400. The my_hash_sort_utf8 hotspot of perf report's output disappears: my_hash_sort_utf8 uses 2.98% cpu vs murmur3_32 uses 0.7% cpu. How to repeat: read the code of find_field_in_table and my_hash_sort_utf8. Suggested fix: I'll attach a simple patch to prove the performance improvement.