MySQL Bugs: #68487: MDL hash can still be concurrency bottleneck

Bug #68487	MDL hash can still be concurrency bottleneck
Submitted:	25 Feb 2013 16:40	Modified:	2 Oct 2013 16:49
Reporter:	Dmitry Lenev	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Locking	Severity:	S3 (Non-critical)
Version:	5.6.10	OS:	Any
Assigned to:	Dmitry Lenev	CPU Architecture:	Any

Description:
This is a follow-up for bug #66473 "MySQL 5.6.6-m9 has more mutex contention than MySQL 5.1".

It turns out that under certain conditions MDL hash and associated locks can still be concurrency bottleneck for load produced by sysbench running against 8 tables doing simple PK look ups.

See comments from MarkC starting from 17th of February in bug #66473 for some data.

Investigation shows that partitioning of MDL hash doesn't work well for some of combination of database and table names. For example in the above bug report all 8 tables from sequence test.sbtest1 .. test.sbtest8 fall into the same MDL hash partition out of 8 default. Increasing number of hash partitions to 256 helps only partially in this specific case. While this case is rather extreme example of the problem the same issue in more limited form is also repeatable  for other sequences.

Analysis show that there is a problem with hash function which we use to partition MDL hash container. Its results are not uniformly spread for lower bits of hash which is used in partitioning, which causes too many collisions.

This problem can be addressed either by improving hashing function or changing approach to how we use this function for partitioning of MDL hash.

How to repeat:
See bug #66473 for detailed how-to-repeat instructions.

the same hash function maps all databases to a single worker thread in MTS

For the record.

5.7.1 release notes had this bit -

Performance: String hashing overhead was reduced. This also improves performance for metadata locking, the table definition cache, and Performance Schema table I/O and file I/O instrumentation. (Bug #13944392),

so I looked at the patch, which I believe is at 5107.1.537. The hash function algorithms haven't changed, no change for this bug. 

The only thing changed is to load the input/output arg into a local temp before calculations, so that the (colliding) hashes are computed faster. I wonder if the same effect could have been achieved by just adding restrict keyword to the pointer args.

The Support team escalated this.

The recommended setting for high performance servers and benchmarking is metadata_locks_hash_instances=256 instead of the default of 8. We'll review the default once we see how the improvements do. If they still leave enough cases where the 8 default could be problematic I'll ask for the default to be changed. Same if something blocks some other fix.

James Day, MySQL Senior Principal Support Engineer, Oracle

Noted in 5.6.15, 5.7.3 changelogs.

The hash function used for metadata locking was modified to reduce
overhead.

5.6$ bzr log -r 5459
------------------------------------------------------------
revno: 5459
committer: Dmitry Lenev <Dmitry.Lenev@oracle.com>
branch nick: mysql-5.6-16396598
timestamp: Wed 2013-09-18 13:03:53 +0400
message:
  Bug #16396598 "MDL HASH CAN STILL BE CONCURRENCY BOTTLENECK".
  
  All 8 tables which were used in a 8-table Sysbench run (named
  'test.sbtest1' .. 'test.sbtest8') fell into the same MDL hash
  partition (with metadata_locks_hash_instances set to default
  value - 8). This made the lock protecting this partition a
  bottleneck in some of the Sysbench tests.
  
  The MDL hash partition for the table is selected by taking modulo
  number-of-partitions from the hashed value of the key, which
  includes database and table name (e.g. '\002test\000sbtest1\000').
  The same hash value is later also used with the hash container
  representing the individual partition.
  It turned out that the my_hash_sort_bin hash function, which
  was used for this purpose, is fairly likely to produce hash
  values for keys looking like "key1" .. "keyN" which don't
  differ in the lower bits. This means that objects with such keys
  are fairly likely to fall into the same MDL hash partition.
  
  To solve this problem this patch changes the hash function used
  for both MDL hash partition selection and in the hash container
  representing the individual partition to MurmurHash3.
  
  MurmurHash3 is a modern non-cryptographic hash function with
  much better statistical properties than my_hash_sort_bin.
  Particularly, due to good avalanche effect, keys looking like
  "key1" .. "key8" are unlikely to have hashes with the same
  lower bits, so it is unlikely for tables which are named in
  a similar fashion to fall into the same MDL hash partition
  (e.g. in the specific case mentioned in the original bug report
  only 3 keys out of 8 fall in the same partition).
  
  Also MurmurHash3 is generally faster than my_hash_sort_bin.
  For example, on my machine, the 32-bit version of MurmurHash3
  used in this patch is around 3 times faster than my_hash_sort_bin
  for short, 16-byte keys and up to 5 times faster for really
  long keys.
  
  To be able to use MurmurHash3 with a HASH container the
  implementation of the container was extended to support usage
  of custom hash function instead of one prescribed by the character
  set.

If affected by this please see:

1. http://mysqlserverteam.com/removing-scalability-bottlenecks-in-the-metadata-locking-and-th... describing how the issues were quite thoroughly removed from 5.7.5 DMR onwards