Bug #58037 LOCK_open contention from the table cache
Submitted: 7 Nov 2010 17:17 Modified: 10 Nov 2010 20:24
Reporter: Mark Callaghan Email Updates:
Status: Won't fix Impact on me: None
Category:MySQL Server: General Severity:S5 (Performance)
Version:5.1.50 OS:Any
Assigned to: CPU Architecture:Any
Tags: concurrency, high, LOCK_open
Triage: Needs Triage: D3 (Medium) / R3 (Medium) / E3 (Medium)

[7 Nov 2010 17:17] Mark Callaghan
Description:
This is from Domas and Poor Man's Profiler (PMP). While he measured this with sysbench, it is likely also an intermittent problem for us in production.

The major reason MySQL performance drops at high concurrency in sysbench is that it has to skip all the in_use 'sbtest' TABLE instances in the cache, so the following symbols are costly and mostly run under LOCK_open:

my_strnncoll_binary
table_cache_key
the open_table() loop

How to repeat:
Run sysbench oltp read-only with high concurrency (128 to 1024 clients) on an 8+ core server
Use PMP to diagnose stalls

Suggested fix:
The functions themselves are quite cheap; it is the loop that is not :(
I wonder if maintaining an unused-table list somewhere would work :-)

 /*
   Actually try to find the table in the open_cache.
   The cache may contain several "TABLE" instances for the same
   physical table. The instances that are currently "in use" by
   some thread have their "in_use" member != NULL.
   There is no good reason for having more than one entry in the
   hash for the same physical table, except that we use this as
   an implicit "pending locks queue" - see
   wait_for_locked_table_names for details.
 */
 for (table= (TABLE*) hash_first(&open_cache, (uchar*) key, key_length,
                                 &state);
      table && table->in_use ;
      table= (TABLE*) hash_next(&open_cache, (uchar*) key, key_length,
                                &state))
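The "unused table list" idea above could be sketched roughly like this (hypothetical types and names, not MySQL code): keep the free instances for each cache key on their own list, so acquiring one is O(1) instead of walking a hash chain past every in-use entry while holding LOCK_open. This is broadly the direction 5.5 took with per-share free lists.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch (not MySQL code) of the "unused table list" idea:
// free TABLE instances are kept per key, so a lookup never has to walk
// past in-use entries while holding LOCK_open.
struct Table {
    std::string key;      // db + table name cache key
    bool in_use = false;
};

class TableCache {
public:
    // Pop a free instance for this key in O(1) average time; returns
    // nullptr when the caller must open a new instance from the share.
    Table* acquire(const std::string& key) {
        auto it = free_.find(key);
        if (it == free_.end() || it->second.empty())
            return nullptr;
        Table* t = it->second.back();
        it->second.pop_back();
        t->in_use = true;
        return t;
    }

    // Put an instance back on the free list instead of leaving it in a
    // shared hash where every subsequent lookup would have to skip it.
    void release(Table* t) {
        t->in_use = false;
        free_[t->key].push_back(t);
    }

private:
    std::unordered_map<std::string, std::vector<Table*>> free_;
};
```

With this layout the cost of acquire() no longer grows with the number of concurrent users of the same table, which is exactly the scaling problem the hash-chain walk has.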
[7 Nov 2010 17:32] Valeriy Kravchuk
Please send the results of:

show global variables like 'table%cache';
[7 Nov 2010 21:41] Domas Mituzas
mysql> show variables like 'table%cache%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| table_definition_cache | 60000 |
| table_open_cache       | 50000 |
+------------------------+-------+
2 rows in set (0.00 sec)

Do note, in my tests hundreds of threads fetch the same table, so they all get a table cache entry for that same table.
open_table() has to discard most hash_next() results on the table cache, as the entries at the top of the hash chain will have in_use set.
Changing thread_cache_size to various values above the thread count doesn't change anything.
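To make that cost concrete, here is a toy model (illustrative only, not MySQL code) of the 5.1 lookup: all instances for one key hang off a single hash chain, and the lookup walks it under LOCK_open, skipping in_use entries, so with N concurrent users of the same table each open examines up to N entries.

```cpp
#include <vector>

// Toy model (not MySQL code) of the open_cache hash-chain walk:
// count how many entries a lookup examines before finding a free one.
struct Entry {
    bool in_use;
};

// With all N entries in use (N threads on the same table), the walk
// examines all N of them under LOCK_open before giving up.
int entries_examined(const std::vector<Entry>& chain) {
    int examined = 0;
    for (const Entry& e : chain) {
        ++examined;
        if (!e.in_use)     // found a free instance; stop the walk
            return examined;
    }
    return examined;       // walked the whole chain without success
}
```

Since every thread performs this walk while holding LOCK_open, the serialized work per open grows linearly with concurrency, which matches the my_hash_next/my_strnncoll_binary time in the profiles.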
[7 Nov 2010 21:47] Domas Mituzas
Also, here is opreport data from during the test; do note that the top symbols are all within the LOCK_open scope:

samples  %        image name               app name                 symbol name
103767   36.9858  no-vmlinux               no-vmlinux               /no-vmlinux
12895     4.5962  mysqld                   mysqld                   my_strnncoll_binary
11451     4.0815  mysqld                   mysqld                   open_table(THD*, TABLE_LIST*, st_mem_root*, bool*, unsigned int)
11085     3.9510  mysqld                   mysqld                   table_cache_key
6672      2.3781  libpthread-2.5.so        libpthread-2.5.so        pthread_mutex_lock
5628      2.0060  mysqld                   mysqld                   my_hash_next
[7 Nov 2010 22:16] Davi Arnaut
FWIW, I think this was fixed in 5.5. The table object is taken from a free list in the table share.
[8 Nov 2010 0:47] Alexey Kishkin
> Run sysbench oltp read-only with high concurrency (128 to 1024 clients) on an 8+ core server

Why 8+ cores? Shouldn't this problem be visible with any number of cores? Even with 1 core?
[8 Nov 2010 22:41] Alexey Stroganov
Some preliminary results from my side:

I've run sysbench/OLTP_RO for the InnoDB plugin from 5.1.50 on a box with 24 cores, using the following settings:

1) table_open_cache=2048 and table_definition_cache=256 (default)
2) table_open_cache=50000 and table_definition_cache=256 (default)
3) table_open_cache=50000 and table_definition_cache=60000

#
# Test: OLTP_RO:throughput, TPS
#
# Server 1 - 5.1.50 case 1) tc=2048  and tdc=256
# Server 2 - 5.1.50 case 2) tc=50000 and tdc=256
# Server 3 - 5.1.50 case 3) tc=50000 and tdc=60000
#
#             Server 1  Server 2  Server 3
# Thread        INNODB    INNODB    INNODB
          4    1701.49   1536.35   1571.11
         16    5069.76   5139.59   5122.13
         64    7053.26   7012.83   7000.24
        128    6742.85   6739.12   6732.86
        256    6161.54   6096.30   6170.09
        512    4897.48   4935.99   4927.27
       1024    2182.22   2221.92   2160.82

So on this platform I see no drops for the InnoDB plugin (going to check the built-in InnoDB and MyISAM as well).

Also, the oprofile report looks a bit different in my case
(report from a run with 1024 threads, tc=50000 and tdc=60000):

samples  %        image name               symbol name
123642   18.4222  mysqld                   my_pthread_fastmutex_lock
43394     6.4655  mysqld                   table_cache_key
38562     5.7456  mysqld                   my_strnncoll_binary
37409     5.5738  mysqld                   my_hash_sort_simple
26692     3.9770  mysqld                   ptr_compare_0
24647     3.6723  mysqld                   my_hash_next
23987     3.5740  mysqld                   open_table(THD*, TABLE_LIST*, st_mem_root*, bool*, unsigned int)
17841     2.6582  mysqld                   my_lengthsp_8bit
11826     1.7620  mysqld                   my_strnxfrm_simple

I plan to recheck this issue on another box with 8 cores and will add a comment with my findings.
[8 Nov 2010 23:54] Domas Mituzas
Alexey, 

your test shows exactly what we wanted to see here.

You have high-cost symbols that run under LOCK_open, and performance dropped in all cases.

This problem does not depend on the table cache size; it exists because of the table cache itself (though without it you'd see even worse results ;-)
[9 Nov 2010 0:43] Alexey Stroganov
Domas,

OK, it seems I was somehow confused by the high table cache values. On the other hand, it's good that I confirmed the issue as well :).

re: the drop at high thread counts in OLTP_RO - it seems that this issue was already fixed in the 5.5 branch:

#
# Test: OLTP_RO:throughput, TPS
#
# Server 1 - 5.5.7
#
#             Server 1  
# Thread        INNODB  
          4    1601.20  
         16    5066.10  
         32    7131.79  
         64    7326.71  
        128    7028.15  
        256    6903.40  
        384    6710.45  
        512    6575.19  
        768    6443.28  
       1024    6256.12  

oprofile report for the run with 1024 threads:
samples  %        image name               symbol name
363349   12.8666  mysqld                   row_search_for_mysql
262914    9.3101  mysqld                   ut_delay
82289     2.9139  mysqld                   ptr_compare_0
75490     2.6732  mysqld                   evaluate_join_record(JOIN*, st_join_table*, int)
53609     1.8984  mysqld                   read_view_open_now
44176     1.5643  mysqld                   rec_get_offsets_func
43613     1.5444  mysqld                   btr_search_guess_on_hash
43342     1.5348  mysqld                   btr_cur_search_to_nth_level
[9 Nov 2010 6:54] Valeriy Kravchuk
Based on the results Alexey presented above, we can state that 5.1.x is affected while 5.5.7 is not.
[10 Nov 2010 20:14] Omer Barnir
The issue is fixed in 5.5 and will not be backported to 5.1.
[10 Nov 2010 20:24] Mark Callaghan
nice work by the MDL implementors