Bug #49177 MyISAM share list scalability problems
Submitted: 28 Nov 2009 8:16 Modified: 28 Aug 2015 13:47
Reporter: Sergei Golubchik Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S3 (Non-critical)
Version:5.1 OS:Any
Assigned to: Assigned Account CPU Architecture:Any

[28 Nov 2009 8:16] Sergei Golubchik
Description:
PeterZ claims that as table_cache becomes large misses becomes significantly more expensive (hits too, but not as much):

e.g.

> MySQL 5.1.40
>
> I started testing with default table_open_cache=64 and
> table_definition_cache=256 - the read took about 12 seconds very close to
> MySQL 5.0.85.
>
> As I increased table_definition_cache to 16384 result remained the same so
> this variable is not causing the bottleneck. However increasing
> table_open_cache to 16384 causes scan to take about 780 sec <...>
> So the problem is not fixed in MySQL 5.1

How to repeat:
see

http://www.mysqlperformanceblog.com/2009/11/16/table_cache-negative-scalability/
http://www.mysqlperformanceblog.com/2009/11/26/more-on-table_cache/
[30 Nov 2009 9:18] Sveta Smirnova
Thank you for the report.

Verified as described. Results from my machine:

table_open_cache=64:

real    11m33.918s
user    0m1.930s
sys     0m1.235s

table_open_cache=16384:

real    32m33.800s
user    0m1.043s
sys     0m0.657s
[30 Nov 2009 14:45] Alexey Stroganov
Below is my analysis of the issue:

According to oprofile in case when --table-open-cache=16384 we have test_if_reopen from storage/myisam/mi_open.c as busiest call:

samples  %        image name               symbol name
245163   61.0110  mysqld                   test_if_reopen
103073   25.6506  libc-2.4.so              strcmp
2734      0.6804  mysqld                   .plt
2606      0.6485  libc-2.4.so              .exit
1703      0.4238  mysqld                   MYSQLparse(void*)

Now compare above with oprofile report for run with --table-open-cache=64

samples  %        image name               symbol name
3876      2.6052  mysqld                   MYSQLparse(void*)
3728      2.5057  mysqld                   my_hash_sort_bin
3530      2.3727  mysqld                   .plt
2941      1.9768  vmlinux-2.6.16.46-0.12-smp clone_mnt
2673      1.7966  libc-2.4.so              __cyg_profile_func_exit
2514      1.6898  mysqld                   test_if_reopen
2115      1.4216  libc-2.4.so              _int_malloc
2051      1.3786  mysqld                   strmake

So let's look at test_if_reopen:

/******************************************************************************
** Return the shared struct if the table is already open.
** In MySQL the server will handle version issues.
******************************************************************************/

MI_INFO *test_if_reopen(char *filename)
{
  LIST *pos;

  for (pos=myisam_open_list ; pos ; pos=pos->next)
  {
    MI_INFO *info=(MI_INFO*) pos->data;
    MYISAM_SHARE *share=info->s;
    if (!strcmp(share->unique_file_name,filename) && share->last_version)
      return info;
  }
  return 0;
}

What is wrong here? I see here 2 problems: 

1) size of myisam_open_list - in case of open table it's limited by value of --table-open-cache parameter.
So it means that this list will grow up to <--table-open-cache> elements(16384 in above example) and for every new table we will rescan this list.

2) strcmp - if share->unique_file_name,filename will have long enough path comparing will be very expensive.
[30 Nov 2009 17:13] Alexey Stroganov
This issue is MyISAM specific. With such scenario when tables are accessed sequentially InnoDB is not affected:

--table-open-cache=64
perl bug49177.pl --select --db-engine=InnoDB --max-tables=64000
MySQL 5.1.41, ENGINE: InnoDB
Accessing tables
Time to select_from_MANY_tables (64000): 24.5946 wallclock secs 

--table-open-cache=64000
perl bug49177.pl --select --db-engine=InnoDB --max-tables=64000
MySQL 5.1.41, ENGINE: InnoDB
Accessing tables
Time to select_from_MANY_tables (64000): 25.4561 wallclock secs
[22 Apr 2010 11:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106356

3007 Magne Mahre	2010-04-21
      Bug#49177 table_cache scalability problems
      
      (Note: MyISAM issue only)
      
      As the table cache size is increased, cache lookups and 
      (in particular) misses became increasingly more expensive.
      
      The problem was caused by the cache lookup mechanism, which
      was based on traversing a linked list, of <table cache size>
      length, comparing the file names. 
      
      A hash table has been introduced to reduce the lookup time.
      The linked list and the hash table are used in parallel, with
      the linked list used for implementing LRU functionality, and
      for operations that need to iterate through the entire cache.
      The hash table is only used for lookup.
[26 Jun 2010 9:33] Konstantin Osipov
A bug was found in the patch in process of self-review, putting back to "Verified".
[30 Sep 2010 9:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119489

3309 Magne Mahre	2010-09-30
      Bug#49177 MyISAM share list scalability problems
      
      (Note: MyISAM issue only)
            
      As the table cache size is increased, cache lookups and 
      (in particular) misses became increasingly more expensive.
            
      The problem was caused by the cache lookup mechanism, which
      was based on traversing a linked list, of <table cache size>
      length, comparing the file names. 
      
      As the list was replaced by a hash table, the lookup 
      time dropped significantly when used on a large table cache.
      The performance on smaller installations remains in the same
      ballpark.  Instead of growing exponentially, the lookup time
      now grows more or less linearly, at least in the ballpark
      with less than 100 000 open tables.
[21 Oct 2010 10:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/121507

3328 Magne Mahre	2010-10-21
      Bug#49177 MyISAM share list scalability problems
            
      (Note: MyISAM issue only)
                  
      As the number of open tables is increased, table lookup 
      (testing if a table is already open)  and  (in particular) 
      the case when a table is not open, became increasingly more 
      expensive.
                  
      The problem was caused by the open table lookup mechanism, 
      which was based on traversing a linked list comparing the 
      file names. 
            
      As the list was replaced by a hash table, the lookup 
      time dropped significantly when used on systems with
      a large number of open tables.
      
      The performance on smaller installations remains in the same
      ballpark.  Instead of growing exponentially, the lookup time
      now grows more or less linearly, at least in the ballpark
      with less than 100 000 open tables.
     @ storage/myisam/myisamchk.c
        Need to initialize MyISAM open table hash
     @ storage/myisam/myisampack.c
        Need to initialize MyISAM open table hash
[30 Jan 2012 11:23] Felix Ostmann
We patched MySQL 5.5.20 for our service and this was a big performance improve! We have ~250 Databases, each with ~2000 Tables in it (all MyISAM, ~35 GB total)!

After this patch, we cannot set table_open_cache = 200000 ! The server crashes with "mysqld got signal 6 ;" or out of memory (before patch, there was >10 GB system-cache). We currently run table_open_cache = 50000 and all is fine (only 2 GB system-cache). There is no memleak, simple that new hash-construct use many ram (not exactly proved).

Is there a plan to release a version soon, which help in our scenario? The list-approach dont work for us!
[27 May 2015 14:49] Mattias Jonsson
Posted by developer:
 
A new patch based on WL#4305 is proposed in rb#8811. (changeset not removed since it gives some history).
[10 Aug 2015 10:35] Murthy Sidagam
Posted by developer:
 
old rb#8811 is owned by Mattias and I cannot update the rb with new patch(which has fix for memory leak issue).
Hence I have uploaded new patch under rb#9874
[28 Aug 2015 13:47] Paul DuBois
Noted in 5.7.9, 5.8.0 changelogs.

As the number of open MyISAM tables increased, lookups to check
whether a table was open became expensive, particularly when the
table was not open. Lookup performance has been improved, with the
overhead reduction especially beneficial for selects on large number
of tables with large values of table_open_cache and
table_definition_cache.