Bug #20877 InnoDB data dictionary memory footprint is too big
Submitted: 6 Jul 2006 0:31 Modified: 13 Feb 2012 1:12
Reporter: Boris Burtin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S4 (Feature request)
Version:5.1 OS:Any (Linux)
Assigned to: Sunny Bains CPU Architecture:Any
Triage: Triaged: D5 (Feature request) / R4 (High) / E4 (High)

[6 Jul 2006 0:31] Boris Burtin
Description:
The InnoDB dictionary cache never frees data
and hence limits how many tables we are able to open during the lifetime
of the mysql server process. This results in us not being able to
deploy a system where a large number of users are provisioned,
but only a small set are active at any given time.

For instance, visiting 100,000 tables causes a dict_table_t whose heap
size is 22144 to be added to dict_sys in dict_table_add_to_cache().
That is 2GB of RAM!

The code to trim this cache is commented out - possibly because
the mem_fix ref count is never decremented in all the right places.

As we scale to larger deployments we need MySQL to be able to support a
larger number of tables without running out of memory.

How to repeat:
Access 100,000 tables without restarting MySQL.
[6 Jul 2006 19:03] Valeriy Kravchuk
Thank you for a problem report. In fact, for 100000 tables to be used efficiently 2GB of RAM is not so much. But I agree that some way to force cleaning of dictionary cache, without restarting server, might be useful. So, I mark this report as a verified feature request.
[12 Sep 2006 13:38] Marko Mäkelä
I have a patch that significantly reduces the size of dict_col_t and other structures by making use of bit fields and omitting unnecessary fields, such as type->prec and some magic numbers.

I may be able to reduce the sizes even further by replacing some pointers with inline data, such as replacing linked lists with arrays.
[21 Sep 2006 10:44] Timothy Smith
Patch queued to -maint team tree (expect to push to 5.1.12)
[3 Oct 2006 20:16] Chad MILLER
Available in 5.1.12-beta.
[6 Oct 2006 3:16] Paul Dubois
Noted in 5.1.12 changelog.

Memory consumption of the InnoDB data dictionary cache was roughly
halved by cleaning up the data structures.
[9 Jan 2007 7:29] Marko Mäkelä
The fix of this bug introduced Bug #24741: existing cascade clauses disappear when adding foreign keys.
[19 Mar 2007 17:32] Stephan Fudeus
Sorry for posting to this long-closed bug, but is there any chance for this issue to be fixed also for 5.0.x?
[19 Mar 2007 17:39] Heikki Tuuri
Stephan,

sorry, no. 5.0 is a 'production' version of MySQL, and no new features are allowed. As you see, this fix caused a serious bug in 5.1.12. We do not want to risk the stability of 5.0.

Regards,

Heikki
[28 Nov 2007 10:07] kenichi yonekawa
Sorry, this bug is really fixed by 5.1?
I tried debugging while watching the dict_sys->size.
The procedure is written as follows. 

1. Starting mysqld from GDB
2. Execute create table query.
3. Watching dict_sys->size value.

MySQL5.0(5.0.46)
 before)
 p dict_sys->size => 19160
 
 after table create)
 p dict_sys->size => 45960

 45960 - 19160 = 26800(heap size of tab_mail_message)

MySQL5.1(5.1.22-rc)
 before)
 p dict_sys->size => 20888

 after table create)
 p dict_sys->size => 82264

 82264 - 20888 = 61376(heap size of tab_mail_message)

It seems not to be fixed like this. 
Please verify it.

The query I tried =>

CREATE TABLE  `test`.`tab_mail_message` (
  `_id` bigint(20) NOT NULL auto_increment,
  `col_abstract_data` char(128) collate utf8_general_ci default NULL,
  `col_abstract_from` char(128) collate utf8_general_ci default NULL,
  `col_abstract_from_email` char(128) collate utf8_general_ci default NULL,
  `col_abstract_from_name` char(128) collate utf8_general_ci default NULL,
  `col_abstract_subject` char(128) collate utf8_general_ci default NULL,
  `col_abstract_to` char(128) collate utf8_general_ci default NULL,
  `col_abstract_to_email` char(128) collate utf8_general_ci default NULL,
  `col_abstract_to_name` char(128) collate utf8_general_ci default NULL,
  `col_action` char(32) collate utf8_general_ci default NULL,
  `col_attached` int(11) default NULL,
  `col_bcc` text collate utf8_general_ci,
  `col_cc` text collate utf8_general_ci,
  `col_confirmation_final_recipient` text collate utf8_general_ci,
  `col_confirmation_org_message_id` char(255) collate utf8_general_ci default NULL,
  `col_confirmation_request` int(11) default NULL,
  `col_confirmation_response` int(11) default NULL,
  `col_confirmation_status` char(32) collate utf8_general_ci default NULL,
  `col_confirmation_to` text collate utf8_general_ci,
  `col_content_type` char(255) collate utf8_general_ci default NULL,
  `col_creator` bigint(20) default NULL,
  `col_creator_foreign_key` char(255) collate utf8_general_ci default NULL,
  `col_creator_name` char(100) collate utf8_general_ci default NULL,
  `col_ctime` int(11) default NULL,
  `col_data` longtext collate utf8_general_ci,
  `col_date` text collate utf8_general_ci,
  `col_draft` int(11) default NULL,
  `col_dtime` int(11) default NULL,
  `col_folder` bigint(20) default NULL,
  `col_from` text collate utf8_general_ci,
  `col_html_data` longtext collate utf8_general_ci,
  `col_in_reply_to` text collate utf8_general_ci,
  `col_message_id` char(255) collate utf8_general_ci default NULL,
  `col_modifier` bigint(20) default NULL,
  `col_modifier_foreign_key` char(255) collate utf8_general_ci default NULL,
  `col_modifier_name` char(100) collate utf8_general_ci default NULL,
  `col_mtime` int(11) default NULL,
  `col_origin_mail` bigint(20) default NULL,
  `col_read_ts` int(11) NOT NULL,
  `col_references` text collate utf8_general_ci,
  `col_reply_to` text collate utf8_general_ci,
  `col_reserve_blob1` blob,
  `col_reserve_blob2` blob,
  `col_reserve_blob3` blob,
  `col_reserve_int1` int(11) default NULL,
  `col_reserve_int2` int(11) default NULL,
  `col_reserve_int3` int(11) default NULL,
  `col_reserve_text1` char(100) collate utf8_general_ci default NULL,
  `col_reserve_text2` char(100) collate utf8_general_ci default NULL,
  `col_reserve_text3` char(100) collate utf8_general_ci default NULL,
  `col_send_ts` int(11) NOT NULL,
  `col_sent` int(11) NOT NULL,
  `col_sign_data` text collate utf8_general_ci,
  `col_signature` bigint(20) default NULL,
  `col_size` int(11) default NULL,
  `col_status` char(64) collate utf8_general_ci NOT NULL,
  `col_subject` text collate utf8_general_ci,
  `col_timestamp` int(11) NOT NULL,
  `col_to` text collate utf8_general_cs,
  `col_unsent` int(11) default NULL,
  `col_user` bigint(20) default NULL,
  PRIMARY KEY  (`_id`),
  KEY `cni_folder` (`col_folder`),
  KEY `cni_origin_mail` (`col_origin_mail`),
  KEY `cni_signature` (`col_signature`),
  KEY `idx_folder_data_n` (`col_folder`,`col_abstract_data`,`_id`),
  KEY `idx_folder_dts_n` (`col_folder`,`col_dtime`,`_id`),
  KEY `idx_folder_from_email_n` (`col_folder`,`col_abstract_from_email`,`_id`),
  KEY `idx_folder_from_n` (`col_folder`,`col_abstract_from`,`_id`),
  KEY `idx_folder_from_name_n` (`col_folder`,`col_abstract_from_name`,`_id`),
  KEY `idx_folder_rts_n` (`col_folder`,`col_read_ts`,`_id`),
  KEY `idx_folder_status` (`col_folder`,`col_status`),
  KEY `idx_folder_sts_n` (`col_folder`,`col_send_ts`,`_id`),
  KEY `idx_folder_subject_n` (`col_folder`,`col_abstract_subject`,`_id`),
  KEY `idx_folder_to_email_n` (`col_folder`,`col_abstract_to_email`,`_id`),
  KEY `idx_folder_to_n` (`col_folder`,`col_abstract_to`,`_id`),
  KEY `idx_folder_to_name_n` (`col_folder`,`col_abstract_to_name`,`_id`),
  KEY `idx_folder_ts_n` (`col_folder`,`col_timestamp`,`_id`),
  KEY `idx_user_messageid` (`col_user`,`col_message_id`),
  KEY `idx_user_status_ts` (`col_user`,`col_status`,`col_timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
[28 Nov 2007 13:33] Heikki Tuuri
Inaam,

can you repeat the latest complaint about 5.1 taking more space?

Marko should have cut the space usage to about 50 % of the original.

Regards,

Heikki
[28 Nov 2007 13:45] Marko Mäkelä
Some of the relevant data structures take significantly more space on 64-bit platforms than on 32-bit, mainly due to different alignment constraints and wider pointers.

I would like to know if Kenichi Yonekawa obtained the results in the same environment, with mysqld compiled in the same way (both 32-bit, or both 64-bit). I am guessing that the difference could be explained by comparing a 32-bit MySQL 5.0 to a 64-bit MySQL 5.1.
[29 Nov 2007 5:14] kenichi yonekawa
Marco,

Sorry, I had forgotten to write the environment. 
The environment was written, and it posted it here. 
http://bugs.mysql.com/32836

Regards.
[29 Nov 2007 6:06] kenichi yonekawa
Oh..., I didn't notice this bug was reopened.
Sorry.
[27 May 2008 15:23] Calvin Sun
both bug#32836 and bug#36613 are marked as duplicates of this one.
[27 May 2008 15:56] Mattias Jonsson
here is what I manage to get when I look at the core file after crashing the server with about 3000 innodb tables with 500 columns (see bug#36613 oom-test case):
dict_sys->size = 3521749104.
[2 Jun 2008 16:31] Ken Jacobs
While we appreciate that there are people running into this issue, and that it is exacerbated by the use of partitioning, unfortunately this is simply not a bug.   There is no memory leak, there is no reported crash related to this issue, and this is not a regression in performance or memory use.  This bug report and related bug reports represent a feature request.

The observed behavior is not a memory leak. (It is a stretch to say that because there is no LRU treatment of meta data there is a "leak", which is an unintentional/undesigned failure to release memory).   InnoDB works as designed and documented.  

There is no regression here (i.e., InnoDB per-table memory use is the same as before).  As noted in bug 32836, some changes were made in 5.1.11 that considerably reduced memory use.

Indeed, as noted in bug 20877, the system works fine for small/normal tables, perhaps up to 10s of thousands of tables.  Alternative schema designs will reduce or eliminate the need for large numbers of tables with large numbers of columns and large numbers of partitions.   

Another "workaround" is to use larger memories (perhaps on a 64-bit machine), providing more RAM (and sufficient swap space) for this meta data.

The design of partitions in MySQL (above the storage engines) means that each storage engine will likely maintain repeated meta data for the underlying tables of identical structure.  It is not clear that a simple LRU mechanism that allows such meta data to page out would be sufficient to address this issue. (Since many of the partitions might be quite active, there might be significant performance problems with this approach.)

Further, there would be substantial development investment and risk in making this sort of change.  It simply cannot happen in the 5.1 (or likely 6.0) time frame.

Therefore we consider this bug report to amount to a feature request (and it should be classified as such).  The InnoDB team will consider making changes in this area for a future release, and we will discuss this topic with Sun/MySQL.
[13 Jun 2008 2:09] Ken Jacobs
This issue may affect all storage engines.  Any "fix" could have significant performance side-effects.  Making changes in the management of data dictionary information would require substantial work and therefore be very risky to include.

Changing Triage information accordingly.
[28 Oct 2008 9:18] Marko Mäkelä
The fix of this bug introduced Bug #40369: dtype_get_sql_null_size() returns 0 or 1, not the size. This affects tables in ROW_FORMAT=REDUNDANT that have fixed-length columns that can be set NULL.
[3 Aug 2009 7:08] Ronald Ellison
I want to point that problem with data dictionary can be used for local DOS attack

any use with create table privileges can run simple bash script

x=1
while [ 1 ]
do
 mysql -e "CREATE TABLE test$x (id int) engine=InnoDB" test
 ((x++))
done

and in short period of time all memory will be used and mysql will be crashed with Out Of Memory problem.

I consider this bug as security vulnerability with possible DOS attack.
[1 Sep 2009 23:51] James Day
Ronald, means by which authorised users can use their privileges to exhaust system resources are not generally considered by us to be denial of service vulnerabilities.
[5 May 2010 15:24] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 5:52] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:21] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:49] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[15 Jun 2010 8:21] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:38] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[17 Jun 2010 11:52] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:30] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:18] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[20 Oct 2010 9:53] Alexander Rubin
This bug http://bugs.mysql.com/bug.php?id=57480 shows the similar issue but with only 256 tables
[13 Nov 2010 16:12] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:40] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (pib:21)
[27 Apr 2016 12:41] Daniel Price
Posted by developer:
 
Fixed as of the 5.6.2 release, and here's the changelog entry:

To ease the memory load on systems with huge numbers of tables, InnoDB
now frees up the memory associated with an opened table using an LRU
algorithm to select tables that have gone the longest without being
accessed. To reserve more memory to hold metadata for open InnoDB tables,
increase the value of the table_definition_cache configuration option.
InnoDB treats this value as a soft limit for the number of open table
instances in the InnoDB data dictionary cache. The actual number of tables
with cached metadata could be higher than the value specified for
table_definition_cache, because metadata for InnoDB system tables, and
parent and child tables in foreign key relationships, is never evicted
from memory. For additional information, refer to the
table_definition_cache documentation.