MySQL Bugs: #1341: InnoDB ibdata1 never shrinks after data is removed

Bug #1341	InnoDB ibdata1 never shrinks after data is removed
Submitted:	18 Sep 2003 12:40	Modified:	30 Nov 2005 19:50
Reporter:	Scott Ellsworth	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S4 (Feature request)
Version:	All versions	OS:	Any
Assigned to:	Assigned Account	CPU Architecture:	Any

Description:
Description: The Innodb table handler does not reclaim space and return it to the OS once it
allocates it, and it would be very good if it did so. A manual step would meet our needs.

As part of a new product we are developing, we have been loading, manipulating, exporting, and
dropping 5G of data in a single database using a dozen tables with the Innodb table type, and
after dropping the database in which all this data lives, we see that we do not get the disk space
back. This is no big deal for a meg or two, but when the data is measured in gigabytes, this can
be a problem.

By the time we are finished, we will have done this process roughly 60 times, as we need to
eventually create roughly 300 GB of manipulated data.

We expect that our clients will want to do the same someday, and will object to the excess space
never being returned.

We are using a dozen machines, running MacOS X, RH Linux 8 and 9, and Windows
2k. All of them are using 4.0.14.

We have noticed that dropping the database, containing 5G of data, does not cause the ibdata1 file
to shrink, and thus once a machine has been used for this, it continues to take up a lot of disk
space. This is a serious problem.

If an analyze or optimize SQL command would fix it, we would run one happily. We do not need
the space reclaimed automatically, as long as we can get it back once the load process is finished.

This is a high priority feature request for us, as the current fix involves privs the user may not
have. It is quite common in the biotech world for different dbas to have different privs on various
databases. Dumping the table data, physically deleting the ibdata1 file, and reloading can take
weeks or months for some of our data, and further, it requires unix root privs, as the ibdata1 file
cannot be deleted from within mysql. We do not like telling our dbas to work directly with the
filesystem.

I believe this design decision, needs to be looked at again, as it requires end users to have root
access to their filesystem to do maintainence. Auto-shrink may be arguable, in that the space can
be used in the future, but when we know we are permenantly reducing the size of a db, we should
have a means. Otherwise, the practical impact on large table users is extreme.

Consider the use case of a db that has grown to 20GB, and which we would like to split into two
10G tablespaces. (We do this in Oracle on occasion.) With the current design, we would end up
with a 10G tablespace on the new machine, and a half unused 20G tablespace on the original.

How to repeat:
The procedure for one of these 60 load steps:

Create a new database
Create a schema using innodb tables. Ours has roughly fifty tables, but only a dozen have data
Load 5G of data - note that this will take two or three days for our usage pattern
Export the data
Drop the database
Note that ibdata1 is still quite large.

Suggested fix:
The best possible fix would be automatic space reclamation once the data are deleted. It would
only be slightly less convenient if this required an "Optimize Table FOOBAR" to cause.

Second best would be getting the space back when an entire table is dropped.

If we have to drop the entire database, I could accept that. This is, again, a bit of a pain, but not
absolutely awful.

The current solution of "dump every database on the server, delete the file, and reload" is quite
time consuming and requires filesystem access.

Please do not submit the same bug more than once. An existing
bug report already describes this very problem. Even if you feel
that your issue is somewhat different, the resolution is likely
to be the same. Because of this, we hope you add your comments
to the original bug instead.

Thank you for your interest in MySQL.

same as #1287

oops. It was feature request this time, not a bug :)
even though the ticket text was the same...

Hi!

MySQL-4.1.1 partially solves this problem: you can put

innodb_file_per_table

to my.cnf, and then each InnoDB table is plased into its own .ibd file.

But, of course, that .ibd file never shrinks, unless you reorganize the table with

ALTER TABLE ... TYPE=InnoDB;

And the undo logs are still stored in ibdata files, and those files never shrink in 4.1.1 either :).

Regards,

Heikki

This is good news, as this will solve the most common problems I had with the ever growing ibdata 
files.

Could you add a comment to the manual that alter table type=innodb on a current innodb table 
will compact the idb files?  The logs are, at least, fixed in size and less of a problem, if I 
understand correctly.

I still want it all :-), but this is a very big step in the right direction that solves my most difficult 
problem.

Scott

I would like to add to the list of people who would like to shrink the ibdata1 file without duming the database and reloading it.  Our server has more than a hundred databases, each of which is used by different users, none of whom have shell access to the server.  We have already used the innodb_file_per_table which fixes a lot of the problem, but we are still stuck with a more than 40 gigabyte ibdata1 file that existed before we started using the innodb_file_per_table option.  Our dataset has grown significantly since then, and it would take hours to dump all the data and then restore it, during which time more than 100 web-based applications would be off-line.

I will stipulate that a one-time dump and reload of the data would fix the problem, but that's a lot more work and headache than it sounds like for me.  :)

Just a feature request!

I have just suceeded in filling up my hdd, and can't for the life of be figure out how to get my space back without deleting the 7Gb ibdata1 file. I vote for a fix!

I just have very few datas(about 1000 records) and my ibdata1 is already 300MB!
Doesn't mysql is suppose to be clean and fast?
Can't belive there is a issue like this;(

Changing status to verified as feature request.

Just to add my $0.02, I have just run an ALTER TABLE... ENGINE=InnoDB on all my InnoDB files and my ibdata1 file went *up* in size by 200Mb... This is on 4.1.12.

Colin,

using the my.cnf option:

innodb_file_per_table

would help in disk space management.

In ALTER TABLE, MySQL rebuilds the whole table. That will temporarily cause more space usage in ibdata files, and may cause them to extend. The above my.cnf option would remove that problem.

Regards,

Heikki

Hello, I've followed the recommended instructions for this, but like everyone else I now have suitably sized small files for every table, and still one giant data file that needs to be dropped.  What can I do?

Ty,

unfortunately, the only way to shrink ibdata files is to reload the entire database.

Regards,

Heikki

Well i was stupid enough to NOT include the one-file-per-table and then created a a test schema with lots of stuff in it.. once i was finished testing i dropped it... then noticed i had about 2gb disk space left.. (was > 30 before).. now how do i get it back?

Can't 'optimize' since schema is gone, can't delete innodb since it contains other stuff... so the only way is to backup ALL other schmas, tunr on one-file-per-table and then restore them all, just to get the space of the deleted schma back? that's crazy! where is my "shrink all"???

Ouch. That's pretty bad, I have an almost-full /var partition, I dropped several unused MySQL databases but didn't reclaim almost any space!

Bug #45173 was marked as duplicate of this one.

If the only way to shrink the innodb data file is to to a mysqldump, how about 

I copy all my .ibd files, then

ALTER TABLE tbl_name DISCARD TABLESPACE;

then did a full mysqldump, erased the huge innodb data file, then restored from the dump, then tried to import the .ibd files with

ALTER TABLE tbl_name IMPORT TABLESPACE;

I would also vote to get this issue resolve ASAP.

I currently have a 1 Terabyte ibdata1 file.  Pretty soon the ibdata file will be larger than any single disk can hold.  Either this feature needs to be implemented, or the documentation needs to state that InnoDB is not meant for large datasets.

"Submitted: 	18 Sep 2003 14:40"

7 years to Triage one verified bug?
How many more years to fix it?  With the amount of data being stored in databases ever increasing this kind of thing is becoming more and more critical, and making MySQL less and less suitable.  Do we have any hope of an update or fix on this?

I'm in the same situation currently.  My ibdata1 file is 20Gb and my actual innodb database content is 17Gb.  I need that 20Gb back, and it will take down time I can't really afford to dump the entire database and restore it.

Paul,

please try InnoDB Plugin, it returns space if TRUNCATE TABLE used: http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-other-changes-truncate.html

I think this is my favorite feature request ever. The content is classic. So I must continue in that fashion.

InnoDB doesn't do hash joins. Therefore it isn't fit for large data.

@Sveta: If I understand it correctly InnoDB plugin only releases space if you truncate a table that already had it's own .ibd file. It doesn't enable you to reduce the size of the ibdata1 file.

This is a serious problem -- whether or not it is a bug or a bad design is irrelevant.  This problem, combined with the fact that the only supported backup option is so painfully slow, means that I'm often forced to copy the ibdata file as my "backup" and replace it with a copy of the ibdata file that I made when the database was empty.  It's not portable, and it fails if any settings (or some particular settings) in the my.ini file have changed since I made the copy of the ibdata file.

+1 on this. Clearly not an easy fix for the dev team, but in 7 years we'd hope that we could get somewhere. Currently running 1 billion rows with file_per_table and my innodb1 file still expands (even after archiving 100M's of rows). Only way to get this down is to rebuild the system from scratch. 

Please consider starting to look at this.

For those running with innodb_file_per_table and a growing ibdata1 file, I wonder if http://bugs.mysql.com/bug.php?id=57611 is the cause.

Thanks Mark. Could well be that. Will take a closer look today and leave feedback.

As a last resort, I would offer the workaround I used to enable innodb_file_per_table option without shutting down the database. The idea is: 

    * Configure your original database as master. Unless your database is already using binlogs for security, this is the only step that will require restarting MySQL.
    * Make a backup of the original database using Xtrabackup.
    * Restore the backup and run a second instance of MySQL.
    * Run mysqldump on the second instance.
    * Stop the second instance, but do not delete it yet.
    * Create a new database and start the third instance of MySQL with the enabled option innodb_file_per_table.
    * Restore the dump by feeding it into the third instance of MySQL.
    * Configure the third instance as slave and run the replication.
    * When the initial replication finishes and the slave catches up with the master, reconfigure your clients to use the new instance.
    * That's it. You can stop the first instance now and delete it.

A detailed article can be found here: http://www.linux.com/learn/tutorials/380876-howto-reconfigure-mysql-to-use-innodbfileperta...

I actually am really flabbergasted at this issue.
How can this be laying around for almost 8 years?!
I know MySQL is open source, but come on!
This is a real showstopper for using MySQL in corporate environments.

So let me add my voice on top of all others: Please make a space reclaim feature!

innodb_file_per_table is not a solution. If you set this option you will occur in #51325 ( very very slow drop table and alter table, at least in big instances, with the whole server locked).

I have the usual problem of all schemas using the same ibdata1 tablespace which has now reached 30 gig and wish to move to the innodb_file_per_table scenario . Has anyone managed to swap all schemas over including the system ones in order to delete/remove ibdata1. 

I have used the 'alter table tblname ENGINE=InnoDB  to swith the tables over, I am also using partitioning, but am unsure what to do with the system tables and still need to know what to do with the original ibdata1 file, as until this is removed I am using 'double the space' ?

many thanks 

Richard

Aplogises for multiple posts as CAPTCI kept on failing so thought not posted !!!

This bug report shouldn't really be used to give advice etc but even so:
1. The easiest way is to mysqldump the whole instance and reload into a clean instance. That generates downtime.
2. If you can't afford the downtime of the dump + load build yourself a slave, fix the slave and then promote it to be the main instance. Luckily 30GB is not that large.  This reduces the time to make the copy which you work on while the main instance still runs.
3. ibdata1 will never be empty but it will stay quite small most of the time once you've done steps 1 or 2 above. You can't explicitly move the internal InnoDB areas in ibdata1 anywhere else.

NINE YEARS and we can't get something that will re-claim the unused space?

Can't Oracle devote a little time to this SERIOUS BUG?

I use Virtual Machines for all my dev work, and the vast majority of space the VM takes up is the ibdata1 file. Mine right now is about 5GB. I would guess there isn't more than 1GB of actual data that I use given the size of .sql dumps.

Ideally the ibdata should be dynamic and shrink by itself, but hell, at this point, I will take a command line tool or "ALTER" statement that manually consolidates and removes the cruft.

I really don't understand why the "recommended workaround" for this is to do a mysqldump and reload, when people are complaining about the extremely long time that takes on larges databases. That's the issue. If the DB were small, this wouldn't be as much of an issue. So to recommend reloading as a "solution" is not helpful. I guess "open source" in this case means "we don't care". If that's the case, then, as it's been said on this post -- "mysql is not meant for large databases".

bug #67836 is a duplicate

10 years and this bug/feature request still remains unsolved.

Agreed that this is still an issue and 10 years is RIDICULOUS.

Ten years is because we know it is desired but it is a hard problem. To free the space takes moving lots of internal pointers and data. That is almost as much work as alter table, but in theory it could be done gradually in the background. But would be very complicated with high chance for bugs and performance problems due to the locking needed.

With 5.6 the best combination that is implemented so far is:

1. Innodb_file_per_table, on by default in 5.6. Prevents data from going to shared tablespace.

2. Undo logs in their own tablespace, see http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html Separate Tablespaces for InnoDB Undo Logs . This will prevent undo logs from causing the shared tablespace to grow, but there is still no way to shrink the undo logs.

3. Shared tablespace then contains data dictionary, change buffer and statistics, so is much less likely to grow to big sizes.

4. For some workloads the compressed tables feature will help to reduce data sizes and hence the times. Still some way to go on performance of this for OLTP but it's way better in 5.6 than 5.5, in part due to many improvements suggested by Facebook. Best used in general for tables that don't have the highest change rates but do have big column sizes that compress well.

This definitely does not solve all problems:

5. The ibd files can have free space and the only way to release that is slow alter table or truncate. Slow for big tables, not practical on a live server. But is practical on a system that has failover capability. Can take a window of slow failover time and do it during this window with a known risk that there might be slower failover than usual. If you do this, drop all the non-unique secondary indexes then add them back later, the fast index recreate will save you time and rebuild the indexes with less free space inside the pages.

6. The portable tablespaces feature in 5.6 could then be used to load the tablespaces into a new copy of Mysql with small shared tablespace and redo log. This still requires downtime, so still requires a failover solution, but it's far faster for big tables than mysqldump and reload.

7. Dropping tables should be faster in 5.6 and 5.5.20 or later but it's still going to be slower using innodb_file_per_table than having the tables in the shared tablespace. More practical for big tables that aren't dropped often than for small tables that are very regularly dropped. For the big/infrequent combination, most people already use innodb_file_per_table. The work on this in bug http://bugs.mysql.com/bug.php?id=51325 helps a lot but there's still scope for more.

So we know it's desirable, some major improvements that help the workarounds, but still no way to make it practical online. For now, failover based workarounds are still the way to go. Not ideal, but at least doable.

James Day, MySQL Senior Principal Support Engineer, Oracle

Any Progress about this?
Will Mysql 5.7 include this featture?

Any Progress about this?
Will Mysql 5.7 include this featture?

The possible of doing a crosscheck (and automatic discard) between Mysql dictionary and "tablespaces" would be very useful.

Hi!

I cannot speak about projected version numbers or time frames, but I can tell you that we are working on this issue.

If you have a MySQL Enterprise subscription, I would encourage you to open a support ticket about the feature. We can then note your interest, track the status for you, and provide you with updates.

Thank you for your interest in MySQL!

Best Regards,

Matt

In the development 5.7.5 server we're hoping to release the innodb_undo_log_truncate setting. When using undo log files outside the shared tablespace this options causes them to be periodically truncated, freeing up the disk space that they take.

This doesn't assist those who already have their undo logs in the shared tablespace but it will be useful to those who have exploited the option to put the redo log files outside the shared tablespace in 5.6 and later. Documentation is lagging a little and for now all you can see is the mention in the 5.7.5 release note at http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-5.html .

James Day, MySQL Senior Principal Support Engineer, Oracle

James, that's good news.

However, it would be _really_ nice to have a procedure to allow (with mysqld down, as doing it online I guess is pretty hard) a rebuild of ibdata1, to clean out stuff.

If using innodb_file_per_table and if the server has been shutdown cleanly then ibdata1 is not expected to be very large, so a rebuild procedure which builds a new ibdata1 and replaces the existing file would be most helpful to us all.  That would be a time where the undo log information could be migrated to a new tablespace, something which can't be done on existing systems, as you can't migrate this on an existing system according to: http://dev.mysql.com/doc/refman/5.7/en/innodb-undo-tablespace.html

I see you have things like this: http://dev.mysql.com/doc/refman/5.7/en/innochecksum.html so maybe a couple of extra utilities which allow for the "shutdown server" to be reconfigured would be really helpful and allow us to free up space that unintentionally (or by innodb design) made the ibdata1 file grow.

I have several instances with a ibdata1 file which grew due to undo log info and every time I clone these instances I have to clone an extra 300GB of useless data.  The servers are large and doing a dump + clean reload is not really practical and the number of servers affected is quite high, so being able to take advantage of a 5.6 to 5.7 upgrade and run a innodb_cleanup_ibdata script which I'd expect to probably take a few seconds would be something to easily incorporate into our upgrade procedures.

That would make life great, and allow me to using new features, save on disk space and do that in an upgrade window which I control.

Simon, yes, I know it would be nice.

Run time with the server down would not be short. To do it would require moving all used pages into the space that would remain and also updating all of the pointers into those used pages. If the space use is significant, the time for that amount of IO would also be significant. Unless innodb_fast_shutdown was set to 0 we'd probably also require setting it to 0 instead of the usual 1 before this to avoid the change buffer related work.

That also doesn't help those who have the drop table performance reason for wanting to keep tables in the shared tablespace.

I have a challenge with this sort of thing. We know that most people have used innodb_file_per_table for years now. So we know that the population of servers that has a lot of data in the shared tablespace is shrinking over time. The change to default in 5.6 should accelerate that, though we deliberately waited until most people already were using innodb_file_per_table. We also know that the percentage of servers that have undo logs in the shared tablespace is dropping now those can be put outside the shared tablespace.

Say I could advocate such a tool or instead advocate better tablespace management. I know there are people who want to put all data for one customer on a shared host in one tablespace file, or want that per database or arbitrarily or want that and ease of moving tables between tablespaces and tables or tablespaces between servers for things like load balancing. Or we can add more improvements to data recovery for those unfortunate enough to have no or damaged backups.

I know that we can do things that can reduce the cost of dropping a table outside the shared tablespace. That requires a scan of the buffer pool(s) each time but we could do that in the background and perhaps for several tables at once. We could also do things like assigning individual tables to different buffer pools to speed up the scan by skipping most of the buffer pool size, limiting it to only one pool. Those things might eliminate that reason to want tables in the shared tablespace.

For myself, I choose to advocate things like those before a tool like the one you described. I think that would deliver benefits to more people and also make it easier for those who can move away from the one shared tablespace to do that. Yet I know that those choices will frustrate those who do still have data in the shared tablespace and don't want to or for some reason can't move it to another server. And of course I'm nowhere near close to being the only one who has priorities - Matt Lord in particular is the server product manager.

But notice that I wrote about moving tables between tablespaces? That will leave space in the original tablespace, so what do we do about that? At some point we'll want to provide a way to free that space. And that may be adaptable as part of a similar capability for the shared tablespace. So eventually we may get to the sort of tool that you're thinking of, but applicable to a broader group of use cases.

Of course none of that means I disagree with you or others who have commented here. I don't, I agree we should make this better. But I also have to choose what I think should be done first, usually on the basis of trying to benefit the greatest number or those at risk of data loss first.

For your cloning problem, at least for now, what I suggest is that you:

1. Exploit the ability to put an individual InnoDB table wherever you want to mount the filesystem of the next clone on a source machine, then use ALTER TABLE to move each table in turn out of the shared tablespace on the source box to the mounted disk of the destination box.
2. Once that is done, export the tablespaces.
3. Copy across any non-InnoDB data.
4. Now create a new MySQL instance on the to be clone server and set it up with innodb_file_per_table and the undo logs outside the shared tablespace. Start it up and import the tables that are already on its disk.
5. Test.
6. Then clone in reverse from this new clone back to the source machine.

This takes at least twice as much time as one clone (more random accesses) but leaves you with two machines that have the small shared tablespace and you can forget about that problem happening again for either. Eventually you can clone from these to the others as time allows.

These of course are just my own views and illustrations of the sort of trade-offs that I make when deciding what to advocate. Nothing here is a commitment to us doing anything or doing it in any particular timeframe. It's just how I'm thinking and why. That's not the official Oracle view, just mine.

James Day, MySQL Senior Principal Support Engineer, Oracle

We've announced several new 5.7 labs releases. One of those includes broader support of tablespaces for InnoDB. As usual with a lot going on the documentation has not yet been fully updated to reflect all of the changes, expect that to take a few weeks. In addition it's likely that there will be related presentations at OpenWorld.

This should further reduce the need to have data tables stored within the shared tablespace, as well as making it easier to manage sets of tables.

The announcement is at http://mysqlserverteam.com/mysql-labs-releases-openworld-2014/ :

"InnoDB: General Tablespaces

We showcase CREATE TABLESPACE for general use. This will give the user freedom to choose the mapping between tables and tablespaces, i.e. which tablespaces to create and what tables they should contain. This allows for doing things such as grouping all tables for a given user or customer within a single tablespace, and thus having all of their data within a single file on the filesystem."

Please check out the labs release as soon as practical so that we'll have a chance to incorporate feedback before the next production release of the server. As with the DMRs, strictly the presence of a feature in a labs release does not commit us to having it in a production release, so don't count on a feature being present until it does actually show up in the GA release.

James Day, MySQL Senior Principal Support Engineer, Oracle

I've just come across this post on the blog:

http://mysqlserverteam.com/online-truncate-of-innodb-undo-tablespaces/

Does this mean this bug can be marked as resolved?

Sadly no, it's not resolved by that. That improvement and the previous change in 5.6 to allow moving the undo data to different files helps but there are still good reasons for people to want to use the shared tablespace for data or, if they move the data out, to have a good way to free the space in the filesystem.

The main good reason to use the shared tablespace for data today is the performance of drop table, which is worse when the data is in its own *.ibd file. So places that create and drop lots of small tables can be particularly badly affected if they don't keep data in the shared tablespace. This effect is worse when the InnoDB buffer pool is bigger because a scan of the whole buffer pool is needed.

James Day, MySQL Senior Principal Support Engineer, Oracle

It looks as though the drop table difference was removed in 5.6. I have some more work to do on that, possibly including a blog post to illustrate the changes through the various server versions. I'll confirm and update when I'm done.

James Day, MySQL Senior Principal Support Engineer, Oracle

The now-GA 5.7 provides a feature to create general tablespaces anywhere you like and store multiple tables in them, as well as move individual tables between tablespaces, including per-file tablespaces and the system tablespace. Described at https://dev.mysql.com/doc/refman/5.7/en/general-tablespaces.html . As with the shared tablespace, once space has been allocated from the filesystem there is no way to release it. If you need that capability, use per-table tablespaces instead, those do fully release the space allocated to a table when it is dropped.

This provides no way to make the shared tablespace smaller but it can be used to reduce the pain of emptying it by moving tables to a different common tablespace. Combine that with putting the undo log files in a different place, as they are now by default, and with temporary tables going to their own dedicated tablespace and there will be relatively little left in the shared tablespace.

Even so, the process of moving tables and doing a new install to get rid of the disk space allocated to the shared tablespace is painful for installations that do have lots of data or redo there.

James Day, MySQL Senipr Principal Support Engineer, Oracle

Have you considered making the ibddata1 and similar files sparse?
As far as I can see, even windows support sparse files.
This would, of course, make the most sense if unused portions of the files are zeroed, but that should be possible to implement, if not already?

Best regards,
Henrik Juul Pedersen, Embedded Software Engineer, LIAB (Linux In A Box)

Our plans for MySQL 8 in this area are described at https://mysqlserverteam.com/mysql-8-0-2-more-flexible-undo-tablespace-management/ . Briefly, we plan to move any undo space out of the shared tablespace and turn on the automatic truncating process introduced in 5.7 by default. This way a big transaction may use a lot of disc space but only temporarily. This protects currently unaffected installations against ever suffering from one of the most common big ibdata1 annoyances, one made big by undo log / rollback segment entries.

This doesn't address the general ability to shrink the tablespace, though, just one of the causes of the need for that.

In general, we're gradually effectively eliminating almost all traditional uses of ibdata, replacing what it has with more capable features storing their information somewhere else.

James Day, MySQL Senior Principal Support Engineer, Oracle