Bug #1341 | InnoDB ibdata1 never shrinks after data is removed | ||
---|---|---|---|
Submitted: | 18 Sep 2003 12:40 | Modified: | 30 Nov 2005 19:50 |
Reporter: | Scott Ellsworth | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S4 (Feature request) |
Version: | All versions | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
[18 Sep 2003 12:40]
Scott Ellsworth
[18 Sep 2003 13:19]
Sergei Golubchik
Please do not submit the same bug more than once. An existing bug report already describes this very problem. Even if you feel that your issue is somewhat different, the resolution is likely to be the same. Because of this, we hope you add your comments to the original bug instead. Thank you for your interest in MySQL. same as #1287
[18 Sep 2003 13:20]
Sergei Golubchik
oops. It was feature request this time, not a bug :) even though the ticket text was the same...
[29 Jan 2004 12:36]
Heikki Tuuri
Hi! MySQL-4.1.1 partially solves this problem: you can put innodb_file_per_table to my.cnf, and then each InnoDB table is plased into its own .ibd file. But, of course, that .ibd file never shrinks, unless you reorganize the table with ALTER TABLE ... TYPE=InnoDB; And the undo logs are still stored in ibdata files, and those files never shrink in 4.1.1 either :). Regards, Heikki
[2 Feb 2004 10:39]
Scott Ellsworth
This is good news, as this will solve the most common problems I had with the ever growing ibdata files. Could you add a comment to the manual that alter table type=innodb on a current innodb table will compact the idb files? The logs are, at least, fixed in size and less of a problem, if I understand correctly. I still want it all :-), but this is a very big step in the right direction that solves my most difficult problem. Scott
[18 Aug 2005 23:42]
Tim Gustafson
I would like to add to the list of people who would like to shrink the ibdata1 file without duming the database and reloading it. Our server has more than a hundred databases, each of which is used by different users, none of whom have shell access to the server. We have already used the innodb_file_per_table which fixes a lot of the problem, but we are still stuck with a more than 40 gigabyte ibdata1 file that existed before we started using the innodb_file_per_table option. Our dataset has grown significantly since then, and it would take hours to dump all the data and then restore it, during which time more than 100 web-based applications would be off-line. I will stipulate that a one-time dump and reload of the data would fix the problem, but that's a lot more work and headache than it sounds like for me. :) Just a feature request!
[11 Nov 2005 16:09]
Tim Abell
I have just suceeded in filling up my hdd, and can't for the life of be figure out how to get my space back without deleting the 7Gb ibdata1 file. I vote for a fix!
[24 Nov 2005 15:12]
pan li
I just have very few datas(about 1000 records) and my ibdata1 is already 300MB! Doesn't mysql is suppose to be clean and fast? Can't belive there is a issue like this;(
[30 Nov 2005 19:50]
MySQL Verification Team
Changing status to verified as feature request.
[4 Jan 2006 11:11]
Colin Guthrie
Just to add my $0.02, I have just run an ALTER TABLE... ENGINE=InnoDB on all my InnoDB files and my ibdata1 file went *up* in size by 200Mb... This is on 4.1.12.
[4 Jan 2006 16:52]
Heikki Tuuri
Colin, using the my.cnf option: innodb_file_per_table would help in disk space management. In ALTER TABLE, MySQL rebuilds the whole table. That will temporarily cause more space usage in ibdata files, and may cause them to extend. The above my.cnf option would remove that problem. Regards, Heikki
[5 Jun 2006 21:04]
Ty Schalter
Hello, I've followed the recommended instructions for this, but like everyone else I now have suitably sized small files for every table, and still one giant data file that needs to be dropped. What can I do?
[14 Jun 2006 16:14]
Heikki Tuuri
Ty, unfortunately, the only way to shrink ibdata files is to reload the entire database. Regards, Heikki
[27 Aug 2008 12:46]
Frank Osterberg
Well i was stupid enough to NOT include the one-file-per-table and then created a a test schema with lots of stuff in it.. once i was finished testing i dropped it... then noticed i had about 2gb disk space left.. (was > 30 before).. now how do i get it back? Can't 'optimize' since schema is gone, can't delete innodb since it contains other stuff... so the only way is to backup ALL other schmas, tunr on one-file-per-table and then restore them all, just to get the space of the deleted schma back? that's crazy! where is my "shrink all"???
[19 Dec 2008 13:18]
Daniel Serodio
Ouch. That's pretty bad, I have an almost-full /var partition, I dropped several unused MySQL databases but didn't reclaim almost any space!
[29 Jun 2009 6:31]
Sveta Smirnova
Bug #45173 was marked as duplicate of this one.
[11 Sep 2009 19:34]
Mike McCloskey
If the only way to shrink the innodb data file is to to a mysqldump, how about I copy all my .ibd files, then ALTER TABLE tbl_name DISCARD TABLESPACE; then did a full mysqldump, erased the huge innodb data file, then restored from the dump, then tried to import the .ibd files with ALTER TABLE tbl_name IMPORT TABLESPACE;
[18 Dec 2009 11:54]
Pratik Thakkar
I would also vote to get this issue resolve ASAP.
[23 Mar 2010 7:58]
Philip Tellis
I currently have a 1 Terabyte ibdata1 file. Pretty soon the ibdata file will be larger than any single disk can hold. Either this feature needs to be implemented, or the documentation needs to state that InnoDB is not meant for large datasets.
[16 Jun 2010 17:00]
Paul Graydon
"Submitted: 18 Sep 2003 14:40" 7 years to Triage one verified bug? How many more years to fix it? With the amount of data being stored in databases ever increasing this kind of thing is becoming more and more critical, and making MySQL less and less suitable. Do we have any hope of an update or fix on this? I'm in the same situation currently. My ibdata1 file is 20Gb and my actual innodb database content is 17Gb. I need that 20Gb back, and it will take down time I can't really afford to dump the entire database and restore it.
[16 Jun 2010 18:15]
Sveta Smirnova
Paul, please try InnoDB Plugin, it returns space if TRUNCATE TABLE used: http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-other-changes-truncate.html
[14 Jul 2010 12:41]
Mark Callaghan
I think this is my favorite feature request ever. The content is classic. So I must continue in that fashion. InnoDB doesn't do hash joins. Therefore it isn't fit for large data.
[29 Jul 2010 23:23]
Bart van Bragt
@Sveta: If I understand it correctly InnoDB plugin only releases space if you truncate a table that already had it's own .ibd file. It doesn't enable you to reduce the size of the ibdata1 file.
[21 Oct 2010 13:48]
Dan Phifer
This is a serious problem -- whether or not it is a bug or a bad design is irrelevant. This problem, combined with the fact that the only supported backup option is so painfully slow, means that I'm often forced to copy the ibdata file as my "backup" and replace it with a copy of the ibdata file that I made when the database was empty. It's not portable, and it fails if any settings (or some particular settings) in the my.ini file have changed since I made the copy of the ibdata file.
[28 Oct 2010 2:49]
Daniel Fiske
+1 on this. Clearly not an easy fix for the dev team, but in 7 years we'd hope that we could get somewhere. Currently running 1 billion rows with file_per_table and my innodb1 file still expands (even after archiving 100M's of rows). Only way to get this down is to rebuild the system from scratch. Please consider starting to look at this.
[28 Oct 2010 3:56]
Mark Callaghan
For those running with innodb_file_per_table and a growing ibdata1 file, I wonder if http://bugs.mysql.com/bug.php?id=57611 is the cause.
[28 Oct 2010 7:00]
Daniel Fiske
Thanks Mark. Could well be that. Will take a closer look today and leave feedback.
[23 Nov 2010 20:13]
Dmitri Minaev
As a last resort, I would offer the workaround I used to enable innodb_file_per_table option without shutting down the database. The idea is: * Configure your original database as master. Unless your database is already using binlogs for security, this is the only step that will require restarting MySQL. * Make a backup of the original database using Xtrabackup. * Restore the backup and run a second instance of MySQL. * Run mysqldump on the second instance. * Stop the second instance, but do not delete it yet. * Create a new database and start the third instance of MySQL with the enabled option innodb_file_per_table. * Restore the dump by feeding it into the third instance of MySQL. * Configure the third instance as slave and run the replication. * When the initial replication finishes and the slave catches up with the master, reconfigure your clients to use the new instance. * That's it. You can stop the first instance now and delete it. A detailed article can be found here: http://www.linux.com/learn/tutorials/380876-howto-reconfigure-mysql-to-use-innodbfileperta...
[17 Aug 2011 11:57]
Zouhair Himdi
I actually am really flabbergasted at this issue. How can this be laying around for almost 8 years?! I know MySQL is open source, but come on! This is a real showstopper for using MySQL in corporate environments. So let me add my voice on top of all others: Please make a space reclaim feature!
[29 Aug 2011 10:57]
Luca Visconti
innodb_file_per_table is not a solution. If you set this option you will occur in #51325 ( very very slow drop table and alter table, at least in big instances, with the whole server locked).
[23 Jan 2012 13:31]
Richard Hudson
I have the usual problem of all schemas using the same ibdata1 tablespace which has now reached 30 gig and wish to move to the innodb_file_per_table scenario . Has anyone managed to swap all schemas over including the system ones in order to delete/remove ibdata1. I have used the 'alter table tblname ENGINE=InnoDB to swith the tables over, I am also using partitioning, but am unsure what to do with the system tables and still need to know what to do with the original ibdata1 file, as until this is removed I am using 'double the space' ? many thanks Richard
[23 Jan 2012 13:38]
Richard Hudson
Aplogises for multiple posts as CAPTCI kept on failing so thought not posted !!!
[23 Jan 2012 13:39]
Simon Mudd
This bug report shouldn't really be used to give advice etc but even so: 1. The easiest way is to mysqldump the whole instance and reload into a clean instance. That generates downtime. 2. If you can't afford the downtime of the dump + load build yourself a slave, fix the slave and then promote it to be the main instance. Luckily 30GB is not that large. This reduces the time to make the copy which you work on while the main instance still runs. 3. ibdata1 will never be empty but it will stay quite small most of the time once you've done steps 1 or 2 above. You can't explicitly move the internal InnoDB areas in ibdata1 anywhere else.
[21 Aug 2012 22:32]
Daevid Vincent
NINE YEARS and we can't get something that will re-claim the unused space? Can't Oracle devote a little time to this SERIOUS BUG? I use Virtual Machines for all my dev work, and the vast majority of space the VM takes up is the ibdata1 file. Mine right now is about 5GB. I would guess there isn't more than 1GB of actual data that I use given the size of .sql dumps. Ideally the ibdata should be dynamic and shrink by itself, but hell, at this point, I will take a command line tool or "ALTER" statement that manually consolidates and removes the cruft.
[28 Sep 2012 16:46]
Ken Hanks
I really don't understand why the "recommended workaround" for this is to do a mysqldump and reload, when people are complaining about the extremely long time that takes on larges databases. That's the issue. If the DB were small, this wouldn't be as much of an issue. So to recommend reloading as a "solution" is not helpful. I guess "open source" in this case means "we don't care". If that's the case, then, as it's been said on this post -- "mysql is not meant for large databases".
[3 Jan 2013 8:09]
MySQL Verification Team
bug #67836 is a duplicate
[25 Jan 2013 16:32]
Gleidson Gleidson
10 years and this bug/feature request still remains unsolved.
[1 May 2013 23:03]
Colleen Ross
Agreed that this is still an issue and 10 years is RIDICULOUS.
[23 Jul 2013 21:29]
James Day
Ten years is because we know it is desired but it is a hard problem. To free the space takes moving lots of internal pointers and data. That is almost as much work as alter table, but in theory it could be done gradually in the background. But would be very complicated with high chance for bugs and performance problems due to the locking needed. With 5.6 the best combination that is implemented so far is: 1. Innodb_file_per_table, on by default in 5.6. Prevents data from going to shared tablespace. 2. Undo logs in their own tablespace, see http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html Separate Tablespaces for InnoDB Undo Logs . This will prevent undo logs from causing the shared tablespace to grow, but there is still no way to shrink the undo logs. 3. Shared tablespace then contains data dictionary, change buffer and statistics, so is much less likely to grow to big sizes. 4. For some workloads the compressed tables feature will help to reduce data sizes and hence the times. Still some way to go on performance of this for OLTP but it's way better in 5.6 than 5.5, in part due to many improvements suggested by Facebook. Best used in general for tables that don't have the highest change rates but do have big column sizes that compress well. This definitely does not solve all problems: 5. The ibd files can have free space and the only way to release that is slow alter table or truncate. Slow for big tables, not practical on a live server. But is practical on a system that has failover capability. Can take a window of slow failover time and do it during this window with a known risk that there might be slower failover than usual. If you do this, drop all the non-unique secondary indexes then add them back later, the fast index recreate will save you time and rebuild the indexes with less free space inside the pages. 6. The portable tablespaces feature in 5.6 could then be used to load the tablespaces into a new copy of Mysql with small shared tablespace and redo log. This still requires downtime, so still requires a failover solution, but it's far faster for big tables than mysqldump and reload. 7. Dropping tables should be faster in 5.6 and 5.5.20 or later but it's still going to be slower using innodb_file_per_table than having the tables in the shared tablespace. More practical for big tables that aren't dropped often than for small tables that are very regularly dropped. For the big/infrequent combination, most people already use innodb_file_per_table. The work on this in bug http://bugs.mysql.com/bug.php?id=51325 helps a lot but there's still scope for more. So we know it's desirable, some major improvements that help the workarounds, but still no way to make it practical online. For now, failover based workarounds are still the way to go. Not ideal, but at least doable. James Day, MySQL Senior Principal Support Engineer, Oracle
[24 Apr 2014 13:31]
DB Support
Any Progress about this? Will Mysql 5.7 include this featture?
[24 Apr 2014 13:33]
DB Support
Any Progress about this? Will Mysql 5.7 include this featture? The possible of doing a crosscheck (and automatic discard) between Mysql dictionary and "tablespaces" would be very useful.
[24 Apr 2014 18:52]
Matthew Lord
Hi! I cannot speak about projected version numbers or time frames, but I can tell you that we are working on this issue. If you have a MySQL Enterprise subscription, I would encourage you to open a support ticket about the feature. We can then note your interest, track the status for you, and provide you with updates. Thank you for your interest in MySQL! Best Regards, Matt
[5 Sep 2014 2:23]
James Day
In the development 5.7.5 server we're hoping to release the innodb_undo_log_truncate setting. When using undo log files outside the shared tablespace this options causes them to be periodically truncated, freeing up the disk space that they take. This doesn't assist those who already have their undo logs in the shared tablespace but it will be useful to those who have exploited the option to put the redo log files outside the shared tablespace in 5.6 and later. Documentation is lagging a little and for now all you can see is the mention in the 5.7.5 release note at http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-5.html . James Day, MySQL Senior Principal Support Engineer, Oracle
[5 Sep 2014 8:42]
Simon Mudd
James, that's good news. However, it would be _really_ nice to have a procedure to allow (with mysqld down, as doing it online I guess is pretty hard) a rebuild of ibdata1, to clean out stuff. If using innodb_file_per_table and if the server has been shutdown cleanly then ibdata1 is not expected to be very large, so a rebuild procedure which builds a new ibdata1 and replaces the existing file would be most helpful to us all. That would be a time where the undo log information could be migrated to a new tablespace, something which can't be done on existing systems, as you can't migrate this on an existing system according to: http://dev.mysql.com/doc/refman/5.7/en/innodb-undo-tablespace.html I see you have things like this: http://dev.mysql.com/doc/refman/5.7/en/innochecksum.html so maybe a couple of extra utilities which allow for the "shutdown server" to be reconfigured would be really helpful and allow us to free up space that unintentionally (or by innodb design) made the ibdata1 file grow. I have several instances with a ibdata1 file which grew due to undo log info and every time I clone these instances I have to clone an extra 300GB of useless data. The servers are large and doing a dump + clean reload is not really practical and the number of servers affected is quite high, so being able to take advantage of a 5.6 to 5.7 upgrade and run a innodb_cleanup_ibdata script which I'd expect to probably take a few seconds would be something to easily incorporate into our upgrade procedures. That would make life great, and allow me to using new features, save on disk space and do that in an upgrade window which I control.
[6 Sep 2014 4:22]
James Day
Simon, yes, I know it would be nice. Run time with the server down would not be short. To do it would require moving all used pages into the space that would remain and also updating all of the pointers into those used pages. If the space use is significant, the time for that amount of IO would also be significant. Unless innodb_fast_shutdown was set to 0 we'd probably also require setting it to 0 instead of the usual 1 before this to avoid the change buffer related work. That also doesn't help those who have the drop table performance reason for wanting to keep tables in the shared tablespace. I have a challenge with this sort of thing. We know that most people have used innodb_file_per_table for years now. So we know that the population of servers that has a lot of data in the shared tablespace is shrinking over time. The change to default in 5.6 should accelerate that, though we deliberately waited until most people already were using innodb_file_per_table. We also know that the percentage of servers that have undo logs in the shared tablespace is dropping now those can be put outside the shared tablespace. Say I could advocate such a tool or instead advocate better tablespace management. I know there are people who want to put all data for one customer on a shared host in one tablespace file, or want that per database or arbitrarily or want that and ease of moving tables between tablespaces and tables or tablespaces between servers for things like load balancing. Or we can add more improvements to data recovery for those unfortunate enough to have no or damaged backups. I know that we can do things that can reduce the cost of dropping a table outside the shared tablespace. That requires a scan of the buffer pool(s) each time but we could do that in the background and perhaps for several tables at once. We could also do things like assigning individual tables to different buffer pools to speed up the scan by skipping most of the buffer pool size, limiting it to only one pool. Those things might eliminate that reason to want tables in the shared tablespace. For myself, I choose to advocate things like those before a tool like the one you described. I think that would deliver benefits to more people and also make it easier for those who can move away from the one shared tablespace to do that. Yet I know that those choices will frustrate those who do still have data in the shared tablespace and don't want to or for some reason can't move it to another server. And of course I'm nowhere near close to being the only one who has priorities - Matt Lord in particular is the server product manager. But notice that I wrote about moving tables between tablespaces? That will leave space in the original tablespace, so what do we do about that? At some point we'll want to provide a way to free that space. And that may be adaptable as part of a similar capability for the shared tablespace. So eventually we may get to the sort of tool that you're thinking of, but applicable to a broader group of use cases. Of course none of that means I disagree with you or others who have commented here. I don't, I agree we should make this better. But I also have to choose what I think should be done first, usually on the basis of trying to benefit the greatest number or those at risk of data loss first. For your cloning problem, at least for now, what I suggest is that you: 1. Exploit the ability to put an individual InnoDB table wherever you want to mount the filesystem of the next clone on a source machine, then use ALTER TABLE to move each table in turn out of the shared tablespace on the source box to the mounted disk of the destination box. 2. Once that is done, export the tablespaces. 3. Copy across any non-InnoDB data. 4. Now create a new MySQL instance on the to be clone server and set it up with innodb_file_per_table and the undo logs outside the shared tablespace. Start it up and import the tables that are already on its disk. 5. Test. 6. Then clone in reverse from this new clone back to the source machine. This takes at least twice as much time as one clone (more random accesses) but leaves you with two machines that have the small shared tablespace and you can forget about that problem happening again for either. Eventually you can clone from these to the others as time allows. These of course are just my own views and illustrations of the sort of trade-offs that I make when deciding what to advocate. Nothing here is a commitment to us doing anything or doing it in any particular timeframe. It's just how I'm thinking and why. That's not the official Oracle view, just mine. James Day, MySQL Senior Principal Support Engineer, Oracle
[27 Sep 2014 4:56]
James Day
We've announced several new 5.7 labs releases. One of those includes broader support of tablespaces for InnoDB. As usual with a lot going on the documentation has not yet been fully updated to reflect all of the changes, expect that to take a few weeks. In addition it's likely that there will be related presentations at OpenWorld. This should further reduce the need to have data tables stored within the shared tablespace, as well as making it easier to manage sets of tables. The announcement is at http://mysqlserverteam.com/mysql-labs-releases-openworld-2014/ : "InnoDB: General Tablespaces We showcase CREATE TABLESPACE for general use. This will give the user freedom to choose the mapping between tables and tablespaces, i.e. which tablespaces to create and what tables they should contain. This allows for doing things such as grouping all tables for a given user or customer within a single tablespace, and thus having all of their data within a single file on the filesystem." Please check out the labs release as soon as practical so that we'll have a chance to incorporate feedback before the next production release of the server. As with the DMRs, strictly the presence of a feature in a labs release does not commit us to having it in a production release, so don't count on a feature being present until it does actually show up in the GA release. James Day, MySQL Senior Principal Support Engineer, Oracle
[29 Oct 2014 8:56]
Prasanth Gangaraju
I've just come across this post on the blog: http://mysqlserverteam.com/online-truncate-of-innodb-undo-tablespaces/ Does this mean this bug can be marked as resolved?
[7 Nov 2014 12:21]
James Day
Sadly no, it's not resolved by that. That improvement and the previous change in 5.6 to allow moving the undo data to different files helps but there are still good reasons for people to want to use the shared tablespace for data or, if they move the data out, to have a good way to free the space in the filesystem. The main good reason to use the shared tablespace for data today is the performance of drop table, which is worse when the data is in its own *.ibd file. So places that create and drop lots of small tables can be particularly badly affected if they don't keep data in the shared tablespace. This effect is worse when the InnoDB buffer pool is bigger because a scan of the whole buffer pool is needed. James Day, MySQL Senior Principal Support Engineer, Oracle
[8 Nov 2014 11:50]
James Day
It looks as though the drop table difference was removed in 5.6. I have some more work to do on that, possibly including a blog post to illustrate the changes through the various server versions. I'll confirm and update when I'm done. James Day, MySQL Senior Principal Support Engineer, Oracle
[21 Oct 2015 17:57]
James Day
The now-GA 5.7 provides a feature to create general tablespaces anywhere you like and store multiple tables in them, as well as move individual tables between tablespaces, including per-file tablespaces and the system tablespace. Described at https://dev.mysql.com/doc/refman/5.7/en/general-tablespaces.html . As with the shared tablespace, once space has been allocated from the filesystem there is no way to release it. If you need that capability, use per-table tablespaces instead, those do fully release the space allocated to a table when it is dropped. This provides no way to make the shared tablespace smaller but it can be used to reduce the pain of emptying it by moving tables to a different common tablespace. Combine that with putting the undo log files in a different place, as they are now by default, and with temporary tables going to their own dedicated tablespace and there will be relatively little left in the shared tablespace. Even so, the process of moving tables and doing a new install to get rid of the disk space allocated to the shared tablespace is painful for installations that do have lots of data or redo there. James Day, MySQL Senipr Principal Support Engineer, Oracle
[15 Aug 2016 8:50]
Henrik Juul Pedersen
Have you considered making the ibddata1 and similar files sparse? As far as I can see, even windows support sparse files. This would, of course, make the most sense if unused portions of the files are zeroed, but that should be possible to implement, if not already? Best regards, Henrik Juul Pedersen, Embedded Software Engineer, LIAB (Linux In A Box)
[17 Mar 2018 2:16]
James Day
Our plans for MySQL 8 in this area are described at https://mysqlserverteam.com/mysql-8-0-2-more-flexible-undo-tablespace-management/ . Briefly, we plan to move any undo space out of the shared tablespace and turn on the automatic truncating process introduced in 5.7 by default. This way a big transaction may use a lot of disc space but only temporarily. This protects currently unaffected installations against ever suffering from one of the most common big ibdata1 annoyances, one made big by undo log / rollback segment entries. This doesn't address the general ability to shrink the tablespace, though, just one of the causes of the need for that. In general, we're gradually effectively eliminating almost all traditional uses of ibdata, replacing what it has with more capable features storing their information somewhere else. James Day, MySQL Senior Principal Support Engineer, Oracle