Bug #56347 | temp space filling up | ||
---|---|---|---|
Submitted: | 28 Aug 2010 21:07 | Modified: | 17 Dec 2015 23:44 |
Reporter: | Jamie Koc | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S1 (Critical) |
Version: | 5.0.27-standard, 5.0.91 | OS: | Linux |
Assigned to: | Paul DuBois | CPU Architecture: | Any |
Tags: | /var/lib/mysql/tmp, tmp space full |
[28 Aug 2010 21:07]
Jamie Koc
[29 Aug 2010 10:43]
Valeriy Kravchuk
Please, check if the same problem happens with recent 5.0.x version, 5.0.91.
[5 Sep 2010 18:17]
Jamie Koc
The same error occurs with version 5.0.91. I also noticed this same behavior with version 5.1.42-community-log using the outfile. On the server with the 5.1.42 version, I am dumping 250 million records using outfile (table size is 30G) with an select * from table order by. The server starts with 330G free and as I watch the process run, the server space drops down to 130G free. Why does MySQL need so much disk space (200G) to perform a sort on a 30G table?
[16 Sep 2010 13:18]
Susanne Ebrecht
How much indexes do you have?
[22 Sep 2010 5:40]
Jamie Koc
On the table that has 300 million records, there is one index. It takes up about 1.8G of space.
[23 Sep 2010 9:04]
Susanne Ebrecht
It needs so much space for generating the indexes and the table itself. Consider, during creating the index and or updating the index sometimes MySQL needs to copy all. You said you just have one index. I guess its the primary key or? Consider, every key also is an index. You easily can test if the problem is related to the index. Insert the data without that the table has indexes or key. Not even a primary key. After that alter table and add e.g. primary key or another index or foreign key. And watch how much space is needed.
[24 Sep 2010 8:00]
Jamie Koc
So there are two separate issues here that seem to share the same symptoms. 1. Select into outfile with an order by clause uses approximately 8 times the amount disk space as the table size itself. I have a table with 300 million records (~approx 35G of disk storage) with one index on that table (not a primary key). When I select the data into an outfile, MySQL uses almost 300G of disk space during the process. In checking the processlist, the select does a state=sorting data (which uses 300G), then space is freed, and then the process does state=send data using up disk space again. 2. In my original issue, I have a problem with doing an insert select * from table with an order by. I believe the order by is causing the disk space issue here as well. There are no indexes on the table the data is being inserted into. Why does MySQL use so much disk space? Is there a config setting that can help here?
[27 Sep 2010 13:11]
Susanne Ebrecht
Is your charset utf8?
[27 Sep 2010 17:31]
Jamie Koc
DEFAULT CHARSET=latin1
[28 Sep 2010 8:47]
Susanne Ebrecht
Please show output of: SHOW CREATE TABLE <your_table>; Also the second question I have is why are you using ORDER BY on INSERT? ORDER BY only makes sense when you want to display the data and not when you just want to use INSERT INTO SELECT. Besides, you want to auto-fill an auto-increment or so.
[28 Sep 2010 19:45]
Jamie Koc
Here is the create table statement: mysql> show create table event_2010_09_28 \G; *************************** 1. row *************************** Table: event_2010_09_28 Create Table: CREATE TABLE `event_2010_09_28` ( `file_id` int(11) NOT NULL DEFAULT '0', `id` bigint(20) NOT NULL DEFAULT '0', `cre_date` datetime DEFAULT NULL, `record_type_id` tinyint(4) NOT NULL DEFAULT '0', `customer_id` varchar(255) DEFAULT NULL, `member_id` varchar(10) DEFAULT NULL, `cid` varchar(240) DEFAULT NULL, `department_id` varchar(240) DEFAULT NULL, `member_department_id` int(11) DEFAULT NULL, `tid` varchar(245) DEFAULT NULL, `sale` decimal(9,2) DEFAULT NULL, `total` decimal(10,2) DEFAULT NULL, `zip` varchar(12) DEFAULT NULL, `free` tinyint(1) DEFAULT '0', KEY `idx_event_mid` (`member_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 For #2, the order by is just to have the data pre-sorted in the table.
[29 Sep 2010 15:43]
Jamie Koc
Here is the query that uses up 300G of space to execute SELECT customer_id,record_type_id,member_id,cre_date INTO OUTFILE 'test' FROM event_2010_09_28 WHERE customer_id IS NOT NULL AND customer_id NOT like '%-%' ORDER BY customer_id, member_id;
[8 Oct 2010 13:18]
Valeriy Kravchuk
Please, send the results of: explain SELECT customer_id,record_type_id,member_id,cre_date INTO OUTFILE 'test' FROM event_2010_09_28 WHERE customer_id IS NOT NULL AND customer_id NOT like '%-%' ORDER BY customer_id, member_id; show table status like 'event_2010_09_28'\G
[9 Oct 2010 5:56]
Jamie Koc
Explain: id: 1 select_type: SIMPLE table: event_2010_09_28 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 319451656 Extra: Using where; Using filesort Results of show table status: Name: event_2010_09_28 Engine: MyISAM Version: 10 Row_format: Dynamic Rows: 319451656 Avg_row_length: 120 Data_length: 38555182164 Max_data_length: 281474976710655 Index_length: 1815251968 Data_free: 0 Auto_increment: NULL Create_time: 2010-09-28 00:36:16 Update_time: 2010-10-01 23:38:40 Check_time: 2010-09-28 00:36:16 Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment:
[9 Oct 2010 7:23]
Valeriy Kravchuk
So, there is a filesort step and you select column defined as VARCHAR(255) among others. Full length is used in temporary files, so this gives us: mysql> select 255*319451656; +---------------+ | 255*319451656 | +---------------+ | 81460172280 | +---------------+ 1 row in set (0.00 sec) 81G at least of temporary files, even without taking other columns into account. Please, send the results of: show variables like 'char%'; from your environment.
[10 Oct 2010 6:58]
Jamie Koc
mysql> show variables like 'char%'; +--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | +--------------------------+----------------------------+
[11 Oct 2010 15:59]
Jamie Koc
Does mysql create the tmp file using 255 characters for each record even though it is a varchar? The maximum length of our customer_id varchar(255) field is 64. The total size at most would be 319451656*64 = 20444905984 or 20G or is it always going to be 255*319451656 (81G)?
[15 Oct 2010 9:19]
Hartmut Holzgraefe
Exactly, filesort files, and temporary in-memory tables, allocated fixed space even for fields declared as VARCHAR or VARBINARY. For MEMORY tables this is documented on http://dev.mysql.com/doc/refman/5.1/en/memory-storage-engine.html "MEMORY tables use a fixed-length row-storage format. Variable-length types such as VARCHAR are stored using a fixed length." For filesort files i can't find a similar reference right now, but their behavior is similar. So if your customer_id VARCHAR actually only has values with up to 64 characters max. stored then you should consider reducing the VARCHAR size from 256 to the actually needed 64. Even more important though: try to avoid execution plans that require a file sort on a large data set, not only for disk space reasons but for general performance considerations. For the query given here an index on (customer_id, member_id) would avoid the file sort step all together as rows could already be fetched in the requested sorting order right away. This not only saves on temporary disk space and I/O, but also on CPU time and so should speed up that query with its large ordered result set quite a bit. Be warned though that adding an index on a large table can take some time, and will usually temporarily need up to more than twice as much disk space as the original table. The improvements from the additional index are usually making up for that one time investment pretty quick though.
[30 Apr 2011 0:27]
Mark Callaghan
If this is not a bug, then the behavior for on-disk temp tables and filesort files should be documented. Can you provide a reference to that?
[30 Apr 2011 8:50]
Sveta Smirnova
Mark, thank you for the feedback. You are right: there is no info about how information stored in temporary files, so moving bug to "Documentation".
[17 Dec 2015 23:44]
Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly. Information about storage format of temporary tables can be found here: http://dev.mysql.com/doc/refman/5.7/en/internal-temporary-tables.html (Note that MySQL is more efficient about storage space for temp tables now (as of 5.6) because dynamic-length row storage format is used rather than fixed-length row format.) Information about how MySQL writes information to files used for the filesort algorithm can be found here: http://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html Again, more recent versions of MySQL (as of 5.7) use more space-efficient storage in many cases. The increased efficiency with respect to disk space may alleviate the problem of running out of disk in some or many cases.