Bug #62373 multi-step unlink file to reduce IO load when deleting big innodb data file
Submitted: 7 Sep 2011 12:27 Modified: 7 Sep 2011 12:36
Reporter: xiaobin lin (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB Plugin storage engine Severity:S4 (Feature request)
Version:5.1 OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: drop table, innodb_file_per_table, truncate, unlink

[7 Sep 2011 12:27] xiaobin lin
Description:
  In the procedure of "truncate table" or "drop table", InnoDB use "unlink(xx.idb)" directly.
  If the table is huge (such as >200G), during the peroid of deleting the data file, the IO is run out, and the OLTP service is affected. 

   So if MySQL can provide parameters for users to control the IO consumption, it can reduce the  risk when drop huge table. It will make the operation works longer, but safer.
 
   Followed is the patch for 5.1.48. Two new varibles are provided to control.

How to repeat:
truncate or drop  huge table

Suggested fix:
diff -Nur mysql-5.1.48.raw//storage/innodb_plugin/handler/ha_innodb.cc mysql-5.1.48/storage/innodb_plugin/handler/ha_innodb.cc
--- mysql-5.1.48.raw//storage/innodb_plugin/handler/ha_innodb.cc	2010-06-03 23:50:27.000000000 +0800
+++ mysql-5.1.48/storage/innodb_plugin/handler/ha_innodb.cc	2011-09-07 19:21:30.742212001 +0800
@@ -10886,6 +10886,17 @@
   "trigger a readahead.",
   NULL, NULL, 56, 0, 64, 0);
 
+static MYSQL_SYSVAR_ULONG(step_delete_interval, srv_step_delete_interval,
+  PLUGIN_VAR_RQCMDARG,
+  "Number of microseconds to sleep between every step of deleting big file."
+  "If set zero, unlink file directly",
+  NULL, NULL, 0, 0, 1000000, 0);
+
+static MYSQL_SYSVAR_ULONG(step_delete_size, srv_step_delete_size,
+  PLUGIN_VAR_RQCMDARG,
+  "Number of MB to truncate when delete big file by step ",
+  NULL, NULL, 500, 0, 100000, 0);
+
 static struct st_mysql_sys_var* innobase_system_variables[]= {
   MYSQL_SYSVAR(additional_mem_pool_size),
   MYSQL_SYSVAR(autoextend_increment),
@@ -10941,6 +10952,8 @@
   MYSQL_SYSVAR(use_sys_malloc),
   MYSQL_SYSVAR(change_buffering),
   MYSQL_SYSVAR(read_ahead_threshold),
+  MYSQL_SYSVAR(step_delete_interval),
+  MYSQL_SYSVAR(step_delete_size),
   MYSQL_SYSVAR(io_capacity),
   NULL
 };
diff -Nur mysql-5.1.48.raw//storage/innodb_plugin/include/srv0srv.h mysql-5.1.48/storage/innodb_plugin/include/srv0srv.h
--- mysql-5.1.48.raw//storage/innodb_plugin/include/srv0srv.h	2010-06-03 23:49:59.000000000 +0800
+++ mysql-5.1.48/storage/innodb_plugin/include/srv0srv.h	2011-09-07 19:18:32.646218912 +0800
@@ -144,6 +144,8 @@
 
 extern ulint	srv_n_file_io_threads;
 extern ulong	srv_read_ahead_threshold;
+extern ulong	srv_step_delete_interval;
+extern ulong	srv_step_delete_size;
 extern ulint	srv_n_read_io_threads;
 extern ulint	srv_n_write_io_threads;
 
diff -Nur mysql-5.1.48.raw//storage/innodb_plugin/os/os0file.c mysql-5.1.48/storage/innodb_plugin/os/os0file.c
--- mysql-5.1.48.raw//storage/innodb_plugin/os/os0file.c	2010-06-03 23:49:59.000000000 +0800
+++ mysql-5.1.48/storage/innodb_plugin/os/os0file.c	2011-09-07 19:17:33.186218537 +0800
@@ -1547,6 +1547,32 @@
 #endif
 }
 
+UNIV_INTERN
+int
+step_unlink(const char* name)
+{
+	int ret = 0;
+	FILE *file = fopen(name, "r+");
+	if (!file) return -1;
+	
+	ret = fseek(file, 0L, SEEK_END);
+	if (ret) return -1;
+	
+	while(fseek(file, -srv_step_delete_size*1048576, SEEK_CUR) != -1)
+	{
+		if(!os_file_set_eof(file))
+		{
+			fclose(name);
+			return -1;
+		}
+		os_thread_sleep(srv_step_delete_interval);
+	}
+
+	fclose(file);
+
+	return unlink(name);
+}
+
 /***********************************************************************//**
 Deletes a file. The file has to be closed before calling this.
 @return	TRUE if success */
@@ -1598,7 +1624,12 @@
 #else
 	int	ret;
 
-	ret = unlink(name);
+	if (srv_step_delete_interval == 0) {
+		ret = unlink(name);
+	}
+	else {
+		ret = step_unlink(name);
+	}
 
 	if (ret != 0) {
 		os_file_handle_error_no_exit(name, "delete");
diff -Nur mysql-5.1.48.raw//storage/innodb_plugin/srv/srv0srv.c mysql-5.1.48/storage/innodb_plugin/srv/srv0srv.c
--- mysql-5.1.48.raw//storage/innodb_plugin/srv/srv0srv.c	2010-06-03 23:50:08.000000000 +0800
+++ mysql-5.1.48/storage/innodb_plugin/srv/srv0srv.c	2011-09-07 19:20:29.202207395 +0800
@@ -208,6 +208,9 @@
 readahead request. */
 UNIV_INTERN ulong	srv_read_ahead_threshold	= 56;
 
+UNIV_INTERN ulong	srv_step_delete_interval	= 0;
+UNIV_INTERN ulong	srv_step_delete_size		= 500;
+
 #ifdef UNIV_LOG_ARCHIVE
 UNIV_INTERN ibool		srv_log_archive_on	= FALSE;
 UNIV_INTERN ibool		srv_archive_recovery	= 0;
[7 Sep 2011 12:36] Marko Mäkelä
Thank you for the patch! It never occurred to me that this could be a problem.

I reviewed the function fsp_try_extend_data_file() and have a possible explanation of the unlink() slowness. InnoDB appears to extend *.ibd files 4 megabytes at a time (or 1 megabyte at a time up to 32 megabytes). As the file gets extended in such small steps, it could easily become fragmented.