Description:
Scenario:
- InnoDB is running with the --innodb-file-per-table option
- We have a large number of empty tables with the following structure: create table t1(k int not null primary key, val blob)
- Each blob is 1 KB long (anything comparable to the page size will show the problem; around 8 KB is probably best).
- The OS fsync() call has a 10 ms overhead regardless of the amount of changed data (I tested on CentOS 5.5 with ext3 over RAID 1+0 on an HP P410i disk array).
- Each table is populated with a small number of rows. The more tables and the fewer rows per table, the more it hurts, but it still hurt at 10 K rows in my tests, with a 50% slowdown compared to the patched version.
In this scenario, a GDB profile of the running mysqld process shows the performance bottleneck in fsync() calls made while the tablespace is being extended via fil_extend_space_to_desired_size().
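A back-of-envelope calculation shows why this hurts (assuming, as the profile suggests, roughly one fsync() per extension call): if the tablespace grows by only a page or a few pages per extension, and each extension costs 10 ms in fsync(), then growing a fresh .ibd file by 1 000 pages in single-page steps costs on the order of 10 seconds in fsync() alone, regardless of how fast the data itself is written. Padding each extension with, say, 128 extra pages cuts the number of extensions, and hence fsync() calls, by roughly that factor.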
How to repeat:
Set up the above scenario.
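A minimal sketch of the setup (the table definition is from the scenario above; the number of tables, rows per table, and blob size are illustrative and would be scripted in practice):

-- assumes the server was started with --innodb-file-per-table
CREATE TABLE t1 (k INT NOT NULL PRIMARY KEY, val BLOB);
-- ... create many such tables (t2, t3, ..., tN), then populate each
-- with a small number of ~1 KB blobs:
INSERT INTO t1 VALUES (1, REPEAT('x', 1024));
INSERT INTO t1 VALUES (2, REPEAT('x', 1024));
-- ... repeat per table; profile mysqld to confirm time spent in fsync()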
Suggested fix:
I do not know whether this is the best way to solve the problem, or whether this patch has any bad side effects, but I did get good results with it in testing, and nothing obviously misbehaved:
--- /home/sasha/mysql-5.5.8/storage/innobase//fsp/fsp0fsp.c 2010-12-03 09:58:26.000000000 -0800
+++ storage/innobase/fsp/fsp0fsp.c 2011-03-21 14:15:40.907502250 -0700
@@ -1179,7 +1179,7 @@
ut_a(page_no >= size);
success = fil_extend_space_to_desired_size(&actual_size, space,
- page_no + 1);
+ page_no + srv_tablespace_extra_extend + 1);
/* actual_size now has the space size in pages; it may be less than
we wanted if we ran out of disk space */
--- /home/sasha/mysql-5.5.8/storage/innobase//handler/ha_innodb.cc 2010-12-03 09:58:26.000000000 -0800
+++ storage/innobase/handler/ha_innodb.cc 2011-03-18 17:19:11.655353157 -0700
@@ -11193,6 +11201,13 @@
"Number of pages that must be accessed sequentially for InnoDB to "
"trigger a readahead.",
NULL, NULL, 56, 0, 64, 0);
+
+static MYSQL_SYSVAR_ULONG(tablespace_extra_extend,
+ srv_tablespace_extra_extend,
+ PLUGIN_VAR_RQCMDARG,
+ "Number of extra pages added to tablespace when it needs "
+ "to grow",
+ NULL, NULL, 0, 0, LONG_MAX, 0);
static struct st_mysql_sys_var* innobase_system_variables[]= {
MYSQL_SYSVAR(additional_mem_pool_size),
@@ -11255,6 +11278,7 @@
MYSQL_SYSVAR(change_buffering_debug),
#endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */
MYSQL_SYSVAR(read_ahead_threshold),
+ MYSQL_SYSVAR(tablespace_extra_extend),
MYSQL_SYSVAR(io_capacity),
MYSQL_SYSVAR(purge_threads),
MYSQL_SYSVAR(purge_batch_size),
--- /home/sasha/mysql-5.5.8/storage/innobase//include/srv0srv.h 2010-12-03 09:58:26.000000000 -0800
+++ storage/innobase/include/srv0srv.h 2011-03-18 17:14:09.587353429 -0700
@@ -164,6 +164,7 @@
extern ulong srv_read_ahead_threshold;
extern ulint srv_n_read_io_threads;
extern ulint srv_n_write_io_threads;
+extern ulint srv_tablespace_extra_extend;
/* Number of IO operations per second the server can do */
extern ulong srv_io_capacity;
--- /home/sasha/mysql-5.5.8/storage/innobase//srv/srv0srv.c 2010-12-03 09:58:26.000000000 -0800
+++ storage/innobase/srv/srv0srv.c 2011-03-21 09:59:55.574071815 -0700
@@ -231,6 +231,7 @@
in the buffer cache and accessed sequentially for InnoDB to trigger a
readahead request. */
UNIV_INTERN ulong srv_read_ahead_threshold = 56;
+ulint srv_tablespace_extra_extend = 0;
#ifdef UNIV_LOG_ARCHIVE
UNIV_INTERN ibool srv_log_archive_on = FALSE;
The patch introduces a new system variable, innodb-tablespace-extra-extend, which specifies how many extra pages to pad the tablespace with when it needs to be expanded via fsp_try_extend_data_file_with_pages(). When set to 0, the behavior is identical to the original code.
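With the patch applied, usage would look like starting the server with --innodb-tablespace-extra-extend=128 (the value used in the benchmark below). Since the sysvar is declared without PLUGIN_VAR_READONLY, setting it at runtime should also work, though that is an inference from the declaration rather than something tested here:

SET GLOBAL innodb_tablespace_extra_extend = 128;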
In my benchmarks, setting innodb-tablespace-extra-extend to 128 (versus 0, the original behavior) reduced the number of fsync() calls and improved performance by a factor of 2.5 for tables populated with 1 K rows, and by 50% for tables with 10 K rows.