Bug #44723 Larger read_buffer_size values can cause performance decrease for INSERTs
Submitted: 7 May 2009 15:34 Modified: 18 Sep 2009 20:08
Reporter: Harrison Fisk Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S5 (Performance)
Version:5.0/5.1 OS:Linux
Assigned to: Anurag Shekhar CPU Architecture:Any
Tags: performance, read_buffer_size

[7 May 2009 15:34] Harrison Fisk
Description:
Sizes of read_buffer_size above 256k can cause negative performance decreases when an INSERT increases above 256k in size.

This seems to be due to the fact that MyISAM will initialize an IO_CACHE for writing with read_buffer_size for the bulk insert.  my_malloc is called with the MY_ZEROFILL flag which causes memset to be called on the size of the read_buffer_size.  This results in relatively large performance decreases.

I have done some profiling on Linux and Solaris as they both suffer from it.  I will attach profiles showing the issue.

How to repeat:
1.  Start MySQL with a read_buffer_size = 1M.
2.  Run the attached ddl script.
3.  Time and run the attached ins_9.sql script.
4.  Time and run the attached ins_11.sql script.
5.  Restart MySQL with read_buffer_size = 256k.
6.  Time and run the attached ins_9.sql script.
7.  Time and run the attached ins_11.sql script.

These two test cases insert the same amount of rows, one doing 9 rows per INSERT, and the second doing 11 rows per INSERT.   The 11 row test case exceeds 256k per insert, so it seems to trigger this behavior.

If you decrease read_buffer_size to 256k, and then execute the scripts, you will see almost identical performance.

For example:

bash-3.00$ ./mysql -e "SELECT @@read_buffer_size"
+--------------------+
| @@read_buffer_size |
+--------------------+
|            1048576 |
+--------------------+
bash-3.00$ time ./mysql -u root test < ~/ins_9.sql

real    0m10.230s
user    0m1.032s
sys     0m0.191s
bash-3.00$ time ./mysql -u root test < ~/ins_11.sql

real    0m18.135s
user    0m1.020s
sys     0m0.179s

## restart with a smaller size

bash-3.00$ ./mysql -e "SELECT @@read_buffer_size"               
+--------------------+
| @@read_buffer_size |
+--------------------+
|             262144 | 
+--------------------+
bash-3.00$ time ./mysql -u root test < ~/ins_9.sql 

real    0m10.241s
user    0m1.032s
sys     0m0.192s
bash-3.00$ time ./mysql -u root test < ~/ins_11.sql 

real    0m10.059s
user    0m1.011s
sys     0m0.170s

Suggested fix:
Either do not use the IO_CACHE in cases with small amount of rows, or do not use the MY_ZEROFILL flag.
[7 May 2009 16:17] Harrison Fisk
I have uploaded the full test case to:

ftp://ftp.mysql.com/pub/mysql/download/bug_44723.tar.gz

It is ~8M, I will work on creating a stored procedure which can reproduce and will be smaller.
[7 May 2009 16:27] Harrison Fisk
Profiling information on Solaris:

11 row insert (the slow one):
                                         Inclusive        Exclusive
Function                              nsamples  pct   nsamples  pct
_thr_setup                               5884 91.15%        0  0.00%
_lwp_start                               5884 91.15%        0  0.00%
handle_one_connection                    5822 90.19%        7  0.11%
dispatch_command(enum_server_command,     5749 89.06%        8  0.12%
mysql_parse(THD*,                        5693 88.20%        4  0.06%
mysql_execute_command(THD*)              3608 55.89%        9  0.14%
mysql_insert(THD*,                       3584 55.52%       27  0.42%
mi_extra                                 2279 35.31%        4  0.06%
ha_myisam::start_bulk_insert(unsigned     2269 35.15%        3  0.05%
init_io_cache                            2260 35.01%        1  0.02%
memset                                   2253 34.90%     2253 34.90%
MYSQLparse(void*)                        1856 28.75%     1075 16.65%
fill_record_n_invoke_before_triggers(THD*,      625  9.68%        1  0.02%
(kernelmode)                              571  8.85%      571  8.85%
fill_record(THD*,                         536  8.30%      255  3.95%
MYSQLlex(void*,                           403  6.24%      299  4.63%
write_record(THD*,                        396  6.13%        3  0.05%
mi_write                                  383  5.93%       17  0.2

For comparison, here is the fast one:

9 row inserts:
                                         Inclusive        Exclusive
Function                              nsamples  pct   nsamples  pct
_thr_setup                               2771 90.94%        0  0.00%
_lwp_start                               2771 90.94%        0  0.00%
handle_one_connection                    2755 90.42%        1  0.03%
dispatch_command(enum_server_command,     2728 89.53%        3  0.10%
mysql_parse(THD*,                        2711 88.97%        1  0.03%
MYSQLparse(void*)                        1883 61.80%     1092 35.84%
mysql_execute_command(THD*)               702 23.04%        1  0.03%
mysql_insert(THD*,                        696 22.84%       11  0.36%
MYSQLlex(void*,                           402 13.19%      289  9.48%
write_record(THD*,                        306 10.04%        2  0.07%
mi_write                                  299  9.81%        5  0.16%
(kernelmode)                              276  9.06%      276  9.06%
_mi_ck_write_btree                        259  8.50%        2  0.07%
_mi_ck_real_write_btree                   257  8.43%        1  0.03%
fill_record_n_invoke_before_triggers(THD*,      256  8.40%        0  0.00%
w_search                                  252  8.27%        5  0.16%
fill_record(THD*,                         238  7.81%       28  0.92%
[7 May 2009 16:31] Harrison Fisk
Under Linux using oprofile, it seems similar.  Linux switches to using mmap'd memory for allocations over 256k which has the same code path as tlb management, so the profile is a bit odd looking, but we can see memset up near there.

11 row insert:

samples  %        image name               app name                 symbol name
66719    10.8742  vmlinux-debug-2.6.24-23-generic vmlinux-debug-2.6.24-23-generic get_page_from_freelist
63630    10.3707  vmlinux-debug-2.6.24-23-generic vmlinux-debug-2.6.24-23-generic page_fault
63436    10.3391  mysqld                   mysqld                   MYSQLparse(void*)
43761     7.1324  libc-2.7.so              libc-2.7.so              memset
43061     7.0183  vmlinux-debug-2.6.24-23-generic vmlinux-debug-2.6.24-23-generic native_flush_tlb_single
19125     3.1171  mysql                    mysql                    read_and_execute(bool)
13404     2.1846  vmlinux-debug-2.6.24-23-generic vmlinux-debug-2.6.24-23-generic handle_mm_fault
11268     1.8365  mysqld                   mysqld                   MYSQLlex(void*, void*)
8524      1.3893  libc-2.7.so              libc-2.7.so              memcpy
6772      1.1037  vmlinux-debug-2.6.24-23-generic vmlinux-debug-2.6.24-23-generic release_pages
6288      1.0248  mysql                    mysql                    my_strnncoll_simple

9 row insert:

samples  %        image name               app name                 symbol name
62788    19.1683  mysqld                   mysqld                   MYSQLparse(void*)
18592     5.6759  mysql                    mysql                    read_and_execute(bool)
13412     4.0945  reiserfs                 reiserfs                 (no symbols)
11318     3.4552  mysqld                   mysqld                   MYSQLlex(void*, void*)
7502      2.2903  libc-2.7.so              libc-2.7.so              memcpy
6197      1.8919  mysql                    mysql                    my_strnncoll_simple
5790      1.7676  mysqld                   mysqld                   bmove512
4271      1.3039  mysqld                   mysqld                   Item::Item()
4015      1.2257  libpthread-2.7.so        libpthread-2.7.so        pthread_getspecific
3694      1.1277  mysqld                   mysqld                   decimal2bin
3646      1.1131  mysqld                   mysqld                   ha_key_cmp
[7 May 2009 17:54] Harrison Fisk
I have investigated this a bit further and it could be a bigger issue than I thought originally (just performance).  

I would suggest this needs to be re-triaged, as the exact implications of this need to be determined, as wrong flags are being passed around, which might cause MY_WAIT_IF_FULL to not work correctly.

What is happening is that mi_extra is being called with HA_EXTRA_WRITE_CACHE set.  mi_extra than passes in share->write_flag & MY_WAIT_IF_FULL to the init_io_cache.

init_io_cache then takes cache_myflags and passes it into my_malloc as the flags to use.  my_malloc however uses a different set of flags.  MY_WAIT_IF_FULL is not a my_malloc flag, but it has the same define as MY_ZEROFILL (both are 32).  So my_malloc is then performing a zerofill operation which is causing the performance issue.  

The additional concern is that something is incorrect with MY_WAIT_IF_FULL and it might not be working as it should be.
[7 May 2009 17:59] MySQL Verification Team
On further analysis, it is revealed that this same bug is troubling our binary logging problems with full disk !!!

Instead of passing WAIT_IF_FULL to the my_write(), the only consequence of passng this flag to init_io_cache() is passing MY_ZEROFILL to my_malloc().

Which is why so many of our customers experience problems on full partitions.
[7 May 2009 20:52] Harrison Fisk
Shorter test case using stored procedures

Attachment: bug_44723.sql (application/octet-stream, text), 3.04 KiB.

[7 May 2009 20:56] Harrison Fisk
I have added a new smaller test case.  It uses a stored procedure to insert the rows.  To use it start MySQL with read_buffer_size = 1M and then do the following:

mysql> source ~/bug_44723.sql
Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> call insert_to_t1(100000,9);
Query OK, 9 rows affected (10.30 sec)

mysql> call insert_to_t1(100000,11);
Query OK, 11 rows affected (18.28 sec)

I believe I am wrong regarding the original report that the cut off of 256k for the insert size, as the inserts are dramatically smaller than that.  I'm not 100% sure why it triggers as it does for 9 vs. 11 row inserts, but it does!
[8 May 2009 16:08] MySQL Verification Team
It turns out that MY_WAIT_IF_FULL is passed correctly, after all. But, performance issue remains. Hence, I made a patch. I tested it on speed, and speed is the same regardless of the size of read_buffer_size. Actually larger read_buffer_size leads to somewhat better performance, which should be expected. I tested a patch on all our tests and all of those passed.

This is a patch:

=== modified file 'mysys/mf_iocache.c'
--- mysys/mf_iocache.c	2007-03-28 17:46:42 +0000
+++ mysys/mf_iocache.c	2009-05-08 14:40:26 +0000
@@ -235,8 +235,8 @@
 	buffer_block *= 2;
       if ((info->buffer=
 	   (byte*) my_malloc(buffer_block,
-			     MYF((cache_myflags & ~ MY_WME) |
-				 (cachesize == min_cache ? MY_WME : 0)))) != 0)
+			     MYF(((cache_myflags & ~(MY_WME | MY_WAIT_IF_FULL)) |
+				  (cachesize == min_cache ? MY_WME : 0))))) != 0)
       {
 	info->write_buffer=info->buffer;
 	if (type == SEQ_READ_APPEND)
[7 Aug 2009 12:38] Mattias Jonsson
Why it is triggered by 11 and not 9 is probably because of this in ha_myisam.cc:
  /* don't enable row cache if too few rows */
  if (! rows || (rows > MI_MIN_ROWS_TO_USE_WRITE_CACHE))
    mi_extra(file, HA_EXTRA_WRITE_CACHE, (void*) &size);

myisamdef.h:#define MI_MIN_ROWS_TO_USE_WRITE_CACHE 10
[13 Aug 2009 19:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80776

2788 Anurag Shekhar	2009-08-14
      Bug #44723 Larger read_buffer_size values can cause performance 
                 decrease for INSERTs
      
      Large number of calls to 'memset' to initialize the bugger to mulls 
      where causing the performance drop.
      my_malloc routine initializes the buffer if MY_ZEROFILL flag is set. 
      In this use case MY_ZEROFILL is set unintentionally due to presence
      MY_WAIT_IF_FULL which is not meant for my_malloc but has same value 
      as MY_ZEROFILL. Unsetting this flag before calling my_malloc solves
      the problem.
      
      The performance improvement from this patch will be visible only on 
      release build as the debug build uses a safemalloc routine which 
      initializes the buffer with ALLOC_VAL (0xA5) if MY_ZEROFILL is not set.
     @ mysys/mf_iocache.c
        Remove MY_WAIT_IF_FULL from myflags before calling my_malloc
        to prevent conflict with MY_ZEROFILL.
[21 Aug 2009 9:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81257

2792 Anurag Shekhar	2009-08-21
      Bug #44723 Larger read_buffer_size values can cause performance 
                 decrease for INSERTs
      
      
      Bulk inserts (multiple row, CREATE ... SELECT, INSERT ... SELECT) into
      MyISAM tables were performed inefficiently. This was mainly affecting
      use cases where read_buffer_size was considerably large (>256K) and low
      number of rows was inserted (e.g. 30-100).
      
      The problem was that during I/O cache initialization (this happens
      before each bulk insert) allocated I/O buffer was unnecessarily
      initialized to '\0'.
      
      This was happening because of mess in flag values. MyISAM informs I/O
      cache to wait for free space (if out of disk space) by passing
      MY_WAIT_IF_FULL flag. Since MY_WAIT_IF_FULL and MY_ZEROFILL have the
      same values, memory allocator was initializing memory to '\0'.
      
      The performance gain provided with this patch may only be visible with
      non-debug binaries, since safemalloc always initializes allocated memory
      to 0xA5A5...
     @ mysys/mf_iocache.c
        Remove MY_WAIT_IF_FULL from myflags before calling my_malloc
        to prevent conflict with MY_ZEROFILL.
[24 Aug 2009 7:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81377

2792 Anurag Shekhar	2009-08-24
      Bug #44723 Larger read_buffer_size values can cause performance 
               decrease for INSERTs
      
      
      Bulk inserts (multiple row, CREATE ... SELECT, INSERT ... SELECT) into
      MyISAM tables were performed inefficiently. This was mainly affecting
      use cases where read_buffer_size was considerably large (>256K) and low
      number of rows was inserted (e.g. 30-100).
      
      The problem was that during I/O cache initialization (this happens
      before each bulk insert) allocated I/O buffer was unnecessarily
      initialized to '\0'.
      
      This was happening because of mess in flag values. MyISAM informs I/O
      cache to wait for free space (if out of disk space) by passing
      MY_WAIT_IF_FULL flag. Since MY_WAIT_IF_FULL and MY_ZEROFILL have the
      same values, memory allocator was initializing memory to '\0'.
      
      The performance gain provided with this patch may only be visible with
      non-debug binaries, since safemalloc always initializes allocated memory
      to 0xA5A5...
     @ mysys/mf_iocache.c
        Remove MY_WAIT_IF_FULL from myflags before calling my_malloc
        to prevent conflict with MY_ZEROFILL.
[2 Sep 2009 10:25] Bugs System
Pushed into 5.0.86 (revid:joro@sun.com-20090902102337-n5rw8227wwp5cpx8) (version source revid:azundris@mysql.com-20090831195601-n6vjz48chxgoz9qr) (merge vers: 5.0.86) (pib:11)
[2 Sep 2009 16:41] Bugs System
Pushed into 5.1.39 (revid:joro@sun.com-20090902154533-8actmfcsjfqovgsb) (version source revid:azundris@mysql.com-20090831195422-o13kn73s2hsnk0z0) (merge vers: 5.1.39) (pib:11)
[14 Sep 2009 16:02] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (version source revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (merge vers: 5.4.4-alpha) (pib:11)
[16 Sep 2009 1:27] Paul DuBois
Noted in 5.0.86, 5.1.39, 5.4.4 changelogs.

Bulk insert performance could suffer with large read_buffer_size
values.
[18 Sep 2009 19:26] James Day
Paul, please add the performance tag and change the description to:

"For MyISAM tables with bulk_insert_buffer_size over 256k the performance of bulk insert operations like multi-row insert and insert ... select has been improved greatly when up to a hundred rows are inserted at the same time."

The purpose is to give a quick indication of the storage engine involved, the rough magnitude of the improvement and the cases that have been improved.
[18 Sep 2009 20:08] Paul DuBois
Done. Thanks for the improved wording.
[1 Oct 2009 5:58] Bugs System
Pushed into 5.1.39-ndb-6.3.28 (revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (version source revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (merge vers: 5.1.39-ndb-6.3.28) (pib:11)
[1 Oct 2009 7:25] Bugs System
Pushed into 5.1.39-ndb-7.0.9 (revid:jonas@mysql.com-20091001072547-kv17uu06hfjhgjay) (version source revid:jonas@mysql.com-20091001071652-irejtnumzbpsbgk2) (merge vers: 5.1.39-ndb-7.0.9) (pib:11)
[1 Oct 2009 13:25] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (version source revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (merge vers: 5.1.39-ndb-7.1.0) (pib:11)
[5 Oct 2009 10:50] Bugs System
Pushed into 5.1.39-ndb-6.2.19 (revid:jonas@mysql.com-20091005103850-dwij2dojwpvf5hi6) (version source revid:jonas@mysql.com-20090930185117-bhud4ek1y0hsj1nv) (merge vers: 5.1.39-ndb-6.2.19) (pib:11)
[7 Oct 2009 1:21] Paul DuBois
The 5.4 fix has been pushed into 5.4.3.
[1 Apr 2018 11:47] haochen he
mysql> select @@read_buffer_size;
+--------------------+
| @@read_buffer_size |
+--------------------+
|            1048576 | 
+--------------------+
1 row in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.0.85    | 
+-----------+
1 row in set (0.00 sec)

mysql> delete from t1;
Query OK, 100001 rows affected (0.00 sec)

mysql> call insert_to_t1(100000,9);
Query OK, 9 rows affected (2.05 sec)

mysql> call insert_to_t1(100000,11);
Query OK, 11 rows affected (1.98 sec)

Sorry, but i fail to reproduce in ubuntu-16.04LTS. i just want to study on it, thanks!