Bug #52455 Subpar INSERT ON DUPLICATE KEY UPDATE performance with many partitions
Submitted: 30 Mar 2010 1:54 Modified: 14 Oct 2010 12:59
Reporter: Peter Zaitsev (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Partitions Severity:S3 (Non-critical)
Version:5.1 OS:Any
Assigned to: Mattias Jonsson CPU Architecture:Any
Tags: partitioning, performance, qc

[30 Mar 2010 1:54] Peter Zaitsev
Description:
There seems to be an issue running UPDATE part in INSERT ON DUPLICATE KEY UPDATE statement with large number of partitions. Consider empty table:

[root@centos admob]# time mysql test < test.sql

real    0m0.197s
user    0m0.019s
sys     0m0.004s
[root@centos admob]# time mysql test < test.sql

real    0m12.293s
user    0m0.031s
sys     0m0.005s

As you can see running SQL statement first time (inserting all rows) is fast, while running it second time (updating all rows) is about 50 times slower.

file attached.

How to repeat:
See attached files
[30 Mar 2010 1:55] Peter Zaitsev
table create

Attachment: create.sql (application/octet-stream, text), 59.84 KiB.

[30 Mar 2010 2:47] MySQL Verification Team
Thank you for the bug report.

Microsoft Windows [Version 6.0.6002]
Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

C:\DBS>51

C:\DBS>c:\dbs\5.1\bin\mysql -uroot --port=3306 --prompt="mysql 5.1 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.46-Win X64-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.1 >create database d6;
Query OK, 1 row affected (0.02 sec)

mysql 5.1 >use d6
Database changed
mysql 5.1 >source c:/bugs/create.sql
Query OK, 0 rows affected (49.29 sec)

mysql 5.1 >source c:/bugs/test.sql
Query OK, 36893 rows affected (1.00 sec)
Records: 36893  Duplicates: 0  Warnings: 0

mysql 5.1 >source c:/bugs/test.sql
Query OK, 73786 rows affected (33.43 sec)
Records: 36893  Duplicates: 36893  Warnings: 0

mysql 5.1 >
[31 Mar 2010 6:57] Yasufumi Kinoshita
I think prune_partitions() should be called from mysql_insert() for the each records.
(as mysql_update(), mysql_delete() and JOIN::optimize() do)

ha_innobase::change_active_index() is called for all partitions for every records.

It seems to be most wasteful for the performance.
[2 Apr 2010 16:56] Omer Barnir
Why is the comparison done between inserts (to an empty table) and updates?
Shouldn't the comparison be between an insert, update of a partitioned table and one that is not?
[2 Apr 2010 18:58] Peter Zaitsev
Omer,

Insert on duplicate key update has 2 execution paths one is Insert and other is Update.  The first execution of this statement every record is inserted,  the second execution all are duplicates and everything trigger updates.

I am not expecting the performance to be the same but speed of insert should be somethat similar to speed of update.   You can confirm by running this test with no partitions - first and repeated execution will be taking similar times.
[5 Apr 2010 14:28] MySQL Verification Team
Same test without partition:

C:\DBS>c:\dbs\5.1\bin\mysql -uroot --port=3306 --prompt="mysql 5.1 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.46-Win X64-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.1 >use d6
Database changed
mysql 5.1 >drop table manypart;
Query OK, 0 rows affected (47.81 sec)

mysql 5.1 >CREATE TABLE `manypart` (
    ->   `universal_date_key` int(4) unsigned NOT NULL DEFAULT '0',
    ->   `k` int(10) unsigned NOT NULL,
    ->   `pad` int(10) unsigned NOT NULL,
    ->   PRIMARY KEY (`universal_date_key`,`k`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 PACK_KEYS=1;
Query OK, 0 rows affected (0.15 sec)

mysql 5.1 >source c:/bugs/test.sql
Query OK, 36893 rows affected (0.51 sec)
Records: 36893  Duplicates: 0  Warnings: 0

mysql 5.1 >source c:/bugs/test.sql
Query OK, 73786 rows affected (0.97 sec)
Records: 36893  Duplicates: 36893  Warnings: 0

mysql 5.1 >
[12 Apr 2010 17:43] Mattias Jonsson
I was asked by two different attendants at the conference about this bug, so I assigned myself...
[12 Apr 2010 23:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/105447

3452 Mattias Jonsson	2010-04-12
      Bug#52455: Subpar INSERT ON DUPLICATE KEY UPDATE performance
                 with many partitions
      
      Problem was that index_init/index_end was called for all
      partitions for each and every row.
      
      Solution was to use the partition that was used for insert
      directly, instead of using the default implementation
     @ sql/ha_partition.cc
        Bug#52455: Subpar INSERT ON DUPLICATE KEY UPDATE performance
                   with many partitions
        
        Optimization for the case INSERT .. ON DUPLICATE KEY UPDATE,
        if not INSERT, fall back on default implementation.
     @ sql/ha_partition.h
        Bug#52455: Subpar INSERT ON DUPLICATE KEY UPDATE performance
                   with many partitions
        
        Added the class function index_read_idx_map to handle the
        special case for INSERT .. ON DUPLICATE KEY UPDATE.
[12 Apr 2010 23:24] Mattias Jonsson
Proposed a patch that solves this specific problem, mostly as a quick fix to verify the problem. Might rework the patch...
[22 Apr 2010 21:24] bhushan uparkar
Hello Mattias,

I was trying to apply the patch, but running into following problem, Could you please comment . Also one more question I dont see the the new function called from anywhere , except within itself. Is that correct ? 

{{{
libtool: link: g++ -O3 -DBIG_JOINS=1 -felide-constructors -fno-exceptions -fno-rtti -fno-implicit-templates -fno-exceptions -fno-rtti -rdynamic -o mysqld sql_lex.o sql_handler.o sql_partition.o item.o item_sum.o item_buff.o item_func.o item_cmpfunc.o item_strfunc.o item_timefunc.o thr_malloc.o item_create.o item_subselect.o item_row.o item_geofunc.o item_xmlfunc.o field.o strfunc.o key.o sql_class.o sql_list.o net_serv.o protocol.o sql_state.o lock.o my_lock.o sql_string.o sql_manager.o sql_map.o mysqld.o password.o hash_filo.o hostname.o sql_connect.o scheduler.o sql_parse.o set_var.o sql_yacc.o sql_base.o table.o sql_select.o sql_insert.o sql_profile.o sql_prepare.o sql_error.o sql_locale.o sql_update.o sql_delete.o uniques.o sql_do.o procedure.o sql_test.o log.o init.o derror.o sql_acl.o unireg.o des_key_file.o log_event.o rpl_record.o log_event_old.o rpl_record_old.o discover.o time.o opt_range.o opt_sum.o records.o filesort.o handler.o ha_partition.o debug_sync.o sql_db.o sql_table.o sql_rename.o sql_crypt.o sql_load.o mf_iocache.o field_conv.o sql_show.o sql_udf.o sql_analyse.o sql_cache.o slave.o sql_repl.o rpl_filter.o rpl_tblmap.o rpl_utility.o rpl_injector.o rpl_rli.o rpl_mi.o rpl_reporting.o sql_union.o sql_derived.o sql_client.o repl_failsafe.o sql_olap.o sql_view.o gstream.o spatial.o sql_help.o sql_cursor.o tztime.o my_decimal.o sp_head.o sp_pcontext.o sp_rcontext.o sp.o sp_cache.o parse_file.o sql_trigger.o event_scheduler.o event_data_objects.o event_queue.o event_db_repository.o events.o sql_plugin.o sql_binlog.o sql_builtin.o sql_tablespace.o partition_info.o sql_servers.o event_parse_data.o mini_client_errors.o pack.o client.o my_time.o my_user.o  ./.libs/libndb.a -lpthread -lpthread -lpthread -lpthread ../storage/archive/libarchive.a ../storage/blackhole/libblackhole.a ../storage/csv/libcsv.a ../storage/federated/libfederated.a ../storage/heap/libheap.a ../storage/innobase/libinnobase.a ../storage/myisam/libmyisam.a ../storage/myisammrg/libmyisammrg.a ../vio/libvio.a ../mysys/libmysys.a ../dbug/libdbug.a ../regex/libregex.a ../strings/libmystrings.a -lz -lwrap -ldl ../extra/yassl/src/.libs/libyassl.a -lpthread -lpthread -lpthread -lpthread ../extra/yassl/taocrypt/src/.libs/libtaocrypt.a -lpthread -lpthread -lpthread -lpthread -lpthread -lcrypt -lnsl -lm -lpthread
../extra/yassl/src/.libs/libyassl.a(ssl.o): In function `yaSSL::read_file(yaSSL::SSL_CTX*, char const*, int, yaSSL::CertType)':
ssl.cpp:(.text+0x20d2): warning: memset used with constant zero length parameter; this could be due to transposed parameters
ha_partition.o:(.data.rel.ro._ZTV12ha_partition[vtable for ha_partition]+0xc8): undefined reference to `ha_partition::index_read_idx_map(unsigned char*, unsigned int, unsigned char const*, unsigned long, ha_rkey_function)'
collect2: ld returned 1 exit status
make[4]: *** [mysqld] Error 1
make[4]: Leaving directory `/var/src/mysql5.1.44/mysql-dfsg-5.1-5.1.44/builddir/sql'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/var/src/mysql5.1.44/mysql-dfsg-5.1-5.1.44/builddir/sql'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/var/src/mysql5.1.44/mysql-dfsg-5.1-5.1.44/builddir/sql'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/src/mysql5.1.44/mysql-dfsg-5.1-5.1.44/builddir'
make: *** [build-stamp] Error 2
dpkg-buildpackage: failure: debian/rules build gave error exit status 2
debuild: fatal error at line 1319:
dpkg-buildpackage -rfakeroot -D -us -uc failed
bhushan@beach110.sc9.admob.com.:mysql-dfsg-5.1-5.1.44$ 
Display all 2191 possibilities? (y or n)
bhushan@beach110.sc9.admob.com.:mysql-dfsg-5.1-5.1.44$ g++ --version
g++-4.3.real (Debian 4.3.2-1.1) 4.3.2
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

}}}
[29 Apr 2010 11:56] Mattias Jonsson
also related to bug#52517
[8 Jul 2010 14:27] Mattias Jonsson
After discussing with Mikael, our conclusion is that ha_partition::index_read_idx_map should be implemented by using get_partition_set and loop through the matching partitions (aka dynamic partition pruning).
[8 Jul 2010 23:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/113190

3470 Mattias Jonsson	2010-07-09
      Bug#52455: Subpar INSERT ON DUPLICATE KEY UPDATE performance with many partitions
      
      The handler function for reading one row from a specific index
      was not optimized in the partitioning handler since it
      used the default implementation.
      
      No test case since it is performance only, verified by hand.
     @ sql/ha_partition.cc
        Implemented a optimized version of index_read_idx_map
        for the case when find flag == HA_READ_KEY_EXACT,
        which is the common case.
     @ sql/ha_partition.h
        Declared ha_partition::index_read_idx_map
[8 Jul 2010 23:18] Mattias Jonsson
simple benchmarking of test operations.

              Fixed            Broken
              parts  no parts  parts  no parts
first insert   2.33      1.16   2.55      1.20 (no updates)
Second insert  2.81      2.44  *60.6*     2.49 (all updates)
update WHERE   2.90      1.32   2.88      1.33 (same parse, all updates)
update         2.45      0.95   2.47      0.95 (no parse, all updates)
insert select  4.53      1.13   4.77      1.14 (no updates)
insert select  4.23      2.45  *58.4*     2.52 (all updates)
insert select  5.64      2.38  *61.1*     2.35 (same table, all updates)
first replace  2.77      1.39   1.82      1.36 (no replace)
second replace 3.11      2.32  *61.6*     2.41 (all replace)
replace select 2.66      1.31   2.82      1.25 (no replace)
replace select 4.09      3.84  *57.7*     4.07 (all replace)
replace select 4.15      2.43  *63.0*     2.42 (same table, all replace)
[9 Jul 2010 13:16] Mattias Jonsson
pushed to mysql-5.1-bugteam and mysql-trunk-merge
[23 Jul 2010 12:28] Bugs System
Pushed into mysql-trunk 5.5.6-m3 (revid:alik@sun.com-20100723121820-jryu2fuw3pc53q9w) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (merge vers: 5.5.5-m3) (pib:18)
[23 Jul 2010 12:35] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100723121929-90e9zemk3jkr2ocy) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (pib:18)
[28 Jul 2010 14:11] Jon Stephens
Documented bugfix in the 5.1.50 and 5.5.6 changelogs as follows:

        INSERT ON DUPLICATE KEY UPDATE statements performed poorly on
        tables having many partitions. This was because the handler
        function for reading a row from a specific index was not
        optimized in the partitioning handler.

Closed.
[19 Aug 2010 15:42] Bugs System
Pushed into mysql-5.1 5.1.51 (revid:build@mysql.com-20100819151858-muaaor6jojb5ouzj) (version source revid:build@mysql.com-20100819151858-muaaor6jojb5ouzj) (merge vers: 5.1.51) (pib:20)
[14 Oct 2010 8:28] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.20 (revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (version source revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (merge vers: 5.1.51-ndb-7.0.20) (pib:21)
[14 Oct 2010 8:43] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.39 (revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (version source revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (merge vers: 5.1.51-ndb-6.3.39) (pib:21)
[14 Oct 2010 8:58] Bugs System
Pushed into mysql-5.1-telco-6.2 5.1.51-ndb-6.2.19 (revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (version source revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (merge vers: 5.1.51-ndb-6.2.19) (pib:21)
[14 Oct 2010 12:59] Jon Stephens
Already documented in 5.1.50; no new changelog entries required. Setting back to Closed.