Bug #73648 | innodb table replication is very slow with some of the partioned table | ||
---|---|---|---|
Submitted: | 19 Aug 2014 22:10 | Modified: | 9 Feb 2018 9:05 |
Reporter: | chandra nanjegowda | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S2 (Serious) |
Version: | Server version: 5.6.16-log MySQL Communi, 5.6.39 | OS: | Linux |
Assigned to: | CPU Architecture: | Any | |
Tags: | replication, slave, slow |
[19 Aug 2014 22:10]
chandra nanjegowda
[18 Feb 2015 1:17]
Mike Griffin
I am able to reproduce this in MySQL 5.6 using a pretty large table with many partitions. I will attach a self documenting tar file with a test case. The instructions are in the file step00_instructions.txt within the tar You can see my results in zz_results.txt, which I will also paste here: # cat zz_results.txt System where all six instances where running: [root@mg8 ~]# grep -m1 MHz /proc/cpuinfo cpu MHz : 2600.030 [root@mg8 ~]# grep -c MHz /proc/cpuinfo 8 [root@mg8 ~]# free -m total used free shared buffers cached Mem: 8033 7988 45 0 19 2317 -/+ buffers/cache: 5652 2381 Swap: 0 0 0 Relvant settings in my.cnf: innodb-buffer-pool-size = 768M innodb-log-file-size = 256M My results: +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.1.73-log | 00:00:14 | 370 | +-------+------------+----------+-------------------+ +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.5.41-log | 00:00:16 | 370 | +-------+------------+----------+-------------------+ +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.6.22-log | 00:05:55 | 370 | +-------+------------+----------+-------------------+ +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.6.22-log | 00:00:16 | 1 | +-------+------------+----------+-------------------+
[18 Feb 2015 1:22]
Mike Griffin
My test file is now available in your SFTP server @ /support/incoming/mysql-bug-data-73648.tar
[18 Feb 2015 19:58]
Mike Griffin
I wanted to mention that adding additional empty partitions results in 5.6 taking even longer for the same UPDATE to replicate, which perhaps gives a clue about the performance problem to the developers. It is also worth noting that this may be fixed in 5.7 with InnoDB Native Partitioning[0]: +-------+------------------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------------------+----------+-------------------+ | slave | 5.7.5-labs-preview-log | 00:00:44 | 370 | +-------+------------------------+----------+-------------------+ The 5.7 release side by side with the 5.6 release for a single partition: mysql[sandbox]> select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+------------+------------+----------+-------------------+ | slave | ver | masterexec | lag | num_of_partitions | +-------+------------+------------+----------+-------------------+ | slave | 5.6.22-log | 00:00:18 | 00:00:16 | 1 | +-------+------------+------------+----------+-------------------+ 1 row in set (0.00 sec) mysql[sandbox]> select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+------------------------+------------+----------+-------------------+ | slave | ver | masterexec | lag | num_of_partitions | +-------+------------------------+------------+----------+-------------------+ | slave | 5.7.5-labs-preview-log | 00:00:53 | 00:00:55 | 1 | +-------+------------------------+------------+----------+-------------------+ 1 row in set (0.00 sec) *[0] http://mysqlserverteam.com/innodb-native-partitioning-early-access/
[8 Apr 2015 1:18]
Mike Griffin
This performance regression for partitioned tables is still present in the latest release: Server version: 5.6.24-log MySQL Community Server (GPL) Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql[sandbox]> select * from hb; +----+---------------------+---------------------+---------------------+ | id | master_start | master_end | slave_end | +----+---------------------+---------------------+---------------------+ | 1 | 2015-04-08 00:59:20 | 2015-04-08 00:59:43 | 2015-04-08 01:05:38 | +----+---------------------+---------------------+---------------------+ 1 row in set (0.00 sec) Removing partitioning, downgrading to 5.5, or converting the table to MyISAM solves the problem.
[6 Feb 2018 21:09]
Arnaud Adant
Yes it looks that this bug was never verified not addressed in 5.6. In 5.7. it is fixed with native partitioning. Here is a test case : --source include/have_binlog_format_row.inc --source include/have_partition.inc --source include/master-slave.inc --source include/have_innodb.inc CREATE TABLE t ( id bigint(20) unsigned NOT NULL auto_increment, d date NOT NULL, ts datetime(6) NOT NULL, PRIMARY KEY (id,d) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 /*!50500 PARTITION BY HASH(id) partitions 200*/; insert into t(d,ts) values(current_date(),now()); insert into t(d,ts) values(current_date(),now()); insert into t(d,ts) values(current_date(),now()); insert into t(d,ts) values(current_date(),now()); replace into t(d,ts) select t1.d,t1.ts from t t1, t t2, t t3, t t4, t t5, t t6, t t7, t t8, t t9, t t10; commit; update t set ts = now(); commit; drop table t; --sync_slave_with_master stop slave;
[6 Feb 2018 21:11]
Arnaud Adant
A workaround is to set binlog_format=statement before executing the update.
[7 Feb 2018 1:12]
Mike Griffin
Results are quite a bit better in 5.7 than was seen in 5.6, using my previously supplied test case: mysql> select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+--------+------------+----------+-------------------+ | slave | ver | masterexec | lag | num_of_partitions | +-------+--------+------------+----------+-------------------+ | slave | 5.7.21 | 00:00:24 | 00:00:52 | 370 | +-------+--------+------------+----------+-------------------+ 1 row in set (0.03 sec) mysql> select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+--------+------------+----------+-------------------+ | slave | ver | masterexec | lag | num_of_partitions | +-------+--------+------------+----------+-------------------+ | slave | 5.7.21 | 00:00:18 | 00:00:21 | 1 | +-------+--------+------------+----------+-------------------+ 1 row in set (0.00 sec) mysql> select 'master', @@version ver, timediff(master_end,master_start) lag , (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +--------+------------+----------+-------------------+ | master | ver | lag | num_of_partitions | +--------+------------+----------+-------------------+ | master | 5.7.21-log | 00:00:24 | 370 | +--------+------------+----------+-------------------+ 1 row in set (0.04 sec) mysql> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+--------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+--------+----------+-------------------+ | slave | 5.7.21 | 00:00:52 | 370 | +-------+--------+----------+-------------------+ 1 row in set (0.00 sec) mysql> use test_73648 Database changed mysql> select * from hb; +----+---------------------+---------------------+---------------------+ | id | master_start | master_end | slave_end | +----+---------------------+---------------------+---------------------+ | 1 | 2018-02-07 01:00:00 | 2018-02-07 01:00:24 | 2018-02-07 01:01:16 | +----+---------------------+---------------------+---------------------+ 1 row in set (0.01 sec) So with partitions is still about 2x slower in 5.7. In 5.6 it was about 24x slower with partitions. In 5.1/5.5, partitioning did not seem to affect replication performance Note that the config/hardware of the 5.7 test may be slightly different than the previous samples, but the masterexec vs lag is what would matter here
[7 Feb 2018 1:21]
Mike Griffin
Displayed more clearly: mysql> select timediff(master_end,master_start) as `master`, timediff(slave_end,master_end) as `slave` from hb; +----------+----------+ | master | slave | +----------+----------+ | 00:00:19 | 00:00:46 | +----------+----------+ 1 row in set (0.01 sec)
[7 Feb 2018 10:12]
Arnaud Adant
>In 5.1/5.5, partitioning did not seem to affect replication performance Are you sure you were using binlog_format=row in 5.1 and 5.5 ? Using MTR and the test I pasted previously, I get this : 5.5.59 : bug.test [ pass ] 33140 bug.test_no_partition [ pass ] 15328 5.6.39 bug.test [ pass ] 158594 <== bug bug.test_no_partition [ pass ] 20350 5.7.21 bug.test [ pass ] 37490 bug.test_no_partition [ pass ] 26441 8.0.4 rc bug.test [ pass ] 48723 bug.test_no_partition [ pass ] 40933 There is always an overhead with partitioning, especially large number of partitions. Also there is a decrease in performance for single thread over time. http://smalldatum.blogspot.co.uk/2017/07/i-overstated-cpu-regression-in-mysql-57.html
[8 Feb 2018 17:07]
Ravi Sastry Karra
#While performing the data/table replication between two servers (master-slave), #tried importing the slave dump .sql data into a table (dump copied from #master). Now in slave, #System details: mysql -V mysql Ver 14.14 Distrib 5.5.54, for Linux (x86_64) using readline 5.1 #Command line: mysql -u -root -p NokiaLTE_Performace_T < NokiaLTE_Performace_T.sql ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SSH-2.0-OpenSSH_5.3' at line 1 #what is the fix for this error?
[9 Feb 2018 9:05]
MySQL Verification Team
Thank you for the report and feedback. Observed that 5.6.39 is affected(partitioned table + RBR). Thanks, Umesh
[9 Feb 2018 9:06]
MySQL Verification Team
test results
Attachment: 73648.results (application/octet-stream, text), 10.60 KiB.
[27 Jun 2018 8:38]
Arnaud Adant
This bug is most likely fixed InnoDB: Replication lag occurred on slave instances during large update operations on tables with many partitions. (Bug #25687813, Bug #85352) see release notes for 5.6.38)
[6 Jul 2018 3:09]
Mike Griffin
This seems to have a positive effect on the slave and negative on the master. The performance of the MySQL 5.6 slave is far better, using my originally submitted test case: ## Master is slower to run the UPDATE mysql[sandbox]> select 'master', @@version ver, timediff(master_end,master_start) lag , (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +--------+------------+----------+-------------------+ | master | ver | lag | num_of_partitions | +--------+------------+----------+-------------------+ | master | 5.6.40-log | 00:00:40 | 370 | +--------+------------+----------+-------------------+ 1 row in set (0.02 sec) ## 5.6 has vastly better replication performance: mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.6.40-log | 00:00:47 | 370 | +-------+------------+----------+-------------------+ 1 row in set (0.01 sec) ## 5.7 results mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.7.22-log | 00:01:20 | 370 | +-------+------------+----------+-------------------+ 1 row in set (0.01 sec) #### Remove partitioning and retest step04_updatet1.sql mysql[sandbox]> use test_73648 Database changed mysql[sandbox]> alter table t1 remove partitioning; Query OK, 55156464 rows affected (13 min 9.81 sec) Records: 55156464 Duplicates: 0 Warnings: 0 mysql[sandbox]> select 'master', @@version ver, timediff(master_end,master_start) lag , (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +--------+------------+----------+-------------------+ | master | ver | lag | num_of_partitions | +--------+------------+----------+-------------------+ | master | 5.6.40-log | 00:00:37 | 1 | +--------+------------+----------+-------------------+ 1 row in set (0.02 sec) mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.6.40-log | 00:00:41 | 1 | +-------+------------+----------+-------------------+ 1 row in set (0.00 sec) mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +-------+------------+----------+-------------------+ | slave | ver | lag | num_of_partitions | +-------+------------+----------+-------------------+ | slave | 5.7.22-log | 00:00:56 | 1 | +-------+------------+----------+-------------------+ 1 row in set (0.01 sec) ## UPDATE on MySQL 5.6.38 master performs better than 5.6.40 mysql[sandbox]> select 'master', @@version ver, timediff(master_end,master_start) lag , (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb; +--------+------------+----------+-------------------+ | master | ver | lag | num_of_partitions | +--------+------------+----------+-------------------+ | master | 5.6.38-log | 00:00:19 | 1 | +--------+------------+----------+-------------------+ 1 row in set (0.00 sec) I'd be happy to submit a new report if that would be appropriate.