Bug #73648 innodb table replication is very slow with some of the partioned table
Submitted: 19 Aug 2014 22:10 Modified: 9 Feb 2018 9:05
Reporter: chandra nanjegowda Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S2 (Serious)
Version:Server version: 5.6.16-log MySQL Communi, 5.6.39 OS:Linux
Assigned to: CPU Architecture:Any
Tags: replication, slave, slow

[19 Aug 2014 22:10] chandra nanjegowda
Description:
We have row based salve replication. Slave replication relay slow with some of the partitioned tables.

if we remove the the partitions, it will run faster. 

How to repeat:
create a table with a lot of columns(we have 500 columns )  and partition(at least 700 partitions) and load 5 to 6 million records per partition for 20 to 30 partitions.
[18 Feb 2015 1:17] Mike Griffin
I am able to reproduce this in MySQL 5.6 using a pretty large table with many partitions. I will attach a self documenting tar file with a test case.

The instructions are in the file step00_instructions.txt within the tar

You can see my results in zz_results.txt, which I will also paste here:

# cat zz_results.txt
System where all six instances where running:

[root@mg8 ~]# grep -m1 MHz /proc/cpuinfo
cpu MHz		: 2600.030
[root@mg8 ~]# grep -c MHz /proc/cpuinfo
8
[root@mg8 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          8033       7988         45          0         19       2317
-/+ buffers/cache:       5652       2381
Swap:            0          0          0

Relvant settings in my.cnf:

innodb-buffer-pool-size         = 768M
innodb-log-file-size            = 256M

My results:

+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.1.73-log | 00:00:14 |               370 |
+-------+------------+----------+-------------------+
+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.5.41-log | 00:00:16 |               370 |
+-------+------------+----------+-------------------+
+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.6.22-log | 00:05:55 |               370 |
+-------+------------+----------+-------------------+
+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.6.22-log | 00:00:16 |                 1 |
+-------+------------+----------+-------------------+
[18 Feb 2015 1:22] Mike Griffin
My test file is now available in your SFTP server @ /support/incoming/mysql-bug-data-73648.tar
[18 Feb 2015 19:58] Mike Griffin
I wanted to mention that adding additional empty partitions results in 5.6 taking even longer for the same UPDATE to replicate, which perhaps gives a clue about the performance problem to the developers.

It is also worth noting that this may be fixed in 5.7 with InnoDB Native Partitioning[0]:

+-------+------------------------+----------+-------------------+
| slave | ver                    | lag      | num_of_partitions |
+-------+------------------------+----------+-------------------+
| slave | 5.7.5-labs-preview-log | 00:00:44 |               370 |
+-------+------------------------+----------+-------------------+

The 5.7 release side by side with the 5.6 release for a single partition:

mysql[sandbox]> select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+------------+------------+----------+-------------------+
| slave | ver        | masterexec | lag      | num_of_partitions |
+-------+------------+------------+----------+-------------------+
| slave | 5.6.22-log | 00:00:18   | 00:00:16 |                 1 |
+-------+------------+------------+----------+-------------------+
1 row in set (0.00 sec)

mysql[sandbox]> select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+------------------------+------------+----------+-------------------+
| slave | ver                    | masterexec | lag      | num_of_partitions |
+-------+------------------------+------------+----------+-------------------+
| slave | 5.7.5-labs-preview-log | 00:00:53   | 00:00:55 |                 1 |
+-------+------------------------+------------+----------+-------------------+
1 row in set (0.00 sec)

*[0] http://mysqlserverteam.com/innodb-native-partitioning-early-access/
[8 Apr 2015 1:18] Mike Griffin
This performance regression for partitioned tables is still present in the latest release:

Server version: 5.6.24-log MySQL Community Server (GPL)

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql[sandbox]> select * from hb;
+----+---------------------+---------------------+---------------------+
| id | master_start        | master_end          | slave_end           |
+----+---------------------+---------------------+---------------------+
|  1 | 2015-04-08 00:59:20 | 2015-04-08 00:59:43 | 2015-04-08 01:05:38 |
+----+---------------------+---------------------+---------------------+
1 row in set (0.00 sec)

Removing partitioning, downgrading to 5.5, or converting the table to MyISAM solves the problem.
[6 Feb 2018 21:09] Arnaud Adant
Yes it looks that this bug was never verified not addressed in 5.6.
In 5.7. it is fixed with native partitioning.

Here is a test case :

--source include/have_binlog_format_row.inc
--source include/have_partition.inc
--source include/master-slave.inc
--source include/have_innodb.inc
CREATE TABLE t (
  id bigint(20) unsigned NOT NULL auto_increment,
  d date NOT NULL,
  ts datetime(6) NOT NULL,
  PRIMARY KEY (id,d)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50500 PARTITION BY HASH(id) partitions 200*/;

insert into t(d,ts) values(current_date(),now());
insert into t(d,ts) values(current_date(),now());
insert into t(d,ts) values(current_date(),now());
insert into t(d,ts) values(current_date(),now());
replace into t(d,ts) select t1.d,t1.ts from t t1, t t2, t t3, t t4, t t5, t t6, t t7, t t8, t t9, t t10;
commit;
update t set ts = now();
commit;
drop table t;
--sync_slave_with_master
stop slave;
[6 Feb 2018 21:11] Arnaud Adant
A workaround is to set binlog_format=statement before executing the update.
[7 Feb 2018 1:12] Mike Griffin
Results are quite a bit better in 5.7 than was seen in 5.6, using my previously supplied test case:

mysql>  select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+--------+------------+----------+-------------------+
| slave | ver    | masterexec | lag      | num_of_partitions |
+-------+--------+------------+----------+-------------------+
| slave | 5.7.21 | 00:00:24   | 00:00:52 |               370 |
+-------+--------+------------+----------+-------------------+
1 row in set (0.03 sec)

mysql>  select 'slave', @@version ver, timediff(master_end,master_start) masterexec, timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+--------+------------+----------+-------------------+
| slave | ver    | masterexec | lag      | num_of_partitions |
+-------+--------+------------+----------+-------------------+
| slave | 5.7.21 | 00:00:18   | 00:00:21 |                 1 |
+-------+--------+------------+----------+-------------------+
1 row in set (0.00 sec)

mysql> select 'master', @@version ver,  timediff(master_end,master_start) lag ,  (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb;
+--------+------------+----------+-------------------+
| master | ver        | lag      | num_of_partitions |
+--------+------------+----------+-------------------+
| master | 5.7.21-log | 00:00:24 |               370 |
+--------+------------+----------+-------------------+
1 row in set (0.04 sec)

mysql> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+--------+----------+-------------------+
| slave | ver    | lag      | num_of_partitions |
+-------+--------+----------+-------------------+
| slave | 5.7.21 | 00:00:52 |               370 |
+-------+--------+----------+-------------------+
1 row in set (0.00 sec)

mysql> use test_73648
Database changed
mysql> select * from hb;
+----+---------------------+---------------------+---------------------+
| id | master_start        | master_end          | slave_end           |
+----+---------------------+---------------------+---------------------+
|  1 | 2018-02-07 01:00:00 | 2018-02-07 01:00:24 | 2018-02-07 01:01:16 |
+----+---------------------+---------------------+---------------------+
1 row in set (0.01 sec)

So with partitions is still about 2x slower in 5.7. 

In 5.6 it was about 24x slower with partitions.

In 5.1/5.5, partitioning did not seem to affect replication performance

Note that the config/hardware of the 5.7 test may be slightly different than the previous samples, but the masterexec vs lag is what would matter here
[7 Feb 2018 1:21] Mike Griffin
Displayed more clearly:

mysql> select timediff(master_end,master_start) as `master`, timediff(slave_end,master_end) as `slave` from hb;
+----------+----------+
| master   | slave    |
+----------+----------+
| 00:00:19 | 00:00:46 |
+----------+----------+
1 row in set (0.01 sec)
[7 Feb 2018 10:12] Arnaud Adant
>In 5.1/5.5, partitioning did not seem to affect replication performance

Are you sure you were using binlog_format=row in 5.1 and 5.5 ?

Using MTR and the test I pasted previously, I get this :

5.5.59 :

bug.test                                 [ pass ]  33140
bug.test_no_partition                    [ pass ]  15328

5.6.39

bug.test                                 [ pass ]  158594  <== bug
bug.test_no_partition                    [ pass ]  20350

5.7.21

bug.test                                 [ pass ]  37490
bug.test_no_partition                    [ pass ]  26441

8.0.4 rc

bug.test                                 [ pass ]  48723
bug.test_no_partition                    [ pass ]  40933

There is always an overhead with partitioning, especially large number of partitions. Also there is a decrease in performance for single thread over time.

http://smalldatum.blogspot.co.uk/2017/07/i-overstated-cpu-regression-in-mysql-57.html
[8 Feb 2018 17:07] Ravi Sastry Karra
#While performing the data/table replication between two servers (master-slave), #tried importing the slave dump .sql data into a table (dump copied from #master). Now in slave,

#System details:
mysql -V
mysql  Ver 14.14 Distrib 5.5.54, for Linux (x86_64) using readline 5.1

#Command line:
mysql -u -root -p NokiaLTE_Performace_T < NokiaLTE_Performace_T.sql 

ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SSH-2.0-OpenSSH_5.3' at line 1

#what is the fix for this error?
[9 Feb 2018 9:05] MySQL Verification Team
Thank you for the report and feedback.
Observed that 5.6.39 is affected(partitioned table + RBR).

Thanks,
Umesh
[9 Feb 2018 9:06] MySQL Verification Team
test results

Attachment: 73648.results (application/octet-stream, text), 10.60 KiB.

[27 Jun 2018 8:38] Arnaud Adant
This bug is most likely fixed 

InnoDB: Replication lag occurred on slave instances during large update operations on tables with many partitions. (Bug #25687813, Bug #85352)

see release notes for 5.6.38)
[6 Jul 2018 3:09] Mike Griffin
This seems to have a positive effect on the slave and negative on the master. The performance of the MySQL 5.6 slave is far better, using my originally submitted test case:

## Master is slower to run the UPDATE

mysql[sandbox]> select 'master', @@version ver,  timediff(master_end,master_start) lag ,  (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb;
+--------+------------+----------+-------------------+
| master | ver        | lag      |                 num_of_partitions |
+--------+------------+----------+-------------------+
| master | 5.6.40-log | 00:00:40 |                             370 |
+--------+------------+----------+-------------------+
1 row in set (0.02 sec)

## 5.6 has vastly better replication performance:

mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.6.40-log | 00:00:47 |               370 |
+-------+------------+----------+-------------------+
1 row in set (0.01 sec)

## 5.7 results

mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.7.22-log | 00:01:20 |               370 |
+-------+------------+----------+-------------------+
1 row in set (0.01 sec)

#### Remove partitioning and retest step04_updatet1.sql

mysql[sandbox]> use test_73648
Database changed
mysql[sandbox]> alter table t1 remove partitioning;
Query OK, 55156464 rows affected (13 min 9.81 sec)
Records: 55156464  Duplicates: 0  Warnings: 0

mysql[sandbox]> select 'master', @@version ver,  timediff(master_end,master_start) lag ,  (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb;
+--------+------------+----------+-------------------+
| master | ver        | lag      | num_of_partitions |
+--------+------------+----------+-------------------+
| master | 5.6.40-log | 00:00:37 |                 1 |
+--------+------------+----------+-------------------+
1 row in set (0.02 sec)

mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.6.40-log | 00:00:41 |                 1 |
+-------+------------+----------+-------------------+
1 row in set (0.00 sec)

mysql[sandbox]> select 'slave', @@version ver,timediff(slave_end,master_end) lag, (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from  test_73648.hb;
+-------+------------+----------+-------------------+
| slave | ver        | lag      | num_of_partitions |
+-------+------------+----------+-------------------+
| slave | 5.7.22-log | 00:00:56 |                 1 |
+-------+------------+----------+-------------------+
1 row in set (0.01 sec)

## UPDATE on MySQL 5.6.38 master performs better than 5.6.40

mysql[sandbox]> select 'master', @@version ver,  timediff(master_end,master_start) lag ,  (select count(*) from information_schema.partitions where table_schema='test_73648' and table_name='t1') num_of_partitions from test_73648.hb;
+--------+------------+----------+-------------------+
| master | ver        | lag      | num_of_partitions |
+--------+------------+----------+-------------------+
| master | 5.6.38-log | 00:00:19 |                 1 |
+--------+------------+----------+-------------------+
1 row in set (0.00 sec)

I'd be happy to submit a new report if that would be appropriate.