Bug #84287 row inserts, statement updates, persistent stats lead to table scans+ lag slaves
Submitted: 20 Dec 2016 19:30
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S4 (Feature request)
Version:5.7.17 OS:Any
Assigned to: CPU Architecture:Any

[20 Dec 2016 19:30] Shane Bester
Description:
If a master is configured with binlog_format=MIXED and there are
inserts done in ROW format,  along with UPDATEs done in statement format,
the slave ends up doing expensive table scans.

How to repeat:
setup master with --binlog_format=mixed.
setup slave with --skip-slave-start
configure replication..

start slave;

on master; setup the test tables.

--
set sql_mode="";
drop table if exists t;

create table t (
  k bigint(20) unsigned not null auto_increment,
  h varchar(99), j varchar(99), f varchar(99),
  e varchar(99), a varchar(99), b varchar(99),
  c varchar(99), d varchar(99), primary key (k),
  key(h),key(f),key(f,h)
) engine=innodb default charset=utf8 ;

drop table if exists r;
create table r (
  id bigint(20) unsigned not null,
  h varchar(99),  a varchar(99),  b varchar(99),
  c varchar(99),  d varchar(99),  e varchar(99),
  f varchar(99),  g varchar(99),  i varchar(99),
  j varchar(99),  k varchar(99),  l varchar(99),
  primary key (id),  key(h),key(i),key(i,h)
) engine=innodb default charset=utf8  ;

Now monitor the slave status carefully without touching any tables.
Do not run any DML/DDL queries on the slave as this could auto-calc table stats.

Run this on master:

mysqlslap.exe --query="insert ignore into t(h,f,j) values(floor(rand()*1000), floor(rand()*1000),floor(rand()*1000));insert ignore into r(id,h,i) values(floor(rand()*1000),floor(rand()*1000),floor(rand()*1000));update r,t set r.a=t.a,r.b=t.b,r.c=t.c,r.d=t.d,r.e=t.e,r.f=null,r.g=null where r.h=t.h and r.i=t.f and left(t.j,2)='11';"  --number-of-queries=10000 --concurrency=1  --create-schema=test  --iterations=10

slave should lag a lot.
slow query log on slave show be evidence of table scans.

Suggested fix:
workaround1:  stop slave; start slave;
workaround2:  analyze table t,r;
workaround3:  on slave, use --innodb_stats_persistent=0
[20 Dec 2016 19:32] Shane Bester
if the query ran fast on master, we expect them to do the same on slave. the workarounds lead me to believe something odd happens here in the interaction with table stats/replication.
[20 Dec 2016 19:44] Shane Bester
output of my slave status monitor during first part of a run. notice it gets further behind each second.

http://pastebin.com/raw/Dcsf4RYX
[21 Dec 2016 5:12] Shane Bester
Monitoring of slave:

http://pastebin.com/raw/f5H2Vc5m

See loops 0 - 62 which is from 21:20:58 to 21:31:18. 
This is when the test was running on master (11 mins).

Slave took until 00:01:59 to catch up (2.5 hours).
[3 May 2017 13:49] Sinisa Milivojevic
Hi!

This turned out not to be a bug, since this was a matter of slave not updating InnoDB statistics. 

However, a new feature is designed that would remedy this problem. However, this is a new feature and would, therefore have another scheduling and another treatment.
[25 Oct 2017 6:11] jiang longfei
bug_test_sql

Attachment: bug1666923.sql.gz (application/gzip, text), 330.85 KiB.