Bug #3871 ANALYZE not in binlog => replication failure on max_join_size
Submitted: 24 May 2004 14:55 Modified: 24 May 2004 20:51
Reporter: Martin Friebe (Gold Quality Contributor)
Status: Closed
Category:Server Severity:S2 (Serious)
Version:4.0.18 OS:FreeBSD (freebsd (maybe any))
Assigned to: Guilhem Bichot Target Version:

[24 May 2004 14:55] Martin Friebe
Description:
ANALYZE TABLE is not written to the binlog. This causes bugs in replication.

on a
"insert into table dest select from src1 left join src2 using (key)"
the slave will estimate a different amount of rows to be read than the master.

The slave will not execute the query and subsequent querys will fail.
( I also had a case where the slave reported
"Error on delete of '/DB/table.MYI' (ErrCode2) on query create ....." )

I ALSO couldnt see any entry for 
"SET MAX_JOIN_SIZE=16" in the bin_log. which means that would cause similiar errors.

How to repeat:
#setup a master slave
# SET the MAX_JOIN_SIZE to 16 (do that in the my.cnf file for server and slave)

drop table if exists t1; drop table if exists t2;

create table t1 (
  a integer,
  b integer,
 index(a), index(b)
);

create table t2 (
  c integer,
  d integer,
 index(c), index(d)
);

insert into t1 values 
(1,2),
(2,2),
(3,2),
(4,2),
(5,2)
;
insert into t2 values 
(1,3),
(2,3),
(1,4),
(2,4),
(1,5),
(2,5)
;

analyze table t2;
select * from t1 left join t2 on a=c;
#server will execute (15 rows), slave fails (30 rows)

Suggested fix:
Replicate ANALYZE and (SET MAX_JOIN_SIZE) to slave.
[24 May 2004 18:18] Guilhem Bichot
Hi,

> analyze table t2;
> select * from t1 left join t2 on a=c;
> #server will execute (15 rows), slave fails (30 rows)

Hum... It would be nice if you had tested it ;) Because the above statement is a SELECT,
so does not need, in any way, to be replicated to the slave, because it does not change
the master's data.
Even with an INSERT SELECT, it does not happen like this. The slave SQL thread has a
special status in the slave MySQL server: it always has its max_join_size set to the
maximum (4G rows), regardless of what the slave has in my.cnf. Precisely to avoid the
problem you imagined.
Starting from 4.1.1 masters, ANALYZE TABLE is replicated to slaves.
For error 2, check "perror 2" in a shell.
[24 May 2004 18:58] Martin Friebe
Well, sorry for not testing, probably, but I currently try to get my slave working again.
Looking at your reply, here the information from my logfiles and some information about
the tables. (Looking at the ammount of data, I cant send a test to reproduce it. (renamed
some fields)

Slave: Error 'Error on delete of './DATA/temp.MYI' (Errcode: 2)' on query 'create
temporary table temp select o.id, max(m.id) mid, n.dst from T1 o left join T2 m on
o.xid=m.xid and m.time_in<o.time_ready and m.time_in<"2004-05-01" and m.xid = "11111" left
join T3 n on n.id_in = o.id where o.time_ready >"2004-04-01" and o.time_ready
<"2004-05-01" group by 1'. Default database: 'DATA', Error_code: 6
040521 16:17:31  Error running query, slave SQL thread aborted. Fix the problem, and
restart the slave SQL thread with "SLAVE START". We stopped at log 'newdb1-bin.075'
position 324192679

the table containd (T1,T2,T3) 2473292, 58725, 2421739 rows

explain on the slave
+-------+------+---------------+-------+---------+-------+---------+----------------------
------------------------+
| table | type | possible_keys | key   | key_len | ref   | rows    | Extra                
                       |
+-------+------+---------------+-------+---------+-------+---------+----------------------
------------------------+
| o     | ALL  | time_ready    | NULL  |    NULL | NULL  | 2473292 | Using where; Using
temporary; Using filesort |
| m     | ref  | xid          | xid  |      31 | o.xid |     587 |                        
                     |
| n     | ref  | ID_IN         | ID_IN |       9 | o.ID  |      18 |                      
                       |
+-------+------+---------------+-------+---------+-------+---------+----------------------
------------------------+
3 rows in set (0.01 sec)

explain on the master
+-------+------+---------------+-------+---------+-------+---------+----------------------
------------------------+
| table | type | possible_keys | key   | key_len | ref   | rows    | Extra                
                       |
+-------+------+---------------+-------+---------+-------+---------+----------------------
------------------------+
| o     | ALL  | time_ready    | NULL  |    NULL | NULL  | 2473292 | Using where; Using
temporary; Using filesort |
| m     | ref  | xid           | xid   |      31 | o.xid |       9 |                      
                       |
| n     | ref  | ID_IN         | ID_IN |       9 | o.ID  |       1 |                      
                       |
+-------+------+---------------+-------+---------+-------+---------+----------------------
------------------------+
3 rows in set (0.03 sec)

The master expect 2G rows, but the slave much over the 4G. 

The Slave should probably not test against 4G (as described by you), but not test at all.
Since the server executed the query, it should be in limit anyway.

perror=2 returns: No such file or directory
[24 May 2004 19:45] Guilhem Bichot
sorry, but you are mixing 2 problems and I am confused.
In this bugs system, we like to have one bug form filled for each bug.
Yes, "perror 2" says "no such file or directory". Without a repeatable testcase it's
impossible to know if this is a bug or a local problem on your machine.
As for max_join_size, indeed the slave will fail if the query examines more than 4G rows,
and this is a bug which you just helped find (the intended behaviour being to consider no
limit), thank you! It will be fixed in the next 4.0 release.
[24 May 2004 20:01] Martin Friebe
I believe the errors belong together, but yes that is a guess. If I am right, the slave
fails the query, because it does the invalid check against 4G. As a result of this it
tries to remove an MYI file, which it has never created.

If that error is unrelated, (cant be reproduced, by forcing the slave in the above error
(I run freebsd 4.8 with mysql 4.0.18), then please consider the file error, as not
reported, as it is then "not reproducable" and/or might really be a local problem.
[24 May 2004 20:06] Martin Friebe
One more thing, well this is a feature request (probably not very important), but
related.

What happened, if the Slave is for some reason not in sync with te master, and will really
produce >4G result rows.

Should the slave check (maybe know how many results the master did finally get), or is
that left to an out of resource error?
(This is as the Slave, will (once fixed) be missing the security of a max_join size)
[24 May 2004 20:51] Guilhem Bichot
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

I just fixed the fact that the slave could fail if there were more than 4G rows to
examine; fix will be in 4.0.21.
For the error 2, let's forget about it until it becomes repeatable.
About this excellent suggestion: "Should the slave check (maybe know how many results the
master did finally get)?", yes, we have this on our TODO list, but it will be in 5.0 or
5.1, because it requires room in the binary log's format to store the row counts of the
master.
Thank you!