Bug #3871 ANALYZE not in binlog => replication failure on max_join_size
Submitted: 24 May 2004 12:55  Modified: 24 May 2004 18:51
Reporter: Martin Friebe (Gold Quality Contributor) (OCA)
Status: Closed
Category: MySQL Server  Severity: S2 (Serious)
Version: 4.0.18  OS: FreeBSD (maybe any)
Assigned to: Guilhem Bichot  CPU Architecture: Any

[24 May 2004 12:55] Martin Friebe
Description:
ANALYZE TABLE is not written to the binlog. This causes bugs in replication.

On a statement such as
"insert into dest select * from src1 left join src2 using (key)"
the slave will estimate a different number of rows to be read than the master.

The slave will then refuse to execute the query, and subsequent queries will fail.
(I also had a case where the slave reported
"Error on delete of '/DB/table.MYI' (Errcode: 2)" on a query "create ....".)

I also could not see any entry for
"SET MAX_JOIN_SIZE=16" in the binlog, which means that statement would cause similar errors.

How to repeat:
# set up a master and a slave
# set MAX_JOIN_SIZE to 16 (in the my.cnf of both the master and the slave)

drop table if exists t1; drop table if exists t2;

create table t1 (
  a integer,
  b integer,
 index(a), index(b)
);

create table t2 (
  c integer,
  d integer,
 index(c), index(d)
);

insert into t1 values 
(1,2),
(2,2),
(3,2),
(4,2),
(5,2)
;
insert into t2 values 
(1,3),
(2,3),
(1,4),
(2,4),
(1,5),
(2,5)
;

analyze table t2;
select * from t1 left join t2 on a=c;
#master executes the query (estimated 15 rows); slave fails (estimated 30 rows)
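The repro can be made concrete with a small sketch of the max_join_size check: MySQL multiplies the optimizer's per-table row estimates (the "rows" column of EXPLAIN) and refuses the statement with ER_TOO_BIG_SELECT when the product exceeds max_join_size. This is illustrative Python, not server code; the function names are made up.

```python
# Illustrative sketch (not server internals): max_join_size compares
# the product of the optimizer's per-table row estimates to the limit.

def estimated_join_size(row_estimates):
    """Multiply the per-table 'rows' estimates, as shown by EXPLAIN."""
    product = 1
    for rows in row_estimates:
        product *= rows
    return product

def passes_max_join_size(row_estimates, max_join_size):
    """True if the query would be allowed to run."""
    return estimated_join_size(row_estimates) <= max_join_size

# The master ran ANALYZE TABLE t2: index statistics say ~3 rows of t2
# match each row of t1, so 5 * 3 = 15 <= 16 and the query runs.
print(passes_max_join_size([5, 3], 16))   # True

# The slave never saw the ANALYZE and assumes all 6 rows of t2 may
# match: 5 * 6 = 30 > 16, so it refuses the query (ER_TOO_BIG_SELECT).
print(passes_max_join_size([5, 6], 16))   # False
```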

Suggested fix:
Replicate ANALYZE TABLE (and SET MAX_JOIN_SIZE) to the slave.
[24 May 2004 16:18] Guilhem Bichot
Hi,

> analyze table t2;
> select * from t1 left join t2 on a=c;
> #server will execute (15 rows), slave fails (30 rows)

Hum... it would be nice if you had tested it ;) The statement above is a SELECT, so it does not need to be replicated to the slave in any way, because it does not change the master's data.
Even with an INSERT ... SELECT, it does not happen like this. The slave SQL thread has a special status in the slave MySQL server: its max_join_size is always set to the maximum (4G rows), regardless of what the slave has in my.cnf, precisely to avoid the problem you imagined.
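A rough sketch of that special status, with made-up names (the real logic lives in the server's slave SQL thread startup, not shown here):

```python
# Illustrative sketch (made-up names, not server code): when the slave
# SQL thread starts, it overrides whatever max_join_size my.cnf
# configured with the maximum value, so replicated statements are not
# supposed to hit the join-size limit.

MAX_JOIN_SIZE_CEILING = 2**32 - 1  # ~4G rows, the 4.0-era maximum

class SessionOptions:
    def __init__(self, max_join_size):
        self.max_join_size = max_join_size

def init_slave_sql_thread(options):
    # Replication must replay exactly what the master already executed,
    # so the configured limit is ignored for this session.
    options.max_join_size = MAX_JOIN_SIZE_CEILING
    return options

opts = init_slave_sql_thread(SessionOptions(max_join_size=16))
print(opts.max_join_size)  # 4294967295
```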
Starting from 4.1.1 masters, ANALYZE TABLE is replicated to slaves.
For error 2, check "perror 2" in a shell.
[24 May 2004 16:58] Martin Friebe
Well, sorry for not testing, but I am currently trying to get my slave working again.
Following your reply, here is the information from my log files and some details about the tables. (Given the amount of data, I cannot send a test case to reproduce it; I have renamed some fields.)

Slave: Error 'Error on delete of './DATA/temp.MYI' (Errcode: 2)' on query 'create temporary table temp select o.id, max(m.id) mid, n.dst from T1 o left join T2 m on o.xid=m.xid and m.time_in<o.time_ready and m.time_in<"2004-05-01" and m.xid = "11111" left join T3 n on n.id_in = o.id where o.time_ready >"2004-04-01" and o.time_ready <"2004-05-01" group by 1'. Default database: 'DATA', Error_code: 6
040521 16:17:31  Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'newdb1-bin.075' position 324192679

The tables (T1, T2, T3) contain 2473292, 58725, and 2421739 rows respectively.

explain on the slave
+-------+------+---------------+-------+---------+-------+---------+----------------------------------------------+
| table | type | possible_keys | key   | key_len | ref   | rows    | Extra                                        |
+-------+------+---------------+-------+---------+-------+---------+----------------------------------------------+
| o     | ALL  | time_ready    | NULL  |    NULL | NULL  | 2473292 | Using where; Using temporary; Using filesort |
| m     | ref  | xid           | xid   |      31 | o.xid |     587 |                                              |
| n     | ref  | ID_IN         | ID_IN |       9 | o.ID  |      18 |                                              |
+-------+------+---------------+-------+---------+-------+---------+----------------------------------------------+
3 rows in set (0.01 sec)

explain on the master
+-------+------+---------------+-------+---------+-------+---------+----------------------------------------------+
| table | type | possible_keys | key   | key_len | ref   | rows    | Extra                                        |
+-------+------+---------------+-------+---------+-------+---------+----------------------------------------------+
| o     | ALL  | time_ready    | NULL  |    NULL | NULL  | 2473292 | Using where; Using temporary; Using filesort |
| m     | ref  | xid           | xid   |      31 | o.xid |       9 |                                              |
| n     | ref  | ID_IN         | ID_IN |       9 | o.ID  |       1 |                                              |
+-------+------+---------------+-------+---------+-------+---------+----------------------------------------------+
3 rows in set (0.03 sec)

The master's estimate is well below the limit, but the slave's is far above 4G.

The slave should probably not test against 4G (as you described), but not test at all: since the master executed the query, it must have been within the limit anyway.
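Multiplying the "rows" columns of the two EXPLAIN outputs above shows the difference (plain arithmetic, assuming the ~4G ceiling is 2**32 - 1):

```python
# Product of the per-table row estimates from the EXPLAIN outputs above.
def estimated_rows(per_table):
    product = 1
    for rows in per_table:
        product *= rows
    return product

CEILING = 2**32 - 1  # ~4G rows

master = estimated_rows([2473292, 9, 1])     # master's EXPLAIN
slave = estimated_rows([2473292, 587, 18])   # slave's EXPLAIN

print(master, master > CEILING)  # 22259628 False  (well under the cap)
print(slave, slave > CEILING)    # 26132803272 True (over the cap)
```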

"perror 2" returns: No such file or directory
[24 May 2004 17:45] Guilhem Bichot
Sorry, but you are mixing two problems and I am confused.
In this bug system, we like to have one bug report filed for each bug.
Yes, "perror 2" says "No such file or directory". Without a repeatable test case it is impossible to know whether this is a bug or a local problem on your machine.
As for max_join_size, the slave will indeed fail if the query examines more than 4G rows, and that is a bug which you have just helped find (the intended behaviour is to apply no limit at all), thank you! It will be fixed in the next 4.0 release.
[24 May 2004 18:01] Martin Friebe
I believe the errors belong together, but yes, that is a guess. If I am right, the slave fails the query because it performs the invalid check against 4G; as a result it tries to remove an .MYI file which it never created.

If that error is unrelated (it cannot be reproduced by forcing the slave into the above error; I run FreeBSD 4.8 with MySQL 4.0.18), then please consider the file error as not reported, since it is then "not reproducible" and/or might really be a local problem.
[24 May 2004 18:06] Martin Friebe
One more thing: this is a feature request (probably not very important), but related.

What happens if the slave is for some reason not in sync with the master, and really would produce more than 4G result rows?

Should the slave check (perhaps knowing how many rows the master finally examined), or is that left to an out-of-resources error?
(This is because the slave, once fixed, will lack the safety net of max_join_size.)
[24 May 2004 18:51] Guilhem Bichot
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

I have just fixed the problem where the slave could fail if there were more than 4G rows to examine; the fix will be in 4.0.21.
For the error 2, let's forget about it until it becomes repeatable.
About this excellent suggestion ("Should the slave check, maybe knowing how many rows the master finally got?"): yes, we have this on our TODO list, but it will be in 5.0 or 5.1, because it requires room in the binary log format to store the master's row counts.
Thank you!