| Bug #27037 | SHOW MASTER STATUS not returning correct position in 5.0 | ||
|---|---|---|---|
| Submitted: | 11 Mar 2007 22:49 | Modified: | 15 May 2007 5:30 |
| Reporter: | Arjen Lentz | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
| Version: | 5.0.38-BK, 5.0.36 | OS: | Linux (Linux, any) |
| Assigned to: | Mats Kindahl | CPU Architecture: | Any |
| Tags: | bfsm_2007_03_15, position, regression, replication, show master status | ||
[11 Mar 2007 22:49]
Arjen Lentz
[11 Mar 2007 23:53]
Arjen Lentz
Could this somehow be related to http://bugs.mysql.com/23890 ? (mysqlbinlog outputs COMMIT unnecessarily when single database is used)
[12 Mar 2007 22:37]
Arjen Lentz
*REGRESSION* bug does not occur in 4.1
[13 Mar 2007 22:08]
Arjen Lentz
Could this be related to the 5.0 changes for XA ?
[13 Mar 2007 22:11]
Mark Leith
Could this be related to the changes we made in binlogging towards XA/xid? The fact it works in 4.1 binlogs and not in 5.0 makes me highly suspicious this may be the cause..
[14 Mar 2007 9:01]
Guilhem Bichot
Arjen, even this is weird: # at 58853196 #070309 13:24:28 server id 893 end_log_pos 58853180 Query thread_id=186048 exec_time=0 error_code=0 SET TIMESTAMP=1173464668; BEGIN; end_log_pos is 58853180 which is SMALLER than the start position given in the #at line (58853196). End if before start... I suggest looking at the raw binary log file with an hex editor, to first check if the event really starts at the position given by #at and really ends at the position given by end_log_pos.
[15 Mar 2007 11:14]
Valeriy Kravchuk
I had created the following test case for MySQL's test suite:
openxs@suse:~/dbs/5.0/mysql-test> cat t/rpl_27037.test
source include/have_innodb.inc;
source include/master-slave.inc;
connection master;
show master status;
drop table if exists foo;
drop table if exists foo2;
create table foo(c1 int) engine=InnoDB;
create table foo2(c1 int);
BEGIN;
insert into foo values (1557);
insert into foo2 values (1557);
COMMIT;
BEGIN;
insert into foo values (1558);
insert into foo2 values (1558);
COMMIT;
BEGIN;
insert into foo values (1559);
insert into foo2 values (1559);
COMMIT;
show master status;
sync_slave_with_master;
connection slave;
lock tables foo read, foo2 read;
show master status;
show slave status;
unlock tables;
But while at least one of the tables (foo and foo2) used in transaction is REALLY created as InnoDB, I have COMMIT in both master's binary log and slave's relay log. And positions are reported correctly (on latest 5.0.38-BK):
openxs@suse:~/dbs/5.0/mysql-test> cat r/rpl_27037.log
stop slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
reset master;
reset slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
start slave;
show master status;
File Position Binlog_Do_DB Binlog_Ignore_DB
master-bin.000001 98
drop table if exists foo;
Warnings:
Note 1051 Unknown table 'foo'
drop table if exists foo2;
Warnings:
Note 1051 Unknown table 'foo2'
create table foo(c1 int) engine=InnoDB;
create table foo2(c1 int);
BEGIN;
insert into foo values (1557);
insert into foo2 values (1557);
COMMIT;
BEGIN;
insert into foo values (1558);
insert into foo2 values (1558);
COMMIT;
BEGIN;
insert into foo values (1559);
insert into foo2 values (1559);
COMMIT;
show master status;
File Position Binlog_Do_DB Binlog_Ignore_DB
master-bin.000001 1302
lock tables foo read, foo2 read;
show master status;
File Position Binlog_Do_DB Binlog_Ignore_DB
slave-bin.000001 1127
show slave status;
Slave_IO_State Master_Host Master_User Master_Port Connect_RetryMas
ter_Log_File Read_Master_Log_Pos Relay_Log_File Relay_Log_Pos Relay_Ma
ster_Log_File Slave_IO_Running Slave_SQL_Running Replicate_Do_DBR
eplicate_Ignore_DB Replicate_Do_Table Replicate_Ignore_Table Replicat
e_Wild_Do_Table Replicate_Wild_Ignore_Table Last_Errno Last_ErrorSkip_C
ounter Exec_Master_Log_Pos Relay_Log_Space Until_Condition Until_Log_FileUn
til_Log_Pos Master_SSL_Allowed Master_SSL_CA_File Master_SSL_CA_Pa
th Master_SSL_Cert Master_SSL_Cipher Master_SSL_Key Seconds_Behind_M
aster
Waiting for master to send event 127.0.0.1 root 9306 1master-
bin.000001 1302 slave-relay-bin.000003 1440 master-bin.000001YesYes0
0 1302 1440 None 0 No 0
unlock tables;
openxs@suse:~/dbs/5.0/mysql-test> ../bin/mysqlbinlog var/log/slave-relay-bin.00
0003
...
# at 1160
#070315 5:18:29 server id 1 end_log_pos 1090 Query thread_id=10 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173928709/*!*/;
BEGIN/*!*/;
# at 1228
#070315 5:18:29 server id 1 end_log_pos 92 Query thread_id=10 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173928709/*!*/;
insert into foo values (1559)/*!*/;
# at 1320
#070315 5:18:29 server id 1 end_log_pos 185 Query thread_id=10 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173928709/*!*/;
insert into foo2 values (1559)/*!*/;
# at 1413
#070315 5:18:29 server id 1 end_log_pos 1302 Xid = 64
COMMIT/*!*/;
# at 1440
#070315 5:18:29 server id 2 end_log_pos 1459 Stop
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
If both foo and foo2 are created as MyISAM for any reason, I do not see LAST COMMIT in master's binary log (all previous are there!). So, if engines are different on master and slave, we surely may have a problem.
Please, check my test case and reasoning.
[15 Mar 2007 14:32]
Valeriy Kravchuk
While SHOW SLAVE STATUS shows correct position in my tests, SHOW MASTER STYATUS can really show incorrect position, just as described. Verified with latest 5.0.38-BK on Linux:
openxs@suse:~/dbs/5.0/mysql-test> ./mysql-test-run.pl rpl_27037
Logging: ./mysql-test-run.pl rpl_27037
MySQL Version 5.0.38
Using ndbcluster when necessary, mysqld supports it
Skipping SSL, mysqld not compiled with SSL
...
TEST RESULT TIME (ms)
-------------------------------------------------------
rpl_27037 [ fail ]
Errors are (from /home/openxs/dbs/5.0/mysql-test/var/log/mysqltest-time) :
/home/openxs/dbs/5.0/bin/mysqltest: Can't get stat of '/home/openxs/dbs/5.0/mysq
l-test/r/rpl_27037.result' (Errcode: 2)
(the last lines may be the most important ones)
Result from queries before failure can be found in r/rpl_27037.log
Aborting: rpl_27037 failed in default mode. To continue, re-run with '--force'.
Stopping All Servers
openxs@suse:~/dbs/5.0/mysql-test> cat r/rpl_27037.log
stop slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
reset master;
reset slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
start slave;
drop table if exists foo;
Warnings:
Note 1051 Unknown table 'foo'
drop table if exists foo2;
Warnings:
Note 1051 Unknown table 'foo2'
create table foo(c1 int) engine=InnoDB;
create table foo2(c1 int) engine=InnoDB;
BEGIN;
insert into foo values (1557);
insert into foo2 values (1557);
COMMIT;
BEGIN;
insert into foo values (1558);
insert into foo2 values (1558);
COMMIT;
BEGIN;
insert into foo values (1559);
insert into foo2 values (1559);
COMMIT;
show slave status;
Slave_IO_State Master_Host Master_User Master_Port Connect_RetryMas
ter_Log_File Read_Master_Log_Pos Relay_Log_File Relay_Log_Pos Relay_Ma
ster_Log_File Slave_IO_Running Slave_SQL_Running Replicate_Do_DBR
eplicate_Ignore_DB Replicate_Do_Table Replicate_Ignore_Table Replicat
e_Wild_Do_Table Replicate_Wild_Ignore_Table Last_Errno Last_ErrorSkip_C
ounter Exec_Master_Log_Pos Relay_Log_Space Until_Condition Until_Log_FileUn
til_Log_Pos Master_SSL_Allowed Master_SSL_CA_File Master_SSL_CA_Pa
th Master_SSL_Cert Master_SSL_Cipher Master_SSL_Key Seconds_Behind_M
aster
Queueing master event to the relay log 127.0.0.1 root 9306 1master-
bin.000001 4 slave-relay-bin.000002 4 Yes Yes00028
9 None 0 No 0
stop slave;
start slave;
lock tables foo read, foo2 read;
show master status;
File Position Binlog_Do_DB Binlog_Ignore_DB
slave-bin.000001 1141
show slave status;
Slave_IO_State Master_Host Master_User Master_Port Connect_RetryMas
ter_Log_File Read_Master_Log_Pos Relay_Log_File Relay_Log_Pos Relay_Ma
ster_Log_File Slave_IO_Running Slave_SQL_Running Replicate_Do_DBR
eplicate_Ignore_DB Replicate_Do_Table Replicate_Ignore_Table Replicat
e_Wild_Do_Table Replicate_Wild_Ignore_Table Last_Errno Last_ErrorSkip_C
ounter Exec_Master_Log_Pos Relay_Log_Space Until_Condition Until_Log_FileUn
til_Log_Pos Master_SSL_Allowed Master_SSL_CA_File Master_SSL_CA_Pa
th Master_SSL_Cert Master_SSL_Cipher Master_SSL_Key Seconds_Behind_M
aster
Waiting for master to send event 127.0.0.1 root 9306 1master-
bin.000001 1316 slave-relay-bin.000004 1454 master-bin.000001YesYes0
0 1316 1454 None 0 No 6
unlock tables;
Note the following results of SHOW MASTER STATUS on slave:
slave-bin.000001 1141
Now let's get content of binary log till that position:
openxs@suse:~/dbs/5.0/mysql-test> ../bin/mysqlbinlog var/log/master-bin.000001
--stop-position=1141
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#070315 6:48:27 server id 1 end_log_pos 98 Start: binlog v 4, server v 5.0.
38-log created 070315 6:48:27 at startup
ROLLBACK/*!*/;
# at 98
#070315 6:48:27 server id 1 end_log_pos 185 Query thread_id=2 exec_tim
e=0 error_code=0
use test/*!*/;
SET TIMESTAMP=1173934107/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.uniq
ue_checks=1/*!*/;
SET @@session.sql_mode=0/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.
collation_server=8/*!*/;
drop table if exists foo/*!*/;
# at 185
#070315 6:48:27 server id 1 end_log_pos 273 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
drop table if exists foo2/*!*/;
# at 273
#070315 6:48:27 server id 1 end_log_pos 374 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
create table foo(c1 int) engine=InnoDB/*!*/;
# at 374
#070315 6:48:27 server id 1 end_log_pos 476 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
create table foo2(c1 int) engine=InnoDB/*!*/;
# at 476
#070315 6:48:27 server id 1 end_log_pos 544 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
BEGIN/*!*/;
# at 544
#070315 6:48:27 server id 1 end_log_pos 92 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
insert into foo values (1557)/*!*/;
# at 636
#070315 6:48:27 server id 1 end_log_pos 185 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
insert into foo2 values (1557)/*!*/;
# at 729
#070315 6:48:27 server id 1 end_log_pos 756 Xid = 16
COMMIT/*!*/;
# at 756
#070315 6:48:27 server id 1 end_log_pos 824 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
BEGIN/*!*/;
# at 824
#070315 6:48:27 server id 1 end_log_pos 92 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
insert into foo values (1558)/*!*/;
# at 916
#070315 6:48:27 server id 1 end_log_pos 185 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
insert into foo2 values (1558)/*!*/;
# at 1009
#070315 6:48:27 server id 1 end_log_pos 1036 Xid = 20
COMMIT/*!*/;
# at 1036
#070315 6:48:27 server id 1 end_log_pos 1104 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
BEGIN/*!*/;
# at 1104
#070315 6:48:27 server id 1 end_log_pos 92 Query thread_id=2 exec_tim
e=0 error_code=0
SET TIMESTAMP=1173934107/*!*/;
insert into foo values (1559)/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
So, no final commit, just as described initially (and no insert to foo2)!
It is not clear what is that 1141 position reported at all! This looks like a bug for me. I used the following test case:
openxs@suse:~/dbs/5.0/mysql-test> cat t/rpl_27037.test
source include/have_innodb.inc;
source include/master-slave.inc;
connection master;
drop table if exists foo;
drop table if exists foo2;
create table foo(c1 int) engine=InnoDB;
create table foo2(c1 int) engine=InnoDB;
BEGIN;
insert into foo values (1557);
insert into foo2 values (1557);
COMMIT;
BEGIN;
insert into foo values (1558);
insert into foo2 values (1558);
COMMIT;
BEGIN;
insert into foo values (1559);
insert into foo2 values (1559);
COMMIT;
save_master_pos;
connection slave;
show slave status;
stop slave;
sleep 5;
start slave;
sync_with_master;
lock tables foo read, foo2 read;
show master status;
show slave status;
unlock tables;
[15 Mar 2007 14:49]
Valeriy Kravchuk
What I missed in that test case is the fact that last show master status; was executed on slave and refered to slave's binary log: start slave; lock tables foo read, foo2 read; show master status; File Position Binlog_Do_DB Binlog_Ignore_DB slave-bin.000001 1141 But still: openxs@suse:~/dbs/5.0/mysql-test> ../bin/mysqlbinlog var/log/slave-bin.000001 /*!40019 SET @@session.max_insert_delayed_threads=0*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #070315 6:57:30 server id 2 end_log_pos 98 Start: binlog v 4, server v 5.0. 38-log created 070315 6:57:30 at startup ROLLBACK/*!*/; ... # at 956 #070315 6:57:31 server id 1 end_log_pos 1048 Query thread_id=2 exec_tim e=5 error_code=0 SET TIMESTAMP=1173934651/*!*/; insert into foo values (1559)/*!*/; # at 1048 #070315 6:57:31 server id 1 end_log_pos 1141 Query thread_id=2 exec_tim e=5 error_code=0 SET TIMESTAMP=1173934651/*!*/; insert into foo2 values (1559)/*!*/; # at 1141 #070315 6:57:36 server id 2 end_log_pos 1160 Stop DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; I do not see COMMIT in that position in slave's binary log.
[15 May 2007 5:30]
Mats Kindahl
This bug was fixed with the patch for BUG#27171.
