Bug #34058 | InnoDB recovery prints wrong master binlog position | ||
---|---|---|---|
Submitted: | 25 Jan 2008 15:21 | Modified: | 27 Jan 2009 3:30 |
Reporter: | Baron Schwartz (Basic Quality Contributor) | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 5.0.40, 5.0.60sp1 | OS: | Linux (Gentoo) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | qc |
[25 Jan 2008 15:21]
Baron Schwartz
[25 Jan 2008 15:38]
Heikki Tuuri
Baron, I am afraid MySQL AB developers removed the code that stores the MASTER binlog position of a SLAVE in InnoDB's data files. And we have forgotten to remove the printf() from the InnoDB crash recovery code. I hope this one is still correct: InnoDB: Last MySQL binlog file position 0 589600615, file name ./galax-bin.001376 ? I hope not too many users will suffer from the removal of this feature. Vasil, please check if the MASTER binlog position storage of a SLAVE has been removed in 5.0, and remove the corresponding InnoDB printf() if that is the case. Or was it removed only in 5.1? Regards, Heikki
[25 Jan 2008 15:49]
Baron Schwartz
That is very disappointing. This is an extremely useful feature if it works :-( The binlog position is still correct, though.
[25 Jan 2008 16:02]
Peter Zaitsev
Heikki, Can you tell what version this was ? Was it actually intensional or accidental ? There are VERY many cases when it is extremely helpful. For example it is about only way to deal with Slave which has power failure (so its master.info file is out of sync and relay logs probably trashed)
[27 Jan 2008 15:40]
Baron Schwartz
I can't see when this change was committed. The most relevant message on the commits list I see is this: http://lists.mysql.com/commits/11693 But that one just removes code that was already commented out.
[27 Jan 2008 19:48]
Mark Callaghan
We put that functionality back in the Google patch. It is enabled when rpl_transaction_enabled=1 is set.
[28 Jan 2008 16:32]
Heikki Tuuri
Either 4.0 or 4.1 did have this code. The removal may be associated with the XA changes to the code. Vasil is studying this now.
[29 Jan 2008 16:40]
Mark Callaghan
The code that existed in 4.0 used innobase_commit_low to set the master log offset and filename in the InnoDB tx log. That doesn't catch all operations that commit to InnoDB because innobase_commit_low is a no-op for some calls from DDL code in ha_innodb.cc. The problem is that this code causes an early return from innobase_commit_low: if (trx->conc_state == TRX_NOT_STARTED) { return; } Callers where it does nothing are (from the 5.0.37 code I use now): ha_innobase::delete_table innobase_drop_database ha_innobase::rename_table
[29 Jan 2008 16:58]
Vasil Dimov
Mark, That code still exists in 5.0, but (as reported here by Baron) something is broken...
[29 Jan 2008 17:07]
Baron Schwartz
This email thread on internals is about this bug: http://lists.mysql.com/internals/35290
[30 Jan 2008 14:33]
Vasil Dimov
I set up a replication with latest 5.0 from BK (5.0.56) and deployed a call to trx_sys_print_mysql_master_log_pos() in the beginning of innobase_commit_low() on the slave server for diagnostic purposes. Then (autocommit is on by default): on the master: insert into t values (7); on the slave observed: InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 4352, file name mysql-bin.000003 on the master: ./bin/mysqlbinlog --start-position=4352 var/mysql-bin.000003 /*!40019 SET @@session.max_insert_delayed_threads=0*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4352 #080130 16:04:40 server id 1 end_log_pos 4379 Xid = 61 COMMIT/*!*/; # at 4379 #080130 16:13:32 server id 1 end_log_pos 4466 Query thread_id=3 exec_time=0 error_code=0 use test/*!*/; SET TIMESTAMP=1201702412/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1/*!*/; SET @@session.sql_mode=0/*!*/; /*!\C cp1251 *//*!*/; SET @@session.character_set_client=51,@@session.collation_connection=51,@@session.collation_server=8/*!*/; insert into t values (7)/*!*/; # at 4466 #080130 16:13:32 server id 1 end_log_pos 4493 Xid = 65 COMMIT/*!*/; DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; I.e. it looks correct. It is also correct if I kill -9 the mysql slave server and restart it. As Mark said, that value is not updated on the slave when (for example) DROP TABLE is executed, in that case the last printed value is printed again. I have a few questoins: * Baron, do you have any idea why there is so huge difference between the printed and the actual value in your setup? You did not run thousands of DROP TABLE, did you? :) * Mark, what do you mean by "We put that functionality back in the Google patch."? Can you point which code was present, was removed and you did put back? Thanks!
[30 Jan 2008 14:56]
Mark Callaghan
The Google patch for MySQL 4.0 and 5.0 is at http://code.google.com/p/google-mysql-tools This does not yet include fixes to also update the replication offset during DDL (drop database, drop table, ...). We use this for transactional replication on the slave. During crash recovery it updates relay-log.info to match the offset stored in the innodb tx log. This does a bit more than the original use, so many more files have changed.
[30 Jan 2008 18:14]
Baron Schwartz
Vasil, I don't think the DROP TABLE etc should be the problem :-) I don't know why the difference was so large. I assumed the log byte offset was just from a variable that was never initialized, and was never correct. (It looked to me like the code that set it was commented out completely).
[1 Feb 2008 23:04]
Peter Zaitsev
Vasil, In my experience it is sometimes correct though in most cases it does not. This actually makes it only worse :)
[6 Feb 2008 17:33]
Baron Schwartz
Interesting: I just initialized a new server, this time from one of the masters, which used to be a slave itself in the distant past. This time, it says: InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 214, file name usa-bin.000698 InnoDB: Last MySQL binlog file position 0 1060018864, file name ./mysql-bin.000060 But usa hasn't been this server's master since March of last year :-)
[4 Mar 2008 17:44]
Heikki Tuuri
Hmm... how can we preserve this functionality in 5.1, where all communication between MySQL and the engine must go through the plugin interface? Vasil, is the DROP TABLE bug in 5.0 easy to fix?
[2 May 2008 16:42]
Gordon Shannon
This is happening to me in 5.1.24-community, and it's a very serious issue. In my case, the message is similar to Baron's -- the file is correct but the file position is an absurdly low number -- 541, which is in the middle the 2nd item in the log. We use Innodb Hot Backup to take backups of a slave, then use that backup to recover the slave, either the one backed up, or a new one on another server. I know of no other way to set the restored slave on the right position, other than using this line in the error log on startup. InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 541, file name mysql-bin.000521 Is there a workaround? E.g. some output text by innodb hot backup?? If not, then I suggest that this is a S1 critical bug, since it prevents us from database recovery! Thanks!
[2 May 2008 17:01]
Baron Schwartz
A possibly relevant section of code, in sql/ha_innodb.cc in the 5.0.45 source: 1718 /* TODO: put the 1719 MySQL-4.1 functionality back to 5.0. This is needed to get InnoDB Hot Backup 1720 to work. */ 1721 1722 /********************************************************************* 1723 This is called when MySQL writes the binlog entry for the current 1724 transaction. Writes to the InnoDB tablespace info which tells where the 1725 MySQL binlog entry for the current transaction ended. Also commits the 1726 transaction inside InnoDB but does NOT flush InnoDB log files to disk. 1727 To flush you have to call innobase_commit_complete(). We have separated 1728 flushing to eliminate the bottleneck of LOCK_log in log.cc which disabled 1729 InnoDB's group commit capability. */ 1730 1731 int 1732 innobase_report_binlog_offset_and_commit( 1733 /*=====================================*/ 1734 /* out: 0 */ 1735 THD* thd, /* in: user thread */ 1736 void* trx_handle, /* in: InnoDB trx handle */ 1737 char* log_file_name, /* in: latest binlog file name */ 1738 my_off_t end_offset) /* in: the offset in the binlog file 1739 up to which we wrote */ 1740 { 1741 trx_t* trx; 1742 1743 trx = (trx_t*)trx_handle; 1744 1745 ut_a(trx != NULL); 1746 1747 trx->mysql_log_file_name = log_file_name; 1748 trx->mysql_log_offset = (ib_longlong)end_offset; 1749 1750 trx->flush_log_later = TRUE; 1751 1752 innobase_commit(thd, TRUE); 1753 1754 trx->flush_log_later = FALSE; 1755 1756 return(0); 1757 }
[2 May 2008 17:11]
Heikki Tuuri
Gordon, Baron, I think InnoDB-5.0 and 5.1 print the binlog position right (except for some error if a DROP TABLE happens at the right place). At least, in Vasil's recent test they worked ok. But they do NOT print right the master binlog position of a replication slave. That functionality worked in 4.x. Regards, Heikki
[2 May 2008 17:34]
Gordon Shannon
Yes I believe Heikki is correct, and it's good to be clear about this. There are 3 relevant lines in the error log when you first start the db on recovery: InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 541, file name mysql-bin.000521 InnoDB: Last MySQL binlog file position 0 975, file name ./mysql-bin.000008 When recovering a master backup, there is no problem because line #3 is indeed correct -- it refers to its own binlog position. The problem is only on replication slaves, since the position, and maybe file, in line #2 are wrong. It would be great if Innodb Hot Backup could deduce this information and output it! (hint, hint, hint :-), then I wouldn't have to rely on this. Please correct me if I'm wrong. This means our strategy of using replication slaves for backups is compromised, as recovery is impossible. We must instead use only hot backups made from the master.
[2 May 2008 17:43]
Baron Schwartz
Just to clarify, as Peter noted, knowing the master binlog position that the slave's data corresponds to is very, very useful. It's a real shame this doesn't work anymore, because it eliminates all sorts of neat things you used to be able to do.
[17 Jul 2008 10:34]
Vasil Dimov
Can someone give a short summary how to repeat this?
[17 Jul 2008 10:50]
Vasil Dimov
If I run a master/slave, have some activity on the master and kill -9 the slave, then stop the activity on the master, then start the slave I get this: --- cut --- $ ./libexec/mysqld --socket /tmp/bug34058/mysql-5.0-s/sock --port 3307 --server-id 2 --relay-log=hag-relay-bin InnoDB: Log scan progressed past the checkpoint lsn 0 54773 080717 13:45:26 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Doing recovery: scanned up to log sequence number 0 55065 080717 13:45:26 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 18994, file name m-bin.000002 080717 13:45:26 InnoDB: Started; log sequence number 0 55065 080717 13:45:26 [Note] ./libexec/mysqld: ready for connections. Version: '5.0.68' socket: '/tmp/bug34058/mysql-5.0-s/sock' port: 3307 Source distribution 080717 13:45:26 [Note] Slave SQL thread initialized, starting replication in log 'm-bin.000002' at position 19021, relay log './hag-relay-bin.000017' position: 3489 080717 13:45:26 [Note] Slave I/O thread: connected to master 'root@localhost:3306', replication started in log 'm-bin.000002' at position 19021 --- cut --- If "InnoDB: position 0 18994, file name m-bin.000002" is wrong, then why don't you ignore it and use "starting replication in log 'm-bin.000002' at position 19021", i.e. use 19021 instead of 18994?
[18 Jul 2008 11:56]
Vasil Dimov
The value printed by InnoDB during recovery (18994) is read from the InnoDB files (sys_header + TRX_SYS_MYSQL_MASTER_LOG_INFO). The value printed by MySQL code (19021) is read from var/master.info. If the slave crashes these can end up to be different.
[21 Jul 2008 10:51]
Vasil Dimov
The value, printed by InnoDB can also be different from the value printed by MySQL if you are changing MyISAM tables. The value printed by InnoDB is updated on the disk only at commit (in trx_commit_off_kernel()).
[22 Jul 2008 12:25]
Vasil Dimov
I am unable to reproduce a large difference in the numbers printed by InnoDB and MySQL like the one reported by Baron: 115 vs. 643703376. I can get small differences when the slave crashes. The difference being that InnoDB-printed log position corresponds to the last COMMIT statement seen by InnoDB while MySQL-printed log position may correspond to the next statement or a few statements after it. Baron, I assume you are not running one HUGE transaction that creates 600k binlog entries? You also explicitly mentioned that you are using only InnoDB tables, so the reason is not that you have a lot of activity on e.g. MyISAM tables. I used 5.0.68, can someone please give me some hint on how to reproduce this problem?
[22 Jul 2008 12:28]
Vasil Dimov
A note to my previous comment: Baron did not say that 643703376 is printed by MySQL, he said: "The REAL position, as I found by examining the binary log and master.info after recovery, and looking at the log events, is portland-bin.001717, position 643703376." Anyway, if 643703376 is in master.info then MySQL would printing during startup.
[22 Jul 2008 17:47]
Gordon Shannon
From my perspective, the problem occurs when doing a restore of an innodb hot backup, where there is no master.info file. In prior versions, I could rely on the startup text "In a MySQL replication slave, the last master binlog file position 0 999999..." to correctly position the recovered slave via change master.
[23 Jul 2008 17:28]
Vasil Dimov
Gordon, I can do similar thing with the latest MySQL 5.0.68: * I setup a replication and made some activity on the master * "kill -9"d the slave (instead of taking backup with hotbackup) * removed var/master.info on the slave server * started the slave and read "InnoDB: position 0 20392, file name m-bin.000001" during recovery * issued CHANGE MASTER ... MASTER_LOG_POS=20392; START SLAVE; it went without errors or any subsequent problems. Any hints on how to repeat the problem are welcome. Could this be some Hot Backup-only related issue? Setting to "Can't repeat" as I think it is more appropriate than "Need feedback".
[23 Jul 2008 17:33]
Vasil Dimov
One possible explanation: replication is not running at all. Then InnoDB prints some old position.
[23 Jul 2008 17:36]
Vasil Dimov
OR The position was indeed 115 when the backup was taken. Then it advanced to 643703376. When Baron tries to restore the backup, then InnoDB sees 115?
[23 Jul 2008 17:44]
Gordon Shannon
I'm running on 5.1.24. I'll leave it to Baron to respond from his perspective. I only chimed in because it seemed like the same issue.
[23 Jul 2008 17:59]
Vasil Dimov
Hmm, I never tried 5.1.x as the reported issue is against 5.0.40. Things may be different in 5.1, I will probably try that one too to see how it behaves... Thanks!
[2 Oct 2008 13:27]
Valeriy Kravchuk
Vasil, Had you tried to repeat with recent 5.1.x? We also have some reports claiming that 5.0.44 is also affected...
[14 Oct 2008 2:20]
Adam Dixon
cmds to reproduce
Attachment: 34058_reproduce.txt (text/plain), 3.23 KiB.
[14 Oct 2008 2:23]
Adam Dixon
This is reproducable in 5.0.44 and 5.0.60sp1, see the previous attachment for full process I used in my vm. Basically; ## Have Master setup, ## Run infinite (loopy insert of data perl script) for a while ## Take LVM Snapshot, mount and copy datafiles to create a slave. ## Get the slave started, and run infinite script some more on the master. ## Then kill -9 mysql on slave ## Then restart mysql on slave with skip-slave-start and check the output from mysql error log. Checking SHOW MASTER STATUS and SHOW SLAVE STATUS just before kill-9 of slave, then cat the slave.err; [root@localhost mysql]# mysql -u root -e "SHOW MASTER STATUS;"; +-------------------+----------+--------------+------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | +-------------------+----------+--------------+------------------+ | master-bin.000001 | 4160333 | | | +-------------------+----------+--------------+------------------+ [root@localhost mysql]# mysql -u root -S /tmp/mysql4406.sock -e 'SHOW SLAVE STATUS\G'; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: root Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000001 Read_Master_Log_Pos: 4160333 Relay_Log_File: localhost-relay-bin.000002 Relay_Log_Pos: 2079085 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 4160333 Relay_Log_Space: 2079085 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 [root@localhost mysql]# kill -9 `cat /usr/local/mysqlslave/localhost.localdomain.pid` [root@localhost mysql]# bin/mysqld_safe: line 388: 4345 Killed nohup /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysqlslave/ --user=mysql --pid-file=/usr/local/mysqlslave//localhost.localdomain.pid --skip-external-locking --port=4406 --socket=/tmp/mysql4406.sock --server-id\=2 --skip-slave-start --default-storage-engine\=innodb >> /usr/local/mysqlslave/slave.err 2>&1 Number of processes running now: 0 081014 23:07:55 mysqld restarted [root@localhost mysql]# kill `pidof perl`; [2]- Terminated perl /root/34058issue/infinite_insert.pl [root@localhost mysql]# [root@localhost mysql]# [root@localhost mysql]# [root@localhost mysql]# cat /usr/local/mysqlslave/slave.err 081014 23:05:38 mysqld started InnoDB: The log sequence number in ibdata files does not match InnoDB: the log sequence number in the ib_logfiles! 081014 23:05:38 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Last MySQL binlog file position 0 2081484, file name ./master-bin.000001 081014 23:05:38 InnoDB: Started; log sequence number 0 6599325 081014 23:05:38 [Note] /usr/local/mysql/bin/mysqld: ready for connections. Version: '5.0.60sp1-enterprise-gpl' socket: '/tmp/mysql4406.sock' port: 4406 MySQL Enterprise Server (GPL) 081014 23:06:23 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=localhost-relay-bin' to avoid this problem. 081014 23:06:43 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:3306', replication started in log 'master-bin.000001' at position 2081484 081014 23:06:43 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000001' at position 2081484, relay log './localhost-relay-bin.000001' position: 4 Number of processes running now: 0 081014 23:07:55 mysqld restarted InnoDB: Log scan progressed past the checkpoint lsn 0 11935295 081014 23:07:55 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Doing recovery: scanned up to log sequence number 0 12840023 081014 23:07:55 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 4194385, file name master-bin.000001 InnoDB: Last MySQL binlog file position 0 2081484, file name ./master-bin.000001 081014 23:07:56 InnoDB: Started; log sequence number 0 12840023 081014 23:07:56 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=localhost-relay-bin' to avoid this problem. 081014 23:07:56 [Note] /usr/local/mysql/bin/mysqld: ready for connections. Version: '5.0.60sp1-enterprise-gpl' socket: '/tmp/mysql4406.sock' port: 4406 MySQL Enterprise Server (GPL)
[14 Oct 2008 2:30]
Adam Dixon
If the above is not clear or quick, basically; just before the slave is kill -9; Master reports: | master-bin.000001 | 4160333 | Slave reports: Master_Log_File: master-bin.000001 Read_Master_Log_Pos: 4160333 kill -9 then innodb reports in err log; InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 4194385, file name master-bin.000001 InnoDB: Last MySQL binlog file position 0 2081484, file name ./master-bin.000001
[15 Oct 2008 13:05]
Vasil Dimov
Adam, I tried to reproduce what you observed. First there is one problem with your scenario, which is probably not relevant but anyway: 1. Run infinite perl script against server1 2. Take snapshot of server1's filesystem (the perl script is still running on server1 after the snapshot has been taken) 3. Create server2 from the snapshot files 4. Execute "CHANGE MASTER ... MASTER_LOG_POS=NNN;" on server2, where NNN is obtained by executing "SHOW MASTER STATUS" on server1 _after_ the snapshot has been taken (2.). This means that the position on server1 (and thus NNN) is greater than what server2 has in its files from the snapshot. You should obtain master position (SHOW MASTER STATUS) _before_ taking the snapshot. This is described in http://dev.mysql.com/doc/refman/5.0/en/replication-howto-existingdata.html in step 2: "If your server is already correctly configured, obtain the master status (see Section 15.1.1.4, Obtaining the Master Replication Information) AND THEN use mysqldump to take a snapshot". Now to the actual problem: * infinite_insert.pl is running against the master * get master and slave status * kill -9 the slave the result is this: from master: +-------------------+----------+--------------+------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | +-------------------+----------+--------------+------------------+ | master-bin.000003 | 2558594 | | | +-------------------+----------+--------------+------------------+ from slave: Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 2558594 when slave is restarted: InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 2558567, file name master-bin.000003 The position InnoDB prints is the position of the last committed transaction in InnoDB, so it is normal that it is smaller than the position reported by SHOW SLAVE STATUS just before the crash (kill -9) - InnoDB did not commit some transactions. When I look at your output I see that the position reported by InnoDB (4194385) is greater than the position reported by SHOW SLAVE STATUS before the crash (4160333). This can happen if some activity occurred after SHOW SLAVE STATUS and before kill -9. Provided that infinite_insert.pl is running at that moment, this is quite likely to happen. This is not a bug. Notice that the difference, reported originally by Baron, is huge: 115 vs 643703376.
[24 Oct 2008 13:08]
Heikki Tuuri
Adam, I agree with Vasil that the printout looks sensible after the kill -9. What is the precise problem in it? Regards, Heikki
[14 Nov 2008 0:47]
Luis Soares
Hi all, I did a quick investigation on this bug. * I performed some testing (with version 5.0.40 and 5.0.68): 1. Start master and create table: "CREATE TABLE t1 (a INT) engine=innodb" 2. Populate the table with 100 transactions (each with one insert). 3. Started the slave; 4. Connected the slave and let it apply master's binlog. 5. kill -9 the slave; 6. Restarted the slave; ** Here is what happened in version 5.0.40: - Some events written to the binlog showed small values for the end_log_pos (the transaction inserts); # at 18614 #081113 23:12:15 server id 1 end_log_pos 18682 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=1226617935/*!*/; BEGIN /*!*/; # at 18682 #081113 23:12:15 server id 1 end_log_pos 89 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=1226617935/*!*/; INSERT INTO t1 VALUES (99) /*!*/; # at 18771 #081113 23:12:15 server id 1 end_log_pos 18798 Xid = 208 COMMIT/*!*/; - Insert events were inside a BEGIN and COMMIT, and seemed that the end_log_pos was set as relative to the BEGIN event offset (always showed *89* as the end_log_pos); - The transaction BEGIN/COMMIT showed an absolute offset. - In the slave, the relay log presented the same offsets as the master binlog; - Once I killed and restarted the slave innodb would print (notice the *89* there): InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 89, file name gaia-bin.000001 ** Here is what happened in version 5.0.68: - All events written to master binary log had absolute position; # at 18614 #081113 23:57:55 server id 1 end_log_pos 18682 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=1226620675/*!*/; BEGIN /*!*/; # at 18682 #081113 23:57:55 server id 1 end_log_pos 18771 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=1226620675/*!*/; INSERT INTO t1 VALUES (99) /*!*/; # at 18771 #081113 23:57:55 server id 1 end_log_pos 18798 Xid = 208 COMMIT/*!*/; - The slave relay log presented the same type of offsets - Once I killed and restarted the slave the innodb would print (note that there is *no 89*): InnoDB: In a MySQL replication slave the last master binlog file InnoDB: position 0 18771, file name gaia-bin.000001 Just so you know, I went a little into the source code and found that: * The value (master_log_pos) that is injected into innodb comes from: rli->future_group_master_log_pos= log_pos; * This value is set in three different places in the log_event.cc: - Query_log_event::exec_event - Load_log_event::exec_event - Execute_load_log_event::exec_event * It is handed to innodb in the *ha_innodb.cc* in function *innobase_commit_low*: 1531 1532 if (thd && thd->slave_thread) { 1533 /* Update the replication position info inside InnoDB */ 1534 1535 trx->mysql_master_log_file_name 1536 = active_mi->rli.group_master_log_name; 1537 trx->mysql_master_log_pos = ((ib_longlong) 1538 active_mi->rli.future_group_master_log_pos); 1539 } * Innodb seems to pick this at trx0trx.c in function trx_commit_off_kernel: 805 806 if (trx->mysql_master_log_file_name[0] != '\0') { 807 /* This database server is a MySQL replication slave */ 808 trx_sys_update_mysql_binlog_offset( 809 trx->mysql_master_log_file_name, 810 trx->mysql_master_log_pos, 811 TRX_SYS_MYSQL_MASTER_LOG_INFO, &mtr); 812 } Finally, a few thoughts: 1. This might explain why Baron was getting the strange (small) offsets! 2. This could also explain why Vasil was not able to reproduce it (I think Vasil was using 5.0.68). 3. The strange (small) offsets issue seems fixed in the latest version of 5.0 .
[25 Nov 2008 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[24 Jan 2009 21:49]
Mats Kindahl
This bug is a duplicate of BUG#22540, which was fixed in 5.0.46.
[27 Jan 2009 3:30]
Baron Schwartz
Are you sure? I guess the code is pretty intricate here, but it doesn't really look like a duplicate to me. Can you explain how the code fixed in that other bug could cause the *massively* out of whack binlog positions I was seeing? I suppose it could be true, since the servers I originally noticed this bug on did a lot of really huge LOAD DATA INFILE.
[27 Jan 2009 9:38]
Mats Kindahl
Hi Baron! Yes, as far as I can tell, it is a duplicate. There are two ways that an event can make it to the binary log: either directly or through the transaction cache. For non-transactional tables, the practice has been that they go directly to the binary log or through the transaction cache (but see BUG#40116). Prior to the patch for BUG#22540, the binlog positions for an event were given relative the start of the IO_CACHE it was written to, meaning that when it was written to the transaction cache, the binlog position would be relative the start of the transaction (that is, usually quite small) and not relative the start of the file. The BUG#22540 fixes this by patching the binlog position of the events when writing the transaction cache to the binary log. However, this alone does not explain why the slave does not roll back the *transaction* when it is being restored from the snapshot, but note that AUTOCOMMIT=0 was used when writing the transaction on the master. This means that the transaction was written to the binary log without a BEGIN, just a COMMIT. Since the slave thread runs with AUTOCOMMIT=1 in 5.0 (a bug that is fixed in 5.1, where we replicate the AUTOCOMMIT flag), this means that each statement would be executed as a separate transaction, so when the slave "crashed" (this is how it appears when the snapshot is restored), it would roll back to a position that was not a start of a transaction. This is fixed by the patch for BUG#26395. By fixing BUG#22540, the positions will be correct, and the new slave will be able to recover from the correct position, so in that sense, it fixes the problem described in this bug report. Note, however, that it will not start from the beginning of the transaction when recovering. Assuming that there are no crashes and that the slave is used in a pure scale-out scenario, the end result will be the same (the binary log is a serial history of the statements executed on the master, so it is already serialized correctly). However, intermediate states are still visible to other transactions and it is possible that a reader sees an inconsistent state. This is what BUG#26395 fixes.