Bug #70564 future_group_master_log_pos not set properly
Submitted: 9 Oct 2013 15:40 Modified: 17 Feb 2014 17:29
Reporter: Inaam Rana (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.5 OS:Any
Assigned to: CPU Architecture:Any

[9 Oct 2013 15:40] Inaam Rana
Description:
Query_log_event::do_apply_event() has following:

  /*
    InnoDB internally stores the master log position it has executed so far,
    i.e. the position just after the COMMIT event.
    When InnoDB will want to store, the positions in rli won't have
    been updated yet, so group_master_log_* will point to old BEGIN
    and event_master_log* will point to the beginning of current COMMIT.
    But log_pos of the COMMIT Query event is what we want, i.e. the pos of the
    END of the current log event (COMMIT). We save it in rli so that InnoDB can
    access it.
  */
  const_cast<Relay_log_info*>(rli)->future_group_master_log_pos= log_pos;

However, log_pos at this point is still pointing to the start of the event. It appears COMMIT is not a Query_log_event. Instead, it is treated as Xid_log_event. I have checked that for RBR at least.

How to repeat:
see code

Suggested fix:
Add following to the Xid_log_event::do_apply_event():

--- a/sql/log_event.cc
+++ b/sql/log_event.cc
@@ -5692,6 +5692,19 @@ int Xid_log_event::do_apply_event(Relay_log_info const *rli)
   /* For a slave Xid_log_event is COMMIT */
   general_log_print(thd, COM_QUERY,
                     "COMMIT /* implicit, from Xid_log_event */");
+
+  /*
+    InnoDB internally stores the master log position it has executed so far,
+    i.e. the position just after the COMMIT event.
+    When InnoDB will want to store, the positions in rli won't have
+    been updated yet, so group_master_log_* will point to old BEGIN
+    and event_master_log* will point to the beginning of current COMMIT.
+    But log_pos of the COMMIT Query event is what we want, i.e. the pos of the
+    END of the current log event (COMMIT). We save it in rli so that InnoDB can
+    access it.
+  */
+  const_cast<Relay_log_info*>(rli)->future_group_master_log_pos= log_pos;
[9 Oct 2013 15:44] Inaam Rana
This bug is probably not much of an issue for stock msqyl as future_group_master_log_pos is not really used. But for those who are using it, like twitter, where it is stored in trx_sys_header page on the slave as master binlog position upto which SQL thread has already worked, it results in always being one event behind the intended position i.e.: when we commit an event we want to store the end co-ordinates of that event instead of start co-ordinates.

I understand that in 5.6 with crash save slaves this may not be an issue as storing master binlog position becomes redundant but for 5.5 it is still a valid concern.
[10 Oct 2013 11:06] MySQL Verification Team
Hello Inaam,

Thank you for the bug report and contribution.

Thanks,
Umesh
[22 Oct 2013 17:45] Luis Soares
Test Case
---------

I used this very small test case to investigate this issue.

--source include/master-slave.inc
CREATE TABLE t1 (a INT) Engine=InnoDB;
INSERT INTO t1 VALUES (1);
DROP TABLE t1;
--source include/rpl_end.inc

Context
-------

Some context to begin with.

Xid log events are logged for transactions that modify XA capable
engines (on recovery, the binlog is read to collect the xids and then
ha_recover taking as argument the xids read).  (see
TC_LOG_BINLOG::recover).

When the slave applies this event it commits the current
transaction. (Xid_log_event::do_apply_event calls trans_commit, which
calls ha_commit_trans).

Analysis
--------

Looking at the code, one finds that future_group_master_log_pos is
updated for every:

  1. Query_log_event
  2. Load_log_event::do_apply_event
  3. Execute_load_log_event::do_apply_event

The future_group_master_log_pos holds the position of the last event
that caused a transaction commit. The position it stores is the end
position of the event. Further explanation can be found on rpl_rli.h.

Although not entirely relevant to this case, the list of events
that advance the future_group_master_log_pos contain two more events:
Load_log_event and Execute_load_log_event. Apparently these also
update the position since NDB will autocommit such statements.
More details at:
http://dev.mysql.com/doc/refman/5.5/en/implicit-commit.html
http://bugs.mysql.com/bug.php?id=11151

Observations
------------

O1. The Xid_log_event::do_apply_event does not update the
    future_group_master_log_pos as do the other events that also cause
    a transaction commit.

O2. log_pos is correct and not pointing to the start of the event,
    but to the end.

    "log_pos" is the position of the end of the event that is being
    processed. It is part of each event's common header event. As part
    of the common header, every event contains this information.

    I double checked and inspecting the value of log_pos for the test
    case given above, I can find that log_pos is the end position of
    the Xid_log_event, thus it marks the beginnig of the DROP TABLE's
    Query_log_event:

    mysqlbinlog shows this:

    # at 363
    #131022 18:16:54 server id 1  end_log_pos 390 	Xid = 125
    COMMIT/*!*/;
    # at 390
    #131022 18:16:54 server id 1  end_log_pos 494
	Query	thread_id=6	exec_time=0	error_code=0
    SET TIMESTAMP=1382462214/*!*/;
    DROP TABLE `t1` /* generated by server */

    Having a close look at the execution flow using gdb:

    (gdb) b Xid_log_event::do_apply_event
    (gdb) c
    (gdb) p log_pos
    $1 = 390

    I also double checked for Query_log_event:

    Breakpoint 2, Query_log_event::do_apply_event 
    (this=0x7fffd400b380, rli=0x1a6acb0, query_arg=0x7fffd4010ea9 
    "DROP TABLE `t1` /* generated by server */", q_len_arg=41)
    at /home/lsoares/Workspace/bzr/work/bugfixing/17587618/mysql-5.5/sql/log_event.cc:3263
    3263    {
    (gdb) p log_pos
    $1 = 494

    and with mysqlbinlog:

    # at 390
    #131022 18:31:01 server id 1  end_log_pos 494
	Query	thread_id=6	exec_time=0	error_code=0
    SET TIMESTAMP=1382463061/*!*/;
    DROP TABLE `t1` /* generated by server */

    Notice that log_pos (494) matches end_log_pos (494).

Conclusions
-----------

C1. The log_pos is correct in the Xid_log_event and Query_log_event.

C2. Xid_log_event seems to be missing an update to
    future_group_master_log_pos.
[17 Feb 2014 17:29] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

Fixed in MySQL 5.7.4. The fix for this issues removes dead code and thus has no user-facing effects to document.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html