Bug #85371 Memory leak in multi-source replication when binlog_rows_query_log_events=1
Submitted: 8 Mar 2017 10:46 Modified: 24 Nov 2017 8:42
Reporter: Przemyslaw Malkowski
Status: Closed
Category: MySQL Server: Replication Severity: S2 (Serious)
Version: 5.7.17 OS: Any
Assigned to: CPU Architecture: Any
Tags: multi source replication

[8 Mar 2017 10:46] Przemyslaw Malkowski
Description:
The slave SQL thread leaks memory when replicating over two channels from a master-master pair. Disabling binlog_rows_query_log_events appears to avoid the problem.

Restarting the SQL thread does not release the memory.

Tested on MySQL Community 5.7.17.

The test case is based on the multi_source.sh script, which creates four MSR nodes; only three are needed to reproduce, with multi-source replication enabled on one of them. The more writes, the higher the memory usage on the third node.

Credit to Trey Raymond, who first noticed the leak as part of bug #85034.

How to repeat:
$ ./multi_source.sh mysql5.7.17 mysql ALL-MASTERS
$ cd multi_msb_mysql5_7_17

$ ./n1 -e "stop slave for channel 'node3'; reset slave all for channel 'node3';"
$ ./n1 -e "stop slave for channel 'node4'; reset slave all for channel 'node4';"
$ ./n2 -e "stop slave for channel 'node3'; reset slave all for channel 'node3';"
$ ./n2 -e "stop slave for channel 'node4'; reset slave all for channel 'node4';"
$ ./n3 -e "stop slave for channel 'node4'; reset slave all for channel 'node4';"
$ ./n4 -e "stop slave for channel 'node3'; reset slave all for channel 'node3';"
$ ./n4 -e "stop slave for channel 'node2'; reset slave all for channel 'node2';"
$ ./n4 -e "stop slave for channel 'node1'; reset slave all for channel 'node1';"

$ for i in {1..3}; do echo "binlog_rows_query_log_events=1" >> node$i/my.sandbox.cnf ; done
$ for i in {1..3}; do echo "performance-schema-instrument='memory%=ON' " >> node$i/my.sandbox.cnf ; done
$ for i in {1..3}; do echo "log_slave_updates=1" >> node$i/my.sandbox.cnf ; done
$ ./restart_all

$ ./use_all "SELECT * FROM performance_schema.replication_connection_status\G";
# server: 1: 
*************************** 1. row ***************************
             CHANNEL_NAME: node2
               GROUP_NAME: 
              SOURCE_UUID: 00014419-2222-2222-2222-222222222222
                THREAD_ID: 25
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 1
 LAST_HEARTBEAT_TIMESTAMP: 2017-03-08 05:13:55
 RECEIVED_TRANSACTION_SET: 
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE: 
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
# server: 2: 
*************************** 1. row ***************************
             CHANNEL_NAME: node1
               GROUP_NAME: 
              SOURCE_UUID: 00014418-1111-1111-1111-111111111111
                THREAD_ID: 26
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 3
 LAST_HEARTBEAT_TIMESTAMP: 2017-03-08 05:13:56
 RECEIVED_TRANSACTION_SET: 
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE: 
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
# server: 3: 
*************************** 1. row ***************************
             CHANNEL_NAME: node1
               GROUP_NAME: 
              SOURCE_UUID: 00014418-1111-1111-1111-111111111111
                THREAD_ID: 25
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 3
 LAST_HEARTBEAT_TIMESTAMP: 2017-03-08 05:13:57
 RECEIVED_TRANSACTION_SET: 
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE: 
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 2. row ***************************
             CHANNEL_NAME: node2
               GROUP_NAME: 
              SOURCE_UUID: 00014419-2222-2222-2222-222222222222
                THREAD_ID: 27
            SERVICE_STATE: ON
COUNT_RECEIVED_HEARTBEATS: 3
 LAST_HEARTBEAT_TIMESTAMP: 2017-03-08 05:13:57
 RECEIVED_TRANSACTION_SET: 
        LAST_ERROR_NUMBER: 0
       LAST_ERROR_MESSAGE: 
     LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
# server: 4: 

$ ./use_all "select * from sys.memory_global_total";
# server: 1: 
total_allocated
324.63 MiB
# server: 2: 
total_allocated
324.63 MiB
# server: 3: 
total_allocated
324.70 MiB

$ ps aux|grep multi_msb_mysql5_7_17|grep .sock|awk '{print "vsz: "$5/1024" MB","rss: "$6/1024" MB",$19}'
vsz: 1471.08 MB rss: 172.984 MB --port=14418
vsz: 1471.05 MB rss: 172.984 MB --port=14419
vsz: 1471.07 MB rss: 173.238 MB --port=14420

$ ./n1 -e "create database sbtest1;"

$ sysbench --num-threads=16 --max-requests=300000 --db-driver=mysql --test=oltp --mysql-table-engine=InnoDB --mysql-db=sbtest1 --mysql-user=msandbox --mysql-password=msandbox --oltp-table-size=1000000 --mysql-socket=/tmp/mysql_sandbox14418.sock --test=/usr/share/doc/sysbench/tests/db/oltp.lua prepare

-- wait till replication catches up

$ ps aux|grep multi_msb_mysql5_7_17|grep .sock|awk '{print "vsz: "$5/1024" MB","rss: "$6/1024" MB",$19}'
vsz: 2882.89 MB rss: 346.254 MB --port=14418
vsz: 1727.05 MB rss: 301.184 MB --port=14419
vsz: 2932.16 MB rss: 1513.23 MB --port=14420

$ ./use_all "select * from sys.memory_global_total";
# server: 1: 
total_allocated
361.95 MiB
# server: 2: 
total_allocated
335.64 MiB
# server: 3: 
total_allocated
1.35 GiB

Suggested fix:
Avoid memory leak.
[8 Mar 2017 10:49] Przemyslaw Malkowski
I forgot to add the sysbench "run" step after the prepare.
[9 Mar 2017 13:10] Umesh Shastry
Hello Przemyslaw,

Thank you for the report and test case.
Verified as described with 5.7.17 build.

Thanks,
Umesh
[9 Mar 2017 13:11] Umesh Shastry
test results

Attachment: 85371.5.7.17.results (application/octet-stream, text), 17.33 KiB.

[12 Apr 2017 6:24] Wang Wen'an
We have a similar problem in multi-source replication (with two masters), and we are not sure whether "set global binlog_rows_query_log_events = OFF" works (it is set to OFF now and we are waiting for results) : (

Checked in performance_schema with "select * from memory_summary_by_thread_by_event_name order by CURRENT_NUMBER_OF_BYTES_USED desc limit 5\G" after enabling the memory instruments.

Multi-source replication started after restarting mysql (read-only):
*************************** 2. row ***************************
                   THREAD_ID: 18189
                  EVENT_NAME: memory/sql/Log_event
                 COUNT_ALLOC: 521692
                  COUNT_FREE: 0
   SUM_NUMBER_OF_BYTES_ALLOC: 117988604
    SUM_NUMBER_OF_BYTES_FREE: 0
              LOW_COUNT_USED: 111993
          CURRENT_COUNT_USED: 521692
             HIGH_COUNT_USED: 521692
    LOW_NUMBER_OF_BYTES_USED: 25286276
CURRENT_NUMBER_OF_BYTES_USED: 117988604
   HIGH_NUMBER_OF_BYTES_USED: 117988604
*************************** 3. row ***************************
                   THREAD_ID: 18183
                  EVENT_NAME: memory/sql/Log_event
                 COUNT_ALLOC: 521426
                  COUNT_FREE: 0
   SUM_NUMBER_OF_BYTES_ALLOC: 117732632
    SUM_NUMBER_OF_BYTES_FREE: 0
              LOW_COUNT_USED: 111583
          CURRENT_COUNT_USED: 521426
             HIGH_COUNT_USED: 521426
    LOW_NUMBER_OF_BYTES_USED: 25154914
CURRENT_NUMBER_OF_BYTES_USED: 117732632
   HIGH_NUMBER_OF_BYTES_USED: 117732632

two hours later:
*************************** 1. row ***************************
                   THREAD_ID: 18189
                  EVENT_NAME: memory/sql/Log_event
                 COUNT_ALLOC: 2297022
                  COUNT_FREE: 0
   SUM_NUMBER_OF_BYTES_ALLOC: 525744164
    SUM_NUMBER_OF_BYTES_FREE: 0
              LOW_COUNT_USED: 111993
          CURRENT_COUNT_USED: 2297022
             HIGH_COUNT_USED: 2297022
    LOW_NUMBER_OF_BYTES_USED: 25286276
CURRENT_NUMBER_OF_BYTES_USED: 525744164
   HIGH_NUMBER_OF_BYTES_USED: 525744164
*************************** 2. row ***************************
                   THREAD_ID: 18183
                  EVENT_NAME: memory/sql/Log_event
                 COUNT_ALLOC: 2296412
                  COUNT_FREE: 0
   SUM_NUMBER_OF_BYTES_ALLOC: 524600639
    SUM_NUMBER_OF_BYTES_FREE: 0
              LOW_COUNT_USED: 111583
          CURRENT_COUNT_USED: 2296412
             HIGH_COUNT_USED: 2296412
    LOW_NUMBER_OF_BYTES_USED: 25154914
CURRENT_NUMBER_OF_BYTES_USED: 524600639
   HIGH_NUMBER_OF_BYTES_USED: 524600639

Thread info in sys.session:
mysql> select thd_id,conn_id,user,command,state,current_memory from session where thd_id in (18183,18189)\G
*************************** 1. row ***************************
        thd_id: 18183
       conn_id: 18158
          user: sql/slave_sql
       command: Sleep
         state: Slave has read all relay log; waiting for more updates
current_memory: 532.28 MiB
*************************** 2. row ***************************
        thd_id: 18189
       conn_id: 18164
          user: sql/slave_sql
       command: Sleep
         state: Slave has read all relay log; waiting for more updates
current_memory: 533.50 MiB
2 rows in set (0.10 sec)
[3 Aug 2017 14:14] Vlad Lesin
The following analysis is for 5.7:

1) When binlog_rows_query_log_events is enabled, a Rows_query_log_event
   is created and written to the binary log on the master just before the
   corresponding Rows_log_event.
    
2) The slave reads this event from the relay log and applies it,
   see Rows_query_log_event::do_apply_event(). When the event is applied,
   the event itself is stored in Relay_log_info::rows_query_ev:
    
    int Rows_query_log_event::do_apply_event(Relay_log_info const *rli)
    {
      DBUG_ENTER("Rows_query_log_event::do_apply_event");
      DBUG_ASSERT(rli->info_thd == thd);
      /* Set query for writing Rows_query log event into binlog later.*/
      thd->set_query(m_rows_query, strlen(m_rows_query));
    
      DBUG_ASSERT(rli->rows_query_ev == NULL);
    
      const_cast<Relay_log_info*>(rli)->rows_query_ev= this;
      /* Tell worker not to free the event */
      worker= NULL;
      DBUG_RETURN(0);
    }
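
    The ownership handoff above (worker= NULL tells the applier framework not to
    free the event) can be modelled with a minimal standalone sketch; the types
    below are hypothetical stand-ins, not the real server classes:

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a Rows_query_log_event: tracks how many instances are live.
struct Event {
    static int live;          // number of events not yet freed
    Event()  { ++live; }
    ~Event() { --live; }
};
int Event::live = 0;

// Stand-in for Relay_log_info: parks the event between the Rows_query
// event and the terminating Rows event.
struct RelayLogInfo {
    Event *rows_query_ev = nullptr;
};

// Mirrors Rows_query_log_event::do_apply_event(): park the event,
// transferring ownership to the relay-log info.
void apply_rows_query_event(RelayLogInfo *rli, Event *ev) {
    assert(rli->rows_query_ev == nullptr);  // same invariant as the DBUG_ASSERT
    rli->rows_query_ev = ev;                // cleanup is now deferred
}

// Mirrors rows_event_stmt_cleanup(): delete the parked event.
void rows_event_stmt_cleanup(RelayLogInfo *rli) {
    delete rli->rows_query_ev;
    rli->rows_query_ev = nullptr;
}
```

    If the deferred cleanup step is never reached, the parked event stays
    allocated, which is exactly the leak described in step 4 below.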
    
3) Usually Relay_log_info::rows_query_ev is deleted after Rows_log_event
   execution, with the following stack trace:
    
        at ./sql/log_event.h:3795
        at ./sql/rpl_rli.cc:1777
        at ./sql/log_event.cc:11175
        at ./sql/rpl_slave.cc:4750
    
   If we look into Rows_log_event::do_apply_event(), the function
   rows_event_stmt_cleanup() is called at the end:
    
    int Rows_log_event::do_apply_event(Relay_log_info const *rli)
    {
    ...
    end:
      if (get_flags(STMT_END_F))
      {
        if((error= rows_event_stmt_cleanup(rli, thd)))
        {
          ...
        }
        ...
      }
    ...
    }
    
4) But when the statement must be skipped, because it has already been
   applied from another channel, Rows_log_event::do_apply_event() exits
   before the rows_event_stmt_cleanup() call. See the following code:
    
    int Rows_log_event::do_apply_event(Relay_log_info const *rli)
    {
    ...
        enum_gtid_statement_status state= gtid_pre_statement_checks(thd);
        if (state == GTID_STATEMENT_EXECUTE)
        {
          if (gtid_pre_statement_post_implicit_commit_checks(thd))
            state= GTID_STATEMENT_CANCEL;
        }
    
        if (state == GTID_STATEMENT_CANCEL)
        {
          uint error= thd->get_stmt_da()->mysql_errno();
          DBUG_ASSERT(error != 0);
          rli->report(ERROR_LEVEL, error,
                      "Error executing row event: '%s'",
                      thd->get_stmt_da()->message_text());
          thd->is_slave_error= 1;
          DBUG_RETURN(-1);
        }
        else if (state == GTID_STATEMENT_SKIP)
          DBUG_RETURN(0);
    ...
    end:
      if (get_flags(STMT_END_F))
      {
        if((error= rows_event_stmt_cleanup(rli, thd)))
        {
          ...
        }
        ...
      }
    ...
    }
    
   So in this case rli->rows_query_ev will not be deleted. In a debug build
   this leads to a crash due to the corresponding assert in
   Rows_query_log_event::do_apply_event():
    
    int Rows_query_log_event::do_apply_event(Relay_log_info const *rli)
    {
    ...
      DBUG_ASSERT(rli->rows_query_ev == NULL);
    ...
    }
    
   In a release build there is just a memory leak.
    
   How to fix:

   Delete the Rows_query_log_event instance if the next Rows_log_event
   instance read from the relay log must be skipped.
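
   A minimal standalone sketch of that fix, using hypothetical stand-in types
   rather than the actual server code (the contributed patches attached to this
   bug are the authoritative versions): on the skip path, free the pending
   event before the early return.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in types for the sketch; not the real server classes.
struct Event {
    static int live;          // number of events not yet freed
    Event()  { ++live; }
    ~Event() { --live; }
};
int Event::live = 0;

struct RelayLogInfo {
    Event *rows_query_ev = nullptr;
};

enum GtidStatementStatus { GTID_STATEMENT_EXECUTE, GTID_STATEMENT_SKIP };

// Sketch of Rows_log_event::do_apply_event() with the fix applied: when
// the GTID checks say the statement must be skipped, delete the pending
// Rows_query_log_event instead of returning with it still allocated.
int apply_rows_event(RelayLogInfo *rli, GtidStatementStatus state) {
    if (state == GTID_STATEMENT_SKIP) {
        // The fix: release the event parked by the preceding
        // Rows_query_log_event before the early return.
        delete rli->rows_query_ev;
        rli->rows_query_ev = nullptr;
        return 0;
    }
    // ... normal row application; on this path rows_event_stmt_cleanup()
    // deletes rli->rows_query_ev at statement end (STMT_END_F).
    delete rli->rows_query_ev;
    rli->rows_query_ev = nullptr;
    return 0;
}
```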
[3 Aug 2017 14:18] Vlad Lesin
Delete the Rows_query_log_event instance if the next Rows_log_event instance read from the relay log must be skipped.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: lp-1667307.patch (text/x-patch), 6.09 KiB.

[5 Aug 2017 18:16] Laurynas Biveinis
Bug 85371 fix for 8.0.2

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug85371-8.0.2.patch (application/octet-stream, text), 9.34 KiB.

[24 Nov 2017 8:42] Margaret Fisher
Posted by developer:
 
Changelog entry added for MySQL 5.7.21 and 8.0.4:
In a multi-source replication topology, a memory leak could occur on the slave when binlog_rows_query_log_events was enabled on the master, and a statement already applied from another channel was skipped on the slave. In this situation, the instance of the Rows_query log event stored on the slave was not being deleted. The log event instance is now cleaned up and the memory is freed. Thanks to Vlad Lesin for his contribution to the patch.
[28 Nov 2017 6:30] Venkatesh Venugopal
Bug#86248 has been marked as a duplicate of this bug.