MySQL Bugs: #82980: Multi-threaded slave leaks worker threads in case of thread create failure

Bug #82980	Multi-threaded slave leaks worker threads in case of thread create failure
Submitted:	14 Sep 2016 7:02	Modified:	14 Jul 2017 14:28
Reporter:	Laurynas Biveinis (OCA)	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S3 (Non-critical)
Version:	5.6	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	error handling, MTS, replication

Description:
If one adds SHOW PROCESSLIST to the end of rpl.mts_debug_slave, it will show an orphaned SQL worker thread that is there regardless that the slave is stopped. It is a thread created by mts_worker_thread_fails failure injection point testcase, which tries to set up MTS with two worker threads and simulates a pthread_create failure for one of them.

How to repeat:
The testcase reduces to

--source include/not_gtid_enabled.inc
--source include/master-slave.inc
--source include/have_debug.inc
--source include/have_innodb.inc

#
# Bug#12964649 MTS memory not released
#
# Originally reported when opt_slave_checkpoint_group is set to be greater
# than sizeof(BLOB) * 8 - 1 Workers initialization code fails.
# VALID_RANGE() for opt_slave_checkpoint_group is fixed in bug12979420.
#
# The current bug test simulates a failure to start by a Worker
# thread, checks the error, and verifies that all memory is released
# (valgrind proves, the user can watch top report).
#

connection slave;

call mtr.add_suppression('Error reading slave worker configuration');
call mtr.add_suppression('Failed during slave workers initialization');
call mtr.add_suppression('Failed during slave worker thread create');

source include/stop_slave.inc;
set @save.slave_checkpoint_group= @@global.slave_checkpoint_group;
set @save.slave_parallel_workers= @@global.slave_parallel_workers;

# The following assignment can work only with debug builds.
# It sets opt_slave_checkpoint_group to one more to its VALID_RANGE() maximum.
# set @@global.slave_checkpoint_group= 524280 + 1;

set @save.debug= @@global.debug;
SET @@global.debug= "d,mts_worker_thread_fails";
set @@global.slave_parallel_workers= 2;

start slave sql_thread;
--let $slave_sql_errno= 1593
source include/wait_for_slave_sql_error.inc;

set @@global.slave_checkpoint_group= @save.slave_checkpoint_group;
set @@global.slave_parallel_workers= @save.slave_parallel_workers;
set @@global.debug= @save.debug;

source include/start_slave.inc;

--source include/rpl_end.inc

SHOW PROCESSLIST;

Suggested fix:
Remove early exits from slave_stop_workers?

Hello Laurynas,

Thank you for the report.

Thanks,
Umesh

-- 5.6.33

SHOW PROCESSLIST;
Id      User    Host    db      Command Time    State   Info
2       root    localhost       test    Query   0       init    SHOW PROCESSLIST
6       root    localhost:41632 test    Sleep   1               NULL
7       root    localhost:41633 test    Sleep   1               NULL
8       root    localhost:41636 NULL    Binlog Dump     1       Master has sent all binlog to slave; waiting for binlog to be updated   NULL
main.rpl_bug82980                        [ pass ]   1008

-- 5.7.15

SHOW PROCESSLIST;
Id      User    Host    db      Command Time    State   Info
3       root    localhost       test    Query   0       starting        SHOW PROCESSLIST
7       root    localhost:42379 test    Sleep   1               NULL
8       root    localhost:42380 test    Sleep   1               NULL
9       root    localhost:42384 NULL    Binlog Dump     1       Master has sent all binlog to slave; waiting for more updates   NULL
main.rpl_bug82980                        [ fail ]  Found warnings/errors in server log file!
        Test ended at 2016-09-16 10:42:02
line
2016-09-16T08:42:01.394622Z 9 [ERROR] Failed during slave worker thread creation for channel '' (errno= 0)
^ Found warnings in /export/umesh/server/binaries/GABuilds/mysql-advanced-5.7.15/mysql-test/var/log/mysqld.2.err
ok

 - the logfile can be found in '/export/umesh/server/binaries/GABuilds/mysql-advanced-5.7.15/mysql-test/var/log/main.rpl_bug82980/rpl_bug82980.log'

Bug 82980 fix for 5.6 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug82980-5.6.patch (application/octet-stream, text), 822 bytes.

Bug 82980 fix for 5.7 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug82980-5.7.patch (application/octet-stream, text), 828 bytes.

Bug 82980 fix for 8.0 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug82980-8.0.patch (application/octet-stream, text), 842 bytes.

Hi Laurynas,

I tried reproducing this bug on MySQL-5.6 but couldn't, it passes all the time.
With MySQL-5.7 I could reproduce the error, with the test given by you.

This was the error in mysqld.2.err:

2017-03-21T09:41:43.359887Z 10 [ERROR] Failed during slave worker thread creation for channel '' (errno= 0)
2017-03-21T09:41:43.359974Z 10 [ERROR] Slave SQL for channel '': Fatal error: Failed during slave workers initialization, Error_code: 1593

But this error persists even with the patch you have submitted. I will try to look more closely at the errors.

Hello Neha -

I will review too and should get back to you soon

My "SHOW PROCESSLIST" command added in the original report is wrong, as it executes on the master instead of the slave.

With current public 5.6 (5.6.35), the reduced testcase shows the following PROCESSLIST on slave at the end:

SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
6	root	localhost:55768	test	Query	0	init	SHOW PROCESSLIST
7	root	localhost:55769	test	Sleep	0		NULL
9	system user		NULL	Connect	0	Waiting for an event from Coordinator	NULL

If the testcase is modified to comment out "SET @@global.debug= "d,mts_worker_thread_fails";", wait for slave SQL error, and a include/stop_slave.inc is added before final include/start_slave.inc, then the slave  processlist at the end becomes:

SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
6	root	localhost:55742	test	Query	0	init	SHOW PROCESSLIST
7	root	localhost:55743	test	Sleep	1		NULL

i.e. one less thread

And with the patch applied, the reduced testcase processlist has two threads:

SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
6	root	localhost:55881	test	Query	0	init	SHOW PROCESSLIST
7	root	localhost:55882	test	Sleep	1		NULL

But now I'm suspecting that the patch traded thread leak for a memory leak, checking.

Indeed, the patch causes

rpl.rpl_mts_debug 'mix'                  [ pass ]  54107
rpl.rpl_mts_debug 'row'                  [ skipped ]  Doesn't support --binlog-format='row'
rpl.rpl_mts_debug 'stmt'                 [ skipped ]  Doesn't support --binlog-format='statement'
worker[1] Valgrind report from /home/laurynas/obj-5.6-valgrind/mysql-test/var/log/mysqld.2.err after tests:
 rpl.rpl_mts_debug
--------------------------------------------------------------------------
HEAP SUMMARY:
    in use at exit: 68,054,661 bytes in 1,230 blocks
  total heap usage: 68,147 allocs, 66,917 frees, 368,180,952 bytes allocated

6 bytes in 1 blocks are possibly lost in loss record 1 of 468
   at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
   by 0xA3184B: my_malloc (my_malloc.c:38)
   by 0xA07841: Rpl_info_table::Rpl_info_table(unsigned int, char const*, char const*) (rpl_info_table.cc:28)
   by 0x9FD4FD: Rpl_info_factory::init_repositories(Rpl_info_factory::struct_table_data, Rpl_info_factory::struct_file_data, unsigned int, unsigned int, Rpl_info_handler**, Rpl_info_handler**, char const**) (rpl_info_factory.cc:849)
   by 0x9FC7D1: Rpl_info_factory::create_worker(unsigned int, unsigned int, Relay_log_info*, bool) (rpl_info_factory.cc:404)
   by 0x9E5E29: slave_start_single_worker(Relay_log_info*, unsigned long) (rpl_slave.cc:5806)
   by 0x9E6351: slave_start_workers(Relay_log_info*, unsigned long, bool*) (rpl_slave.cc:5932)
   by 0x9E6D61: handle_slave_sql (rpl_slave.cc:6139)
   by 0xE00DB0: pfs_spawn_thread (pfs.cc:1860)
   by 0x547E6C9: start_thread (pthread_create.c:333)
   by 0x60450AE: clone (clone.S:105)
...
131,072 bytes in 1 blocks are possibly lost in loss record 437 of 468
   at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
   by 0xA3184B: my_malloc (my_malloc.c:38)
   by 0xA0B167: init_dynamic_array2 (array.c:70)
   by 0x9FF5EF: Slave_worker::init_worker(Relay_log_info*, unsigned long) (rpl_rli_pdb.cc:282)
   by 0x9E5E6F: slave_start_single_worker(Relay_log_info*, unsigned long) (rpl_slave.cc:5813)
   by 0x9E6351: slave_start_workers(Relay_log_info*, unsigned long, bool*) (rpl_slave.cc:5932)
   by 0x9E6D61: handle_slave_sql (rpl_slave.cc:6139)
   by 0xE00DB0: pfs_spawn_thread (pfs.cc:1860)
   by 0x547E6C9: start_thread (pthread_create.c:333)
   by 0x60450AE: clone (clone.S:105)

LEAK SUMMARY:
   definitely lost: 0 bytes in 0 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 146,792 bytes in 16 blocks
   still reachable: 0 bytes in 0 blocks
                      of which reachable via heuristic:
                        newarray           : 392 bytes in 1 blocks
        suppressed: 67,907,869 bytes in 1,214 blocks

For counts of detected and suppressed errors, rerun with: -v
ERROR SUMMARY: 16 errors from 16 contexts (suppressed: 228 from 228)

valgrind_report                          [ fail ]

I will submit a fixed version once it passes review/testing here.

Here's an improved 5.6 patch. Pasting inline as unable to upload new contributions here due to the status of existing ones.

diff --git a/sql/rpl_slave.cc b/sql/rpl_slave.cc
index 4a2d1d4..0d9861c 100644
--- a/sql/rpl_slave.cc
+++ b/sql/rpl_slave.cc
@@ -5962,10 +5962,6 @@ void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
 {
   int i;
   THD *thd= rli->info_thd;
-  if (!*mts_inited)
-    return;
-  else if (rli->slave_parallel_workers == 0)
-    goto end;
 
   /*
     If request for stop slave is received notify worker
@@ -6025,7 +6021,10 @@ void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
     mysql_mutex_unlock(&w->jobs_lock);
   }
 
-  if (thd->killed == THD::NOT_KILLED)
+  if (!*mts_inited)
+    return;
+
+  if (rli->slave_parallel_workers != 0 && thd->killed == THD::NOT_KILLED)
     (void) mts_checkpoint_routine(rli, 0, false, true/*need_data_lock=true*/); // TODO:consider to propagate an error out of the function
 
   for (i= rli->workers.elements - 1; i >= 0; i--)
@@ -6035,7 +6034,7 @@ void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
     delete_dynamic_element(&rli->workers, i);
     delete w;
   }
-  if (log_warnings > 1)
+  if (rli->slave_parallel_workers != 0 && log_warnings > 1)
     sql_print_information("Total MTS session statistics: "
                           "events processed = %llu; "
                           "worker queues filled over overrun level = %lu; "
@@ -6049,7 +6048,6 @@ void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
   DBUG_ASSERT(rli->pending_jobs == 0);
   DBUG_ASSERT(rli->mts_pending_jobs_size == 0);
 
-end:
   rli->mts_group_status= Relay_log_info::MTS_NOT_IN_GROUP;
   destroy_hash_workers(rli);
   delete rli->gaq;

5.7 version

diff --git a/sql/rpl_slave.cc b/sql/rpl_slave.cc
index fd00f1a..3e9dcad 100644
--- a/sql/rpl_slave.cc
+++ b/sql/rpl_slave.cc
@@ -6770,6 +6770,8 @@ int slave_start_single_worker(Relay_log_info *rli, ulong i)
 err:
   if (error && w)
   {
+    if (w->current_mts_submode)
+      delete w->current_mts_submode;
     delete w;
     /*
       Any failure after array inserted must follow with deletion
@@ -6896,11 +6898,6 @@ void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
 {
   THD *thd= rli->info_thd;
 
-  if (!*mts_inited)
-    return;
-  else if (rli->slave_parallel_workers == 0)
-    goto end;
-
   /*
     If request for stop slave is received notify worker
     to stop.
@@ -6985,7 +6982,10 @@ void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
     mysql_mutex_unlock(&w->jobs_lock);
   }
 
-  if (thd->killed == THD::NOT_KILLED)
+  if (!*mts_inited)
+    return;
+
+  if (rli->slave_parallel_workers != 0 && thd->killed == THD::NOT_KILLED)
     (void) mts_checkpoint_routine(rli, 0, false, true/*need_data_lock=true*/); // TODO:consider to propagate an error out of the function
 
   while (!rli->workers.empty())
@@ -7020,7 +7020,6 @@ void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
   DBUG_ASSERT(rli->pending_jobs == 0);
   DBUG_ASSERT(rli->mts_pending_jobs_size == 0);
 
-end:
   rli->mts_group_status= Relay_log_info::MTS_NOT_IN_GROUP;
   destroy_hash_workers(rli);
   delete rli->gaq;

A 8.0.0 delta patch, to be applied on the top of the previous 8.0.0 patch. It should be identical to the 5.7 version, give or take line offsets

diff --git a/sql/rpl_slave.cc b/sql/rpl_slave.cc
index a440efd..375a043 100644
--- a/sql/rpl_slave.cc
+++ b/sql/rpl_slave.cc
@@ -6767,6 +6767,8 @@ static int slave_start_single_worker(Relay_log_info *rli, ulong i)
 err:
   if (error && w)
   {
+    if (w->current_mts_submode)
+      delete w->current_mts_submode;
     delete w;
     /*
       Any failure after array inserted must follow with deletion
@@ -6981,10 +6983,8 @@ static void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
 
   if (!*mts_inited)
     return;
-  else if (rli->slave_parallel_workers == 0)
-    goto end;
 
-  if (thd->killed == THD::NOT_KILLED)
+  if (rli->slave_parallel_workers != 0 && thd->killed == THD::NOT_KILLED)
     (void) mts_checkpoint_routine(rli, 0, false, true/*need_data_lock=true*/); // TODO:consider to propagate an error out of the function
 
   while (!rli->workers.empty())
@@ -7019,7 +7019,6 @@ static void slave_stop_workers(Relay_log_info *rli, bool *mts_inited)
   DBUG_ASSERT(rli->pending_jobs == 0);
   DBUG_ASSERT(rli->mts_pending_jobs_size == 0);
 
-end:
   rli->mts_group_status= Relay_log_info::MTS_NOT_IN_GROUP;
   destroy_hash_workers(rli);
   delete rli->gaq;

Bug 82980 fix for 8.0.1

Attachment: bug82980-8.0.1.patch (application/octet-stream, text), 1.82 KiB.

Posted by developer:
 
Changelog entry added for MySQL 8.0.3, 5.7.20, and 5.6.38:
        In case of a failure while creating multiple slave applier
        worker threads, some threads would be left orphaned and their
        resources would not be collected. Thanks to Laurynas Biveinis
        for his contribution to fixing this bug.