Bug #22993 | Master hangs in SSL replication when the slave runs out of disk space | ||
---|---|---|---|
Submitted: | 4 Oct 2006 19:59 | Modified: | 13 Jul 2007 16:58 |
Reporter: | Harrison Fisk | | |
Status: | Duplicate | | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.0.42 | OS: | Linux (Linux 2.6.15 (Ubuntu 6.06 LTS)) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | bfsm_2007_03_01 |
[4 Oct 2006 19:59]
Harrison Fisk
[4 Oct 2006 20:12]
Harrison Fisk
While I was debugging the issue I got the following two backtraces; I am not sure whether either of them is helpful:

(gdb) bt
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7ee09f8 in send () from /lib/tls/i686/cmov/libpthread.so.0
#2 0x083f00a4 in yaSSL::Socket::send (this=0x8c3512c, buf=0x8c6aee8 "\027\003\001", sz=181, flags=0) at socket_wrapper.cpp:122
#3 0x083e59d1 in yaSSL::SSL::Send (this=0x8c347c8, buffer=0x8c6aee8 "\027\003\001", sz=181) at yassl_int.cpp:1013
#4 0x083ef5fc in yaSSL::sendData (ssl=@0x8c347c8, buffer=0x8c63030, sz=142) at handshake.cpp:892
#5 0x083d8685 in yaSSL_write (ssl=0x8c347c8, buffer=0x8c63030, sz=142) at ssl.cpp:211
#6 0x0839a852 in vio_ssl_write (vio=0x8c4d0b0, buf=0x8c63030 "\212", size=142) at viossl.c:104
#7 0x08172fa0 in net_real_write (net=0x8c4bf0c, packet=0x8c63030 "\212", len=142) at net_serv.cc:608
#8 0x081729f5 in net_flush (net=0x8c4bf0c) at net_serv.cc:333
#9 0x082715b7 in mysql_binlog_send (thd=0x8c4b700, log_ident=0x8c6cb20 "hfisk-desktop-bin.000008", pos=4057, flags=0) at sql_repl.cc:574
#10 0x0819069a in dispatch_command (command=COM_BINLOG_DUMP, thd=0x8c4b700, packet=0x8c63031 "", packet_length=35) at sql_class.h:725
#11 0x0818f903 in do_command (thd=0x8c4b700) at sql_parse.cc:1538
#12 0x0818ec72 in handle_one_connection (arg=0xfffffe00) at sql_parse.cc:1175
#13 0xb7edb341 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#14 0xb7e084ee in clone () from /lib/tls/i686/cmov/libc.so.6

(gdb) bt
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7ee02ae in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7edcfbb in _L_mutex_lock_33 () from /lib/tls/i686/cmov/libpthread.so.0
#3 0xbfb11308 in ?? ()
#4 0x00000010 in ?? ()
#5 0x083b49bb in safe_mutex_lock (mp=0x85ed520, file=0x841a4b3 "mysql_priv.h", line=1534) at thr_mutex.c:116
#6 0x0816bc13 in THD (this=0x8c67040) at mysql_priv.h:1534
#7 0x0817e25b in handle_connections_sockets (arg=0x0) at sql_list.h:421
#8 0x0817d824 in main (argc=2, argv=0xbfb11574) at mysqld.cc:3523
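Editorial note: the first backtrace ends in send(), which is where a blocking write waits when the peer has stopped reading and the socket buffers are full. The sketch below only illustrates that generic mechanism with a local socket pair. It is not MySQL or yaSSL code, the function name fill_until_blocked is hypothetical, and it does not claim to reproduce the root cause of this bug.

```python
import socket


def fill_until_blocked(chunk_size=4096, max_bytes=64 * 1024 * 1024):
    """Write to a peer that never reads.

    Returns (bytes_accepted, would_block). The writer is non-blocking so the
    demo can report the stall; a blocking socket would simply hang in send()
    at the same point.
    """
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    chunk = b"x" * chunk_size
    sent = 0
    try:
        while sent < max_bytes:
            try:
                # The reader never calls recv(), so the kernel buffers fill up.
                sent += writer.send(chunk)
            except BlockingIOError:
                return sent, True
        # Safety cap reached without stalling (not expected in practice).
        return sent, False
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    accepted, blocked = fill_until_blocked()
    print("bytes accepted before stall:", accepted, "would block:", blocked)
```

The number of bytes accepted before the stall depends on the platform's socket buffer sizes, so no specific figure should be expected.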
[15 Feb 2007 21:30]
Harrison Fisk
This bug also appears to be triggered when the slave runs out of relay log space because of relay_log_space_limit. You can reproduce it very easily by setting relay_log_space_limit to a small value and then stopping the slave SQL thread (SQL_THREAD).
[5 Jun 2007 15:59]
Damien Katz
I've been unable to reproduce this manually. I'm checking to see if a test case can be written to reproduce the problem.
[5 Jun 2007 17:45]
Harrison Fisk
Test case to reproduce the hang
Attachment: rpl_ssl_hang.test (application/octet-stream, text), 1.24 KiB.
[5 Jun 2007 17:45]
Harrison Fisk
opt file for test case
Attachment: rpl_ssl_hang-slave.opt (application/octet-stream, text), 29 bytes.
[5 Jun 2007 17:50]
Harrison Fisk
I have uploaded a mysql-test case to this issue. I ran it against MySQL 5.0.42 as:

/usr/local/mysql/mysql-test$ ./mysql-test-run rpl_ssl_hang

In another shell, I then connected to the master database and listed the processlist repeatedly with:

mysqladmin -i 1 -P 9306 -h 127.0.0.1 pro

After a while, the display would freeze and stop producing output. The time until the freeze varied from about 60 seconds to 130 seconds, but it always happened for me. At that point the process on the master would be 'Writing to net' and the slave would be 'Waiting for the slave SQL thread to free enough relay log space'.
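Editorial note: the freeze described above was spotted by watching the mysqladmin output stop. A minimal sketch of automating that check is to run a single probe command with a timeout and treat a timeout as a hang. The helper name responds_within and the 5 second timeout are assumptions for illustration; the host and port in the usage comment are the ones quoted in the report.

```python
import subprocess


def responds_within(cmd, timeout_s):
    """Run cmd once and report whether it finished within timeout_s seconds.

    Returns True if the command completed (regardless of its exit code) and
    False if it had to be killed because it did not return in time.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
        return True
    except subprocess.TimeoutExpired:
        return False


# Hypothetical usage against the test master from the report:
#   responds_within(
#       ["mysqladmin", "-P", "9306", "-h", "127.0.0.1", "processlist"], 5.0)
```

Calling this in a loop once per second would mirror the manual `mysqladmin -i 1` check, with a False result marking the moment the master stopped answering.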
[27 Jun 2007 20:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

http://lists.mysql.com/commits/29784

ChangeSet@1.2507, 2007-06-27 16:46:23-04:00, dkatz@damien-katzs-computer.local +1 -0
  Bug #22993 Master hangs in SSL replication when the slave runs out of disk space
  Removed unused close_notify "alert" that was causing hangs when the connection was paused or slow.
[13 Jul 2007 16:58]
Damien Katz
Duplicate of Bug #29579 Clients using SSL can hang the server