Bug #17716 Slave crash in net_clear on qnx
Submitted: 25 Feb 2006 9:13 Modified: 27 Feb 2006 13:16
Reporter: Magnus Blaudd
Status: Closed
Category:Server Severity:S2 (Serious)
Version:5.0.19 OS:Other (QNX 6.2)
Assigned to: Magnus Blaudd Target Version:

[25 Feb 2006 9:13] Magnus Blaudd
Description:
rpl000001                      [ fail ]

Errors are (from
/home/mysqldev/pb/mysql-5.0/push-paul@snake-hub.snake.net-20060223212135.info/mysql-5.0.19-standard/mysql-test/var/log/mysqltest-time)
:
mysqltest: At line 15: query 'stop slave' failed: 2013: Lost connection to MySQL server
during query
(the last lines may be the most important ones)

Ending Tests
Shutting-down MySQL daemon

Master(s) shutdown finished
Slave(s) shutdown finished
Resuming Tests

How to repeat:
Run ./mysql-test-run.pl on buildqnx2

Suggested fix:
Most likely caused by the fix for bug#2845.
[25 Feb 2006 9:16] Magnus Blaudd
Compiled a debug build on buildqnx2 from the latest distribution produced by pushbuild.
The trace files show that the slave crashes in net_clear. Debugging...

var/log/slave.log:
T@5    : | | | >vio_is_blocking
T@5    : | | | | exit: 0
T@5    : | | | <vio_is_blocking
T@5    : | | | >vio_read_buff
T@5    : | | | | enter: sd: 36, buf: 0x99a7018, size: 4
T@5    : | | | | >vio_read
T@5    : | | | | | enter: sd: 36, buf: 0x9997018, size: 16384
T@5    : | | | | | exit: 11
T@5    : | | | | <vio_read
T@5    : | | | <vio_read_buff
T@5    : | | | packet_header: Memory: 0x99a7018  Bytes: (4)
T@5    : | | | >vio_read_buff
T@5    : | | | | enter: sd: 36, buf: 0x99a7018, size: 7
T@5    : | | | <vio_read_buff
T@5    : | | | exit: Mysql handler: 9945c28
T@5    : | | <mysql_real_connect
T@5    : | | >my_b_flush_io_cache
T@5    : | | | >my_write
T@5    : | | | | my: Fd: 4  Buffer: 0x86fbff8  Count: 43  MyFlags: 20
T@5    : | | | <my_write
T@5    : | | <my_b_flush_io_cache
T@5    : | | exit: slave_was_killed: 0
T@5    : | <connect_to_master
T@5    : | >sql_print_information
T@5    : | | >vprint_msg_to_log
T@5    : | | | >print_buffer_to_file
T@5    : | | | | enter: buffer: Slave I/O thread: connected to master
'root@127.0.0.1:10170',  replication started in log 'FIRST' at position 4
T@5    : | | | <print_buffer_to_file
T@5    : | | <vprint_msg_to_log
T@5    : | <sql_print_information
T@5    : | >my_malloc
T@5    : | | my: size: 108  my_flags: 24
T@5    : | | exit: ptr: 0x86e1db0
T@5    : | <my_malloc
T@5    : | >my_malloc
T@5    : | | my: size: 18  my_flags: 32
T@5    : | | exit: ptr: 0x86e2e40
T@5    : | <my_malloc
T@5    : | >mysql_real_query
T@5    : | | enter: handle: 9945c28
T@5    : | | query: Query = 'SELECT UNIX_TIMESTAMP()'
T@5    : | | >mysql_send_query
T@5    : | | | enter: rpl_parse: 0  rpl_pivot: 1
T@5    : | | <mysql_send_query
T@5    : | | >cli_advanced_command
T@5    : | | | >net_clear
[25 Feb 2006 16:59] Magnus Blaudd
The compile-time constant FD_SETSIZE needs to be defined before we include
<sys/select.h>, because the size of the bit array type "fd_set" is calculated from it.
When it is not defined it defaults to 32, which gives a 32-bit array, so as soon as we do
a FD_SET(fd, &sfds) where fd is 32 or higher, it writes outside the variable. In the
trace file above the fd (or sd, as it is called there) was 36, and thus we were writing
outside the bit array.
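
To make the arithmetic concrete, here is a minimal sketch of the mechanism described above. It is a toy model, not the QNX header and not the committed patch: every name in it (TOY_FD_SETSIZE, TOY_WORD_BITS, toy_fd_set, toy_word_count, toy_word_index, toy_fd_set_checked) is hypothetical, and the 32-bit word layout is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the header's behaviour: if the includer has not
   defined the size beforehand, fall back to a default of 32 descriptors. */
#ifndef TOY_FD_SETSIZE
#define TOY_FD_SETSIZE 32
#endif

/* Assumed layout: one bit per descriptor, packed into 32-bit words. */
#define TOY_WORD_BITS 32

/* The bit array is sized at compile time from TOY_FD_SETSIZE. */
typedef struct {
    uint32_t bits[(TOY_FD_SETSIZE + TOY_WORD_BITS - 1) / TOY_WORD_BITS];
} toy_fd_set;

/* Number of 32-bit words actually allocated inside the set. */
size_t toy_word_count(void)
{
    return sizeof(toy_fd_set) / sizeof(uint32_t);
}

/* Index of the word that would hold the bit for descriptor fd (fd >= 0). */
size_t toy_word_index(int fd)
{
    return (size_t)fd / TOY_WORD_BITS;
}

/* Bounds-checked variant of setting a bit: returns 0 on success, or -1 when
   the bit for fd would land outside the array. An unchecked version would
   write past the end of the set instead. */
int toy_fd_set_checked(int fd, toy_fd_set *set)
{
    if (fd < 0 || fd >= TOY_FD_SETSIZE)
        return -1;
    set->bits[toy_word_index(fd)] |= (uint32_t)1 << (fd % TOY_WORD_BITS);
    return 0;
}
```

With the default size of 32 the set holds a single word, while descriptor 36 maps to word index 1, so an unchecked write lands past the end of the variable. Building the same sketch with -DTOY_FD_SETSIZE=1024 sizes the array at 32 words, and descriptor 36 then fits.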
[25 Feb 2006 17:02] Magnus Blaudd
http://www.qnx.com/developers/docs/momentics621_docs/neutrino/lib_ref/s/select.html
[27 Feb 2006 9:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/3172
[27 Feb 2006 13:16] Magnus Blaudd
Pushed to 5.0.19