Bug #31076 Server crashes when start slave is issued
Submitted: 18 Sep 2007 6:45 Modified: 8 Oct 2007 12:02
Reporter: Adam Donnison Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.1.22-rc OS:Linux (FC4)
Assigned to: CPU Architecture:Any

[18 Sep 2007 6:45] Adam Donnison
Description:
Upgraded from 5.1.20 to 5.1.22, and found the server was unstable with obvious crashes, and couldn't complete the mysql_upgrade.  Restarted the server with --skip-slave-start and --skip-grant-tables and re-ran mysql_upgrade which worked without a problem.  Then tried to issue START SLAVE; and immediately the server crashes and mysqld_safe restarts it.  This happens each time START SLAVE is issued.

Portion of error log:

070918  6:10:44 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000311' at position 164170623, relay log './relay-bin.000145' position: 5130
070918  6:10:44 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=536870912
read_buffer_size=131072
max_used_connections=1
max_threads=512
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1641286 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x822004f0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x82332f88, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81eb577
0x8b68a5
0x82eebb2
0x82ef9d4
0xd13bd4
0x9154fe
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/refman/5.1/en/resolve-stack-dump.html
and follow instructions on how to resolve the stack trace.
Resolved stack trace is much more helpful in diagnosing the
problem, so please do resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil)  is invalid pointer
thd->thread_id=24
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
070918 06:10:44 mysqld_safe Number of processes running now: 0
070918 06:10:44 mysqld_safe mysqld restarted

How to repeat:
Start server with --skip-slave-start, once started issue START SLAVE with outstanding updates queued at the master.
[18 Sep 2007 6:53] Valeriy Kravchuk
Please, send a resolved stack trace. What version of MySQL is used on master?
[18 Sep 2007 7:08] Adam Donnison
Resolved stack trace for slave server 1:

0x81eb577 handle_segfault + 535
0x8db402 (?)
0x923678 (?)
0x957230 (?)
0x95d1e0 (?)
0x95d72b (?)
0x84e8f35 my_no_flags_free + 21
0x82913cd _ZN19Table_map_log_eventD0Ev + 29
0x82ee9c5 _Z20exec_relay_log_eventP3THDP17st_relay_log_info + 373
0x82ef9d4 handle_slave_sql + 996
0xdcdbd4 (?)
0x9c44fe (?)
New value of fp=(nil) failed sanity check, terminating stack trace!

Resolved stack trace for slave server 2:

0x81eb577 handle_segfault + 535
0x8b68a5 (?)
0x82eebb2 _Z20exec_relay_log_eventP3THDP17st_relay_log_info + 866
0x82ef9d4 handle_slave_sql + 996
0xd13bd4 (?)
0x9154fe (?)

Master is running 5.1.20-beta
[18 Sep 2007 9:03] Valeriy Kravchuk
Please, send statement that is in the binary log/relay log in these positions:

'master-bin.000311' at position 164170623, relay log './relay-bin.000145' position: 5130
[18 Sep 2007 10:21] Adam Donnison
relay-bin.000145 doesn't exist, the current relay log file is relay-bin.000416.

master-bin.000311 at position 164170623:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 164170623
# at 164170679
#7918 3:59:2 server id 436  end_log_pos 164170679       Table_map: `track`.`visi
ts_events` mapped to number 35084185
#7918 3:59:2 server id 436  end_log_pos 164170769       Write_rows: table id 350
84185 flags: STMT_END_F

BINLOG '
Bk3vRhO0AQAAOAAAALcLyQkAAJlXFwIAAAAABXRyYWNrAA12aXNpdHNfZXZlbnRzAAYJAwcPDwM=
Bk3vRhe0AQAAWgAAABEMyQkQAJlXFwIAAAEABv/AIE4AvvVDAQZN70YAK0Rvd25sb2Fkcy9NeVNR
TC00LjEvbXlzcWwtNC4xLjEyYS13aW4zMi56aXBPaAIC
'/*!*/;
# at 164170769
#7918 3:59:2 server id 436  end_log_pos 164170797       Intvar
SET INSERT_ID=21231039/*!*/;
# at 164170797
#7918 3:59:2 server id 436  end_log_pos 164171293       Query   thread_id=282788
24      exec_time=0     error_code=0
use track/*!*/;
SET TIMESTAMP=1190087942/*!*/;
[19 Sep 2007 6:18] Adam Donnison
Table structure for table affected in replication issue:

CREATE TABLE `visits` (
  `visits_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `myid` varchar(32) NOT NULL DEFAULT '',
  `src` varchar(64) NOT NULL DEFAULT '',
  `ip` int(10) unsigned NOT NULL DEFAULT '0',
  `cc` char(2) NOT NULL DEFAULT '',
  `org` varchar(80) DEFAULT NULL,
  `ref` varchar(255) NOT NULL DEFAULT '',
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `host` varchar(30) NOT NULL DEFAULT '',
  `entry` varchar(255) NOT NULL DEFAULT '',
  `visit_exit` varchar(255) NOT NULL DEFAULT '',
  `user_id` int(11) unsigned NOT NULL DEFAULT '0',
  `visit_start` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`visits_id`),
  KEY `ip` (`ip`),
  KEY `time` (`time`),
  KEY `user_id` (`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=21293381 DEFAULT CHARSET=latin1;
[20 Sep 2007 9:50] Mats Kindahl
I verified the bug with the help of Adam.
[20 Sep 2007 9:51] Mats Kindahl
The crash is caused by an uninitialized variable.
[20 Sep 2007 9:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34421

ChangeSet@1.2594, 2007-09-20 11:58:49+02:00, mats@kindahl-laptop.dnsalias.net +1 -0
  BUG#31076 (Server crashes when start slave is issued):
  
  Row-based replication crashes when replicating from pre-5.1.22 to 5.1.22
  due to an uninitialized variable.
[20 Sep 2007 14:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34433

ChangeSet@1.2594, 2007-09-20 16:31:05+02:00, mats@kindahl-laptop.dnsalias.net +3 -0
  BUG#31076 (Server crashes when start slave is issued):
  
  Row-based replication crashes when replicating from pre-5.1.22 to 5.1.22
  due to an uninitialized variable.
[22 Sep 2007 14:09] Joerg Bruehe
The fix is included in the latest build of 5.1.22-rc.
[24 Sep 2007 1:38] Adam Donnison
Latest build still crashes on starting replication.

Resolved stack trace:

0x81eb577 handle_segfault + 535
0x8ae6f2 (?)
0x84e8f55 my_no_flags_free + 21
0x82913ed _ZN19Table_map_log_eventD0Ev + 29
0x82ee9e5 _Z20exec_relay_log_eventP3THDP17st_relay_log_info + 373
0x82ef9f4 handle_slave_sql + 996
0xd13bd4 (?)
0x9154fe (?)

I've attached a hexdump binlog of the area surrounding the reported slave position.
[24 Sep 2007 1:39] Adam Donnison
Hex binlog of more recent crash.

Attachment: binlog_2.gz (application/x-gzip, text), 1.86 KiB.

[24 Sep 2007 8:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34502

ChangeSet@1.2595, 2007-09-24 10:50:57+02:00, mats@kindahl-laptop.dnsalias.net +1 -0
  BUG#31076 (Server crashes when start slave is issued):
  Second patch to initailize more uninitialized variables.
[5 Oct 2007 17:56] Bugs System
Pushed into 5.1.23-beta
[8 Oct 2007 12:02] Jon Stephens
Thank you for your bug report. This issue has already been fixed in the latest released version of that product, which you can download at

  http://www.mysql.com/downloads/

Documented in 5.1.22 changelog as:

          Row-based replication from a pre-5.1.22 MySQL Server to a
          MySQL 5.1.22 was unstable due to an uninitialized variable.
[30 Oct 2007 17:24] Kevin Benton
I know this is a closed issue, however, it begs the question - how good is code coverage for MySQL today?  I would have hoped that this would have been caught by a code coverage test.