Bug #33943 SEGV fault replication thd lock assert failure thd: 0xc746b00
Submitted: 19 Jan 2008 22:20 Modified: 14 Mar 2008 5:27
Reporter: Ole Bjrn Hessen Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.1.22-rc OS:Linux (Red Hat Enterprise Linux Server release 5 (Tikanga))
Assigned to: CPU Architecture:Any
Tags: mysqld, replication, server

[19 Jan 2008 22:20] Ole Bjrn Hessen
Description:
Master - slave setup on Linux x86_64 RHEL 5 on 5.1.22-rc.
Replication server dies with SEGV.
Both intel-cc binary downloaded, and source compiled self with gcc.
Tested on a quad Xeon and dual Xeon.

The mysqld-debug binary says assert error:
mysqld-debug: sql_base.cc:4171: int lock_tables(THD *, TABLE_LIST *, unsigned int, bool *): Assertion `thd->lock == 0' failed.
080119 22:53:45 - mysqld got signal 6;

The mysqld binary says:
 mysqld got signal 11;
thd: 0xc746b00
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xb, backtrace may not be correct.
Bogus stack limit or frame pointer, fp=0xb, stack_bottom=0x44710000, thread_stack=262144, aborting backtrace.
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil)  is invalid pointer
thd->thread_id=1302

debug binary breakpoint

Program received signal SIGABRT, Aborted.
0x00000036bf430015 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00000036bf430015 in raise () from /lib64/libc.so.6
#1  0x00000036bf431980 in abort () from /lib64/libc.so.6
#2  0x00000036bf429726 in __assert_fail () from /lib64/libc.so.6
#3  0x000000000070a028 in lock_tables (thd=0x2aae2c000a30, tables=0x2aae2c007980, count=1, need_reopen=0x450c7768) at sql_base.cc:4171
#4  0x0000000000775a4a in mysql_update (thd=0x2aae2c000a30, table_list=0x2aae2c007980, fields=@0x2aae2c0025a0, values=@0x2aae2c002998, conds=0x2aae2c008be0, order_num=0, order=0x0, limit=18446744073709551615, 
    handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:222
#5  0x00000000006b4e79 in mysql_execute_command (thd=0x2aae2c000a30) at sql_parse.cc:2619
#6  0x00000000006be5f0 in mysql_parse (thd=0x2aae2c000a30, 
    inBuf=0x1a57d219 "UPDATE tnet_InaccessLog SET SessionStop = 0, Elapsed = 21290, InputOctets = 1246009, OutputOctets = 505174, srcprog = '/local/net/inaccess/bin/inaccessd', srcuser = 'tiadm', pid = 14396 WHERE id = 575"..., 
    length=204, found_semicolon=0x450c9cd8) at sql_parse.cc:5446
#7  0x00000000007b534e in Query_log_event::do_apply_event (this=0x1a57d140, rli=0x1a9a3310, 
    query_arg=0x1a57d219 "UPDATE tnet_InaccessLog SET SessionStop = 0, Elapsed = 21290, InputOctets = 1246009, OutputOctets = 505174, srcprog = '/local/net/inaccess/bin/inaccessd', srcuser = 'tiadm', pid = 14396 WHERE id = 575"..., q_len_arg=204) at log_event.cc:2086
#8  0x00000000007b4afd in Query_log_event::do_apply_event (this=0x1a57d140, rli=0x1a9a3310) at log_event.cc:1936
#9  0x000000000087b7e5 in Log_event::apply_event (this=0x1a57d140, rli=0x1a9a3310) at log_event.h:836
#10 0x0000000000875628 in exec_relay_log_event (thd=0x2aae2c000a30, rli=0x1a9a3310) at slave.cc:1845
#11 0x0000000000877658 in handle_slave_sql (arg=0x1a9a1f90) at slave.cc:2519
#12 0x00000036c00061b5 in start_thread () from /lib64/libpthread.so.0
#13 0x00000036bf4cd39d in clone () from /lib64/libc.so.6

InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
080119 23:17:53  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Last MySQL binlog file position 0 106, file name /local/db/mysql/data/net/prod/log-bin/log-bin.000039
080119 23:17:59  InnoDB: Started; log sequence number 1 2268183483
080119 23:17:59 [Note] Recovering after a crash using /local/db/mysql/data/net/prod/log-bin/log-bin
080119 23:17:59 [Note] Starting crash recovery...
080119 23:17:59 [Note] Crash recovery finished.
080119 23:18:00 [Note] Event Scheduler: Loaded 0 events
080119 23:18:00 [Note] /local/db/mysql/data/net/mysql/bin/mysqld-debug: ready for connections.
Version: '5.1.22-rc-debug-log'  socket: '/tmp/mysql.sock'  port: 15022  MySQL Community Server - Debug (GPL)
080119 23:18:00 [Note] Slave SQL thread initialized, starting replication in log 'log-bin.000032' at position 4442184, relay log '/local/db/mysql/data/net/prod/relaylog/relay-log.000028' position: 4442327
mysqld-debug: sql_base.cc:4171: int lock_tables(THD *, TABLE_LIST *, unsigned int, bool *): Assertion `thd->lock == 0' failed.
080119 23:18:00 - mysqld got signal 6;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=4194304000
read_buffer_size=131072
max_used_connections=0
max_threads=1500
threads_connected=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 19661875 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x19ca9a00
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x450c6950, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x6a032c
0x36bf430015
0x19008e50
New value of fp=0x450ca940 failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/refman/5.1/en/resolve-stack-dump.html
and follow instructions on how to resolve the stack trace.
Resolved stack trace is much more helpful in diagnosing the
problem, so please do resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x19cb75c9 = UPDATE tnet_InaccessLog SET SessionStop = 0, Elapsed = 21290, InputOctets = 1246009, OutputOctets = 505174, srcprog = '/local/net/inaccess/bin/inaccessd', srcuser = 'tiadm', pid = 14396 WHERE id = 5751345
thd->thread_id=2
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

How to repeat:
 /local/db/mysql/data/net/mysql/bin/mysqld-debug --defaults-file=/local/db/mysql/data/net/mysql-etc/my.cnf --basedir=/local/db/mysql/data/net/mysql --datadir=/local/db/mysql/data/net/prod/data   --relay-log=/local/db/mysql/data/net/prod/relaylog/relay-log --log-bin=/local/db/mysql/data/net/prod/log-bin/log-bin --replicate-ignore-db=obh --replicate-ignore-db=obh1 --replicate-ignore-db=nctest --log-slow-queries --log-warnings --low-priority-updates --log --log-error=/local/db/mysql/data/net/prod/data/batch-rrd7.nm1.telenor.net.err --pid-file=/local/db/mysql/data/net/prod/data/batch-rrd7.nm1.telenor.net.pid --port=15022

Suggested fix:
none yet.
[28 Jan 2008 14:07] Susanne Ebrecht
Many thanks for writing a bug report.

Please, can you tell us, what exactly you did, to get the issue.
[28 Jan 2008 14:49] Ole Bjrn Hessen
Hi,

I'm not sure how to respond to the request for more information.

A fix for this bug was to downgrade to mysql 5.0.45. Replication worked
perfect on 5.0.45. The only difference between the two setup was different
binaries. I conclude that there must be a bug in 5.1.22-rc.

Can you be more specific what you are asking about?

Kind regards.
[31 Jan 2008 17:30] Susanne Ebrecht
You missed to tell, that you upgraded from 5.0.45 to 5.1.22.

Did you follow the steps on:
http://dev.mysql.com/doc/refman/5.1/en/upgrading-from-5-0.html
[1 Feb 2008 11:22] Ole Bjrn Hessen
Hi,

this was part of an upgrade from 4.0.26 to 5.1.22-rc.

This was achieved by using mysqldump --opt all tables (but not mysql.*) on the 4.0.26 database and importing them into the mysql 5.1 database.

The master did run ok, but slave repeatedly crashed repeatedly about once an hour during the next 24hours. After that we downgraded to 5.0.45 (using a file copy of all files++) and it worked without any trouble.
[14 Feb 2008 5:27] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.1.23-rc, and inform about the results.
[15 Mar 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".