Bug #50160 Server with semi-sync replication crashes in ActiveTranx::is_tranx_end_pos
Submitted: 7 Jan 2010 21:02 Modified: 20 Jan 2010 9:25
Reporter: Elena Stepanova
Status: Duplicate
Category: MySQL Server: Replication Severity: S2 (Serious)
Version: 5.5.1-m2 OS: Any
Assigned to: Assigned Account
Triage: Triaged: D1 (Critical)

[7 Jan 2010 21:02] Elena Stepanova
Description:
100107 23:29:26 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=6
max_threads=151
threads_connected=6
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60561 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x16706190
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x4180d100 thread_stack 0x40000
bin/mysqld(my_print_stacktrace+0x2e)[0x8bcc2e]
bin/mysqld(handle_segfault+0x322)[0x5cc032]
/lib64/libpthread.so.0[0x33a8c0de70]
lib/plugin/semisync_master.so(_ZN11ActiveTranx16is_tranx_end_posEPKcy+0x6c)[0x2aaaab9769bc]
lib/plugin/semisync_master.so(_ZN18ReplSemiSyncMaster16updateSyncHeaderEPhPKcyj+0xed)[0x2aaaab976b3d]
bin/mysqld(_ZN24Binlog_transmit_delegate17before_send_eventEP3THDtP6StringPKcy+0x114)[0x759ad4]
bin/mysqld(_Z17mysql_binlog_sendP3THDPcyt+0x753)[0x714f93]
bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x76e)[0x5e2dae]
bin/mysqld(_Z10do_commandP3THD+0xe3)[0x5e3e73]
bin/mysqld(handle_one_connection+0x23e)[0x5d616e]
/lib64/libpthread.so.0[0x33a8c062f7]
/lib64/libc.so.6(clone+0x6d)[0x33a80d1b6d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil) is an invalid pointer
thd->thread_id=2
thd->killed=NOT_KILLED
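
For triage, the backtrace frames above can be split into module, symbol, offset and address. The following is only a rough sketch in Python: the regular expression is inferred from the frame lines in this report, and parse_frame and FRAME_RE are hypothetical helper names, not part of any MySQL tool.

```python
import re

# Assumed frame layout, inferred from the trace in this report:
#   bin/mysqld(_Z10do_commandP3THD+0xe3)[0x5e3e73]
#   /lib64/libpthread.so.0[0x33a8c0de70]
FRAME_RE = re.compile(
    r"^(?P<module>[^(\[\s]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict for one backtrace frame, or None if the line is not a frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return {
        "module": m.group("module"),
        "symbol": m.group("symbol") or None,
        "offset": int(m.group("offset"), 16) if m.group("offset") else None,
        "address": int(m.group("address"), 16),
    }


# A few lines copied from the trace above.
TRACE = """\
bin/mysqld(handle_segfault+0x322)[0x5cc032]
/lib64/libpthread.so.0[0x33a8c0de70]
lib/plugin/semisync_master.so(_ZN11ActiveTranx16is_tranx_end_posEPKcy+0x6c)[0x2aaaab9769bc]
lib/plugin/semisync_master.so(_ZN18ReplSemiSyncMaster16updateSyncHeaderEPhPKcyj+0xed)[0x2aaaab976b3d]
bin/mysqld(_ZN24Binlog_transmit_delegate17before_send_eventEP3THDtP6StringPKcy+0x114)[0x759ad4]
"""

frames = [f for f in map(parse_frame, TRACE.splitlines()) if f]

# Symbols of the frames that belong to the semi-sync master plugin.
plugin_frames = [
    f["symbol"] for f in frames if f["module"].endswith("semisync_master.so")
]
print(plugin_frames)
```

The symbols are still mangled at this point. Passing the first plugin symbol through a demangler such as c++filt should give ActiveTranx::is_tranx_end_pos(char const*, unsigned long long), which is the function named in the bug title.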

How to repeat:
- unpack the attached archive in <mysql-basedir>/mysql-test; it should create a
'stress_test_basedir' folder;
- run as
perl ./stress_test_basedir/run.pl

Notes:
run.pl is a wrapper that uses MTR to start the server and mysql-stress-test to run a 5-thread stress test.
mysql-stress-test needs a perl built with thread support to be first in your PATH (otherwise, please modify run.pl to call mysql-stress-test.pl with the right perl).

The screen output should say 'Waiting for server(s) to exit...' for a few seconds (sleep time for the server to start), and then start rapidly producing lines like
test_loop[0:0 0:377]: TID 2 test: 'stress2'  Errors: No Errors. Test Passed OK
test_loop[0:0 0:378]: TID 1 test: 'stress1'  Errors: No Errors. Test Passed OK

When/if the problem is hit, the output says something like
test_loop[0:0 0:7]: TID 2 test: 'stress1'  Errors: Severity S1: 15 (thread aborting)
and the test exits
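
When the output is captured to a file, the first failing line can be found with a small script. This is a minimal sketch in Python that assumes the line layout shown in the three sample lines above; mysql-stress-test.pl may print other forms that it does not handle, and LINE_RE and first_failure are hypothetical names.

```python
import re

# Assumed layout, taken from the sample lines in this report:
#   test_loop[0:0 0:377]: TID 2 test: 'stress2'  Errors: No Errors. Test Passed OK
LINE_RE = re.compile(
    r"^test_loop\[(?P<loop>[^\]]*)\]:\s+TID\s+(?P<tid>\d+)\s+"
    r"test:\s+'(?P<test>[^']*)'\s+Errors:\s+(?P<errors>.*)$"
)


def first_failure(lines):
    """Return details of the first test_loop line that reports errors, or None."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and not m.group("errors").startswith("No Errors"):
            return {
                "loop": m.group("loop"),
                "tid": int(m.group("tid")),
                "test": m.group("test"),
                "errors": m.group("errors"),
            }
    return None


# Sample lines copied from this report.
SAMPLE = [
    "test_loop[0:0 0:377]: TID 2 test: 'stress2'  Errors: No Errors. Test Passed OK",
    "test_loop[0:0 0:378]: TID 1 test: 'stress1'  Errors: No Errors. Test Passed OK",
    "test_loop[0:0 0:7]: TID 2 test: 'stress1'  Errors: Severity S1: 15 (thread aborting)",
]

print(first_failure(SAMPLE))
```

A result of None means no failing line was seen, which by itself does not prove the server survived: as noted in this report, the test can also abort early if the server starts too slowly.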
 
If the server does not start fast enough, the test might abort immediately; in that case, please increase the sleep time in run.pl.

[7 Jan 2010 21:22] Elena Stepanova
All threads bt

Attachment: all_threads_bug50160.out (application/octet-stream, text), 17.99 KiB.

[7 Jan 2010 21:23] Elena Stepanova
Stress test

Attachment: stress_test_basedir.tar.gz (application/gzip, text), 21.21 KiB.

[7 Jan 2010 21:23] Elena Stepanova
See also bug#50157 and bug#50163. The attached stress test is the same for all three bugs, and it randomly ends with one of these crashes.
All three problems might be the same, but since the produced stack trace is noticeably different in each case, I'm logging each one separately.
[20 Jan 2010 9:25] Zhenxing He
Duplicate of Bug#50157.