Bug #41395 Maria: segfault in _ma_remove_not_visible_states (pushbuild2)
Submitted: 11 Dec 2008 14:23 Modified: 10 Mar 2009 17:02
Reporter: Guilhem Bichot
Status: Closed
Category: MySQL Server: Maria storage engine Severity: S3 (Non-critical)
Version: 6.0-maria OS: Linux
Assigned to: Michael Widenius
Triage: Triaged: D1 (Critical) / R1 (None/Negligible) / E2 (Low)

[11 Dec 2008 14:23] Guilhem Bichot
Description:
Program terminated with signal 11, Segmentation fault.
#0 0x00fd9402 in __kernel_vsyscall ()
#1 0x0089f067 in pthread_kill () from /lib/libpthread.so.0
#2 0x088013ca in my_write_core (sig=11) at stacktrace.c:307
#3 0x082b65ca in handle_segfault (sig=11) at mysqld.cc:2692
#4 <signal handler called>
#5 0x08764943 in _ma_remove_not_visible_states (org_history=0xbea2910,
    all=0 '\0', trnman_is_locked=1 '\001') at ma_state.c:154
#6 0x08764abb in _ma_remove_not_visible_states_with_lock (share=0xa8558d88,
    all=0 '\0') at ma_state.c:206
#7 0x087c2d75 in collect_tables (str=0xa9006274,
    checkpoint_start_log_horizon=4376280075) at ma_checkpoint.c:1085
#8 0x087c138e in really_execute_checkpoint () at ma_checkpoint.c:195
#9 0x087c11f8 in ma_checkpoint_execute (level=CHECKPOINT_MEDIUM,
    no_wait=1 '\001') at ma_checkpoint.c:132
#10 0x087c1fa2 in ma_checkpoint_background (arg=0x1e) at ma_checkpoint.c:618

Last revision is guilhem@mysql.com-20081210170438-dguas7c2qaj434ty
and contains Monty's fix for ma_setup_live_state bug.
See also BUG#40711 (similar, though not identical, stack trace)

How to repeat:
Probably concurrency-related.
[16 Dec 2008 9:45] Guilhem Bichot
Once again in pushbuild2:
#7 0x0074b621 in abort () from /lib/libc.so.6
#8 0x00781e5b in __libc_message () from /lib/libc.so.6
#9 0x00789b16 in _int_free () from /lib/libc.so.6
#10 0x0078d030 in free () from /lib/libc.so.6
#11 0x087ef621 in my_no_flags_free (ptr=0x9e53c30) at my_malloc.c:59
#12 0x0876d7ae in _ma_remove_not_visible_states (org_history=0xa33d940,
    all=0 '\0', trnman_is_locked=1 '\001') at ma_state.c:160
#13 0x0876d8b3 in _ma_remove_not_visible_states_with_lock (share=0xa856a3b0,
    all=0 '\0') at ma_state.c:206
#14 0x087cbb6d in collect_tables (str=0xa9092274,
    checkpoint_start_log_horizon=4319468726) at ma_checkpoint.c:1085
#15 0x087ca186 in really_execute_checkpoint () at ma_checkpoint.c:195
#16 0x087c9ff0 in ma_checkpoint_execute (level=CHECKPOINT_MEDIUM,
    no_wait=1 '\001') at ma_checkpoint.c:132
#17 0x087cad9a in ma_checkpoint_background (arg=0x1e) at ma_checkpoint.c:618
[25 Dec 2008 19:23] Philip Stoev
This crash also affects 6.0.9; non-debug binaries appear to be easier to crash.

A new repeatable test case will be uploaded shortly. To run, please unpack the ZIP so that a mysql-test/suite/bug39440 directory is created and then run:

engine_type=Maria MTR_VERSION=1 perl mysql-test-run.pl \
--stress \
--stress-init-file=bug39440_init.txt \
--stress-test-file=bug39440_run.txt \
--stress-suite=bug39440 \
--stress-test-duration=60000 \
--stress-threads=100
[25 Dec 2008 19:24] Philip Stoev
Test case for bug 39440

Attachment: bug39440.zip (application/x-zip-compressed, text), 18.02 KiB.

[8 Jan 2009 9:10] Guilhem Bichot
See also BUG#40711, where the crash is only a few lines away from this one.
[22 Jan 2009 21:55] Michael Widenius
There have been notable changes to the affected code in the MySQL-5.1-maria tree that are related to this problem (some of the changes fix a similar problem).

These changes are not yet in MySQL-6.0 or MySQL-6.0-maria (this can be seen by doing a diff of ma_state.c between MySQL-6.0-maria and MySQL-5.1-maria).

I have run the test in MySQL-5.1-maria, but have not seen this problem.

We should retest this bug after we have done a new merge of MySQL-5.1-maria into MySQL-6.0.
[28 Jan 2009 10:14] Oleksandr Byelkin
I was able to repeat it on mysql-maria with the help of:
perl fork_big2.pl --user=root --thread-factor=5
It is rare but still present.
[10 Mar 2009 17:02] Guilhem Bichot
I ran the fork test above for 3 hours with the latest 6.0-maria and saw no crash. This must be a duplicate of BUG#40711.