Bug #40770 Server Crash when running with triggers including variable settings (rpl_sys)
Submitted: 17 Nov 2008 3:25 Modified: 9 Dec 2008 0:22
Reporter: Omer Barnir (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Stored Routines Severity:S2 (Serious)
Version:5.1.29, 5.1.30 OS:Linux (64bit / All OS))
Assigned to: Ramil Kalimullin CPU Architecture:Any
Tags: regression
Triage: Triaged: D1 (Critical)

[17 Nov 2008 3:25] Omer Barnir
Description:
The master server crashes within seconds when running the the rpl_sys test (TPCB scenario) with statement based replication. The problem is also observed with 5.1.29 but was not observed with 5.1.27 or 5.1.28. 

The problem is limited to statement based replication and is not observed when running in RBR or MBR. 

Note: replication does not have to be running on the slave, it is enough that the master is logging in statement mode.

Seconds after the test is started the server crashes with:
081117  6:11:35 [Note] /data1/5.1.30/bin5130_bin64/bin/mysqld: ready for connections.
Version: '5.1.30-enterprise-gpl-advanced-log'  socket: '/data1/5.1.30/bin5130_bin64/mysql-test/var/tmp/master.sock'  port: 9306  MySQL Enterprise Server - Advanced Edition (GPL)
081117  6:11:47 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=2
max_threads=151
threads_connected=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60542 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x7077620
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x40ac3100 thread_stack 0x40000
/data1/5.1.30/bin5130_bin64/bin/mysqld(my_print_stacktrace+0x2e)[0x895f2e]
/data1/5.1.30/bin5130_bin64/bin/mysqld(handle_segfault+0x322)[0x5d1e82]
/lib64/libpthread.so.0[0x3ba880de70]
/lib64/libc.so.6(realloc+0x79)[0x3ba8072ce9]
/data1/5.1.30/bin5130_bin64/bin/mysqld(my_realloc+0x2c)[0x88180c]
/data1/5.1.30/bin5130_bin64/bin/mysqld[0x5464b1]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN22Item_func_set_user_var6updateEv+0x245)[0x54e485]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN12set_var_user6updateEP3THD+0xd)[0x5f24ad]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z17sql_set_variablesP3THDP4ListI12set_var_baseE+0x89)[0x5f25b9]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z21mysql_execute_commandP3THD+0x2b30)[0x5e4950]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x1c)[0x720edc]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN13sp_lex_keeper23reset_lex_and_exec_coreEP3THDPjbP8sp_instr+0x1e5)[0x725415]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x114)[0x726034]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN7sp_head7executeEP3THD+0x57c)[0x7237bc]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN7sp_head15execute_triggerEP3THDPK19st_mysql_lex_stringS4_P13st_grant_info+0x1af)[0x72405f]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN19Table_triggers_list16process_triggersEP3THD14trg_event_type20trg_action_time_typeb+0xe3)[0x72f4e3]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z36fill_record_n_invoke_before_triggersP3THDPP5FieldR4ListI4ItemEbP19Table_triggers_list14trg_event_type+0xb7)[0x614977]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z12mysql_insertP3THDP10TABLE_LISTR4ListI4ItemERS3_IS5_ES6_S6_15enum_duplicatesb+0x7c4)[0x6506c4]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z21mysql_execute_commandP3THD+0x94b)[0x5e276b]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x1c)[0x720edc]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN13sp_lex_keeper23reset_lex_and_exec_coreEP3THDPjbP8sp_instr+0x1e5)[0x725415]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x114)[0x726034]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN7sp_head7executeEP3THD+0x57c)[0x7237bc]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x4cb)[0x723e4b]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z21mysql_execute_commandP3THD+0x21fe)[0x5e401e]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x357)[0x5e7647]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xcf3)[0x5e8343]
/data1/5.1.30/bin5130_bin64/bin/mysqld(_Z10do_commandP3THD+0xe6)[0x5e8d56]
/data1/5.1.30/bin5130_bin64/bin/mysqld(handle_one_connection+0x236)[0x5dba26]
/lib64/libpthread.so.0[0x3ba88062f7]
/lib64/libc.so.6(clone+0x6d)[0x3ba80ce85d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x708ff30 = SET @local_user_bi:= "rpl_sys_user@localhost"
thd->thread_id=11
thd->killed=NOT_KILLED

 

How to repeat:
Download the attached file and extract it in the mysql-test directory.

Run ./run_crash

The script performs the following:
  - starts the server with replication 
  - sets binlog format to statement
  - start replication on the slave (is not a must to reproduce)
  - starts the rpl_sys test for 30 seconds with 5 concurrent clients

On the screen you will see the test client starting and then 'hanging' for about 30 seconds before the script aborts:

test_loop[1: 1:1]: TID 0 test: 'rpl_statement_setup'  Errors: No Errors. Test Passed OK

Done initialization of stress database by tests from
/data1/5.1.30/bin5130_bin64/mysql-test/rpl_SBR_init.txt file.

#############################################################
                  STRESS TEST RUNNING STAGE
#############################################################
main: Thread ID 1 TID 1 started
test_loop[0:0 0:1]: TID 1 test: 'rpl_row_sys_pinsdel1'  Errors: No Errors. Test Passed OK

test_loop[0:0 0:2]: TID 1 test: 'rpl_row_sys_driver2'  Errors: No Errors. Test Passed OK

test_loop[0:0 0:3]: TID 1 test: 'rpl_row_sys_driver3'  Errors: No Errors. Test Passed OK

test_loop[0:0 0:4]: TID 1 test: 'rpl_row_sys_driver2'  Errors: No Errors. Test Passed OK

test_loop[0:0 0:5]: TID 1 test: 'rpl_row_sys_driver3'  Errors: No Errors. Test Passed OK

main: Thread ID 2 TID 2 started  <-- at this point the server is already down
main: Thread ID 3 TID 3 started
main: Thread ID 4 TID 4 started
main: Thread ID 5 TID 5 started
7511: Got INT signal-------------------------------------------
test_loop[0:0 0:6]: TID 1 test: 'rpl_row_sys_pinsdel1'  Errors: Severity S3: 500

test_loop[0:0 0:7]: TID 2 test: 'rpl_row_sys_driver3'  Errors: Severity S3: 500

test_loop[0:0 0:8]: TID 3 test: 'rpl_row_sys_driver4'  Errors: Severity S3: 500

test_loop[0:0 0:9]: TID 4 test: 'rpl_row_sys_driver5'  Errors: Severity S3: 500

test_loop[0:0 0:10]: TID 5 test: 'rpl_row_sys_driver1'  Errors: Severity S3: 500
[17 Nov 2008 3:26] Omer Barnir
Files needed to reproduce the crash

Attachment: bug_files.tar.gz (application/x-gzip, text), 10.82 KiB.

[17 Nov 2008 6:22] Lars Thalmann
From the stack trace, this crash seems to happens in the server code.

Reassigning to Joro.
[17 Nov 2008 7:54] Ramil Kalimullin
Couldn't repeat on my 32-bit Linux box:
2.6.27.5-109.fc10.i686 #1 SMP Thu Nov 13 21:01:50 EST 2008 i686 i686 i386 GNU/Linux
[17 Nov 2008 15:46] Elena Stepanova
Reproduced on qa-srv-b box () on 5.1.29-rc binaries from mysql-advanced-5.1.29-rclinux-i686-glibc23.tar.gz.
[17 Nov 2008 16:31] Omer Barnir
Was able to reproduce with 32-bit binaries for 5.1.30 (mysql-enterprise-commercial-advanced-5.1.30-linux-i686-glibc23.tar.gz) - both with 'mysqld' and 'mysqld-debug' (tested on Linux Suse 10.3 (VM)
[17 Nov 2008 23:06] Elena Stepanova
During further experiments the problem was reproduced on Sparc Solaris 11 box (Sparc Solaris 10 binaries mysql-enterprise-commercial-advanced-5.1.30-solaris10-sparc-64bit.tar.gz).

For building binaries from the tree, a 'good' build to reproduce the bug seems to be compile-pentium-max.
[18 Nov 2008 1:12] Omer Barnir
Further investigation of the issue shows that the 'crashing factor' is related to the trigger definitions used in the test case and not to the fact that the server is set to SBR replication (the triggers are defined differently in the case of SBR and RBR/MBR as they include functions USER() and NOW().

In the case of SBR the trigers include:
   SET @local_user_bd:= "rpl_sys_user@localhost";
   SET @local_time_bd:= NOW();
   INSERT INTO test.insdel2_tracker  
       VALUES (NULL, @local_time_bd, @local_user_bd);

In the case of RBR/MBR the local variables are not used and the triggers 
include:
   INSERT INTO test.insdel2_tracker
       VALUES (NULL, NOW(), USER());
 
Running with the 'SBR' trigger definition when logging is RBR/MBR does not crash
Running without the trigger definition when logging is SBR does not crash

So the crash is related to the variable definition in the triggers.

Updating synopsis from 
    "Server Crash (master) when running SBR replication with 
     TPCB scenario (rpl_sys)"
to
    "Server Crash (master) when running with triggers including 
     variable settings (rpl_sys)"
[19 Nov 2008 5:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/59155

2716 Ramil Kalimullin	2008-11-19
      Fix for bug#40770: Server Crash when running with triggers including
      variable settings (rpl_sys)
      
      Problem: under certain conditions (e.g. user variables usage in triggers)
      accessing a user defined variable we may use a variables hash table that
      belongs to already deleted thread. It happens if
      thd= new THD;
      has the same address as just deleted thd as we use
      if (stored_thd == thd)
      to check. 
      That may lead to unpredictable results, server crash etc.
      
      Fix: use thread_id instead of thd address to distinguish threads.
      
      Note: no simple and repeatable test case.
[20 Nov 2008 11:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/59358

2721 Ramil Kalimullin	2008-11-20
      Fix for bug#40770: Server Crash when running with triggers including
      variable settings (rpl_sys)
      
      Problem: under certain conditions (e.g. user variables usage in triggers)
      accessing a user defined variable we may use a variables hash table that
      belongs to already deleted thread. It happens if
      thd= new THD;
      has the same address as just deleted thd as we use
      if (stored_thd == thd)
      to check.
      That may lead to unpredictable results, server crash etc.
      
      Fix: use thread_id instead of thd address to distinguish threads.
      
      Note: no simple and repeatable test case.
[8 Dec 2008 10:23] Bugs System
Pushed into 5.1.31  (revid:ramil@mysql.com-20081120112526-6nyxavhtzt9npthm) (version source revid:patrick.crews@sun.com-20081126180318-v685u61mpgoc176x) (pib:5)
[8 Dec 2008 11:34] Bugs System
Pushed into 6.0.9-alpha  (revid:ramil@mysql.com-20081120112526-6nyxavhtzt9npthm) (version source revid:ingo.struewing@sun.com-20081121151447-dtf2ofz2ys0zqed1) (pib:5)
[9 Dec 2008 0:22] Paul Dubois
Noted in 5.1.31, 6.0.9 changelog.

Accessing user variables within triggers could cause a server crash.
[19 Jan 2009 11:34] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090108105244-8opp3i85jw0uj5ib) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:10] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 16:16] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)