Bug #38883 thd_security_context is not thread safe, crashes?
Submitted: 19 Aug 2008 7:38 Modified: 17 Jun 2010 23:55
Reporter: Jeremy Cole (Basic Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.1.25, 5.1.28, 5.1.30,5.4.3 OS:Any
Assigned to: Gleb Shchepa
Tags: qc, regression
Triage: Triaged: D1 (Critical)

[19 Aug 2008 7:38] Jeremy Cole
Description:
We've had some systems crashing.  We're not really sure what they are doing when they are crashing, as it's fairly randomly timed, but they are crashing with this stack trace:

/usr/sbin/mysqld(print_stacktrace+0x1e) [0x736dfe]
/usr/sbin/mysqld(handle_segfault+0x346) [0x6106e6]
/lib/libpthread.so.0 [0x2b4f99b21410]
/lib/libc.so.6(strlen+0x30) [0x2b4f9a6995a0]
/usr/sbin/mysqld(String::append(char const*)+0x1c) [0x60c79c]
/usr/sbin/mysqld(thd_security_context+0x153) [0x602443]
/usr/sbin/mysqld(innobase_mysql_print_thd+0x1d) [0x7a735d]
/usr/sbin/mysqld(lock_print_info_all_transactions+0x7e) [0x7fc80e]
/usr/sbin/mysqld(srv_printf_innodb_monitor+0x186) [0x842fe6]
/usr/sbin/mysqld(srv_lock_timeout_and_monitor_thread+0x321) [0x843811]
/lib/libpthread.so.0 [0x2b4f99b1af1a]
/lib/libc.so.6(__clone+0x72) [0x2b4f9a6f05d2]

For the most part the security context generated by thd_security_context is static once a thread is connected, but there are two elements included which are very much not static, thd->proc_info and thd->query.  The following code seems to be flawed:

  if (thd->proc_info)
  {
    str.append(' ');
    str.append(thd->proc_info);
  }

  if (thd->query)
  {
    if (max_query_len < 1)
      len= thd->query_length;
    else
      len= min(thd->query_length, max_query_len);
    str.append('\n');
    str.append(thd->query, len);
  }

The check (to catch NULLs) is done, and then the string is appended, but because this function is called for OTHER threads, not your own, it's possible (and even quite likely on a busy system) that either thd->proc_info or thd->query are set to NULL after the if(...) but before the str.append(...) call.

How to repeat:
Get a system using InnoDB very busy and call SHOW ENGINE INNODB STATUS repeatedly or enable the innodb_monitor?  I've not been able to cause a crash reliably, but it occurs in production.

Suggested fix:
Either acquire a mutex on the contents of thd->proc_info and thd->proc_info in thd_security_context or don't try to str.append() them at all.
[19 Aug 2008 8:50] Shane Bester
probably easy to verify if you insert a sleep(10) after the if() and before the append ?
[19 Aug 2008 17:25] Jeremy Cole
Hi Shane,

Probably so, but this is really one of those cases that needs to be verified for absolute correctness in algorithm, rather than hacked up to test it.  (As is the case with almost any thread safety/mutex issue.)

Regards,

Jeremy
[29 Sep 2008 16:07] Mikhail Izioumtchenko
Hi, Jeremy, any news on this? The problem is interesting, but 
SHOW ENGINE INNODB STATUS is something that everyone would perform every few seconds in
any kind of performance test, yet we hadn't heard any complaints. What isn't broken I
wouldn't consider fixing.  Is there a way to reproduce this?  Do you think trying short
lived connections would do it?
[29 Sep 2008 16:27] Heikki Tuuri
We have known this race for ages. The problem is how to fix it.

Vasil, can you come up with some idea?

--Heikki
[19 Oct 2008 5:42] bhushan uparkar
We are having the similar problem with version 5.1.28  as well, here is the stack trace from the crash, 

=======
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x20000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x944709]
/usr/sbin/mysqld(handle_segfault+0x346) [0x611d76]
/lib/libpthread.so.0 [0x2b78afe37410]
/lib/libc.so.6(strlen+0x30) [0x2b78b09af5a0]
/usr/sbin/mysqld(String::append(char const*)+0x1c) [0x60dddc]
/usr/sbin/mysqld(thd_security_context+0x153) [0x604413]
/usr/sbin/mysqld(innobase_mysql_print_thd+0x1d) [0x7af2bd]
/usr/sbin/mysqld(lock_print_info_all_transactions+0x7e) [0x80538e]
/usr/sbin/mysqld(srv_printf_innodb_monitor+0x186) [0x84abf6]
/usr/sbin/mysqld(srv_lock_timeout_and_monitor_thread+0x321) [0x84b421]
/lib/libpthread.so.0 [0x2b78afe30f1a]
/lib/libc.so.6(__clone+0x72) [0x2b78b0a065d2]

========
Let me know if you need additional things on this crash.

Thanks
[21 Oct 2008 13:25] Shane Bester
seeing a crash in 5.1.28 64-bit windows:

mysqld.exe!String::append Line 461
mysqld.exe!thd_security_context
mysqld.exe!innobase_mysql_print_thd
mysqld.exe!trx_print
mysqld.exe!lock_print_info_all_transactions
mysqld.exe!srv_printf_innodb_monitor
mysqld.exe!innodb_show_status
mysqld.exe!ha_show_status
mysqld.exe!mysql_execute_command
mysqld.exe!mysql_parse
mysqld.exe!dispatch_command
mysqld.exe!do_command
mysqld.exe!handle_one_connection
mysqld.exe!pthread_start
mysqld.exe!_callthreadstart
mysqld.exe!_threadstart
[22 Oct 2008 8:10] Vasil Dimov
Jeremy's analysis looks right.

One solution is to add a mutex in the THD struct that protects the query, proc_info and other members and lock that mutex in thd_security_context().

Since this is in MySQL code, this bug should not be classified as an InnoDB bug.
[22 Oct 2008 12:00] Jeremy Cole
Hi Vasil,

I would suggest that the mutex would be a rather heavy handed approach, since it would either need to lock each thread in turn, or effectively freeze the server in order to get a rather "fluffy" result for SHOW ENGINE INNODB STATUS.

Regarding whether it's a MySQL or InnoDB bug, I would think that they old shopkeep's principle (or "Pottery Barn rule"[1]) applies here: You break it, you bought it.  This code seems to have been added specifically for InnoDB, and it's only used by InnoDB, and only in that one place, so I would say it is in fact an InnoDB bug.

Regards,

Jeremy
[22 Oct 2008 12:03] Jeremy Cole
Oops, I forgot my footnote:

[1] In case you want to read up on the American jargon: http://en.wikipedia.org/wiki/Pottery_Barn_rule
[29 Oct 2008 19:02] Alex Tomic
I'm pretty reliably getting this crash (about twice a day) on a new 5.1.29 setup after adding an InnoDB table that we use for a lot of repeated lookups -- the previous 5.1.26 system was stable for over a month before we started doing this though so it all makes sense.

Is there a way to avoid hitting this problem, outside of telling everyone not to run the show innodb status command?
[30 Oct 2008 13:06] Kristofer Pettersson
Hi Alex. Could you please supply a test case which replicate your issue?
[3 Nov 2008 16:07] Kristofer Pettersson
Progress note: 

InnoDB fails to call innobase_mysql_prepare_print_arbitrary_thd on each occasion where thd_security_context is used. This means that the memory in thd->query is unprotected.

#3  0xb7d3410e in __assert_fail () from /lib/tls/i686/cmov/libc.so.6
#4  0x082aa28c in thd_security_context (thd=0x8f84ac8, 
    buffer=0xb35e1fd4 "h\204$·ø\037^³H¿W\bÌþ$·v\231\211\bÄ\002", length=1024, 
    max_query_len=3000) at sql_class.cc:329
#5  0x0852dfab in innobase_mysql_print_thd (f=0x8f23828, input_thd=0x8f84ac8, 
    max_query_len=3000) at handler/ha_innodb.cc:745
#6  0x085b2aa4 in trx_print (f=0x8f23828, trx=0xb7248068, max_query_len=3000)
    at trx/trx0trx.c:1760
#7  0x085675e8 in lock_deadlock_recursive (start=0xb7248468, trx=0xb7248068, 
    wait_lock=0xb724a6e8, cost=0xb35e24f4, depth=1) at lock/lock0lock.c:3269
#8  0x0856782f in lock_deadlock_recursive (start=0xb7248468, trx=0xb7248468, 
    wait_lock=0xb7248e30, cost=0xb35e24f4, depth=0) at lock/lock0lock.c:3357
#9  0x0856730f in lock_deadlock_occurs (lock=0xb7248e30, trx=0xb7248468)
    at lock/lock0lock.c:3146
#10 0x08565be0 in lock_rec_enqueue_waiting (type_mode=2565, 
    rec=0xb6cac070 "supremum", index=0xb724a468, thr=0xb72451e8)
    at lock/lock0lock.c:1807
#11 0x0856a0f7 in lock_rec_insert_check_and_lock (flags=0, rec=0xb6cac07e "", 
    index=0xb724a468, thr=0xb72451e8, inherit=0xb35e25f4)
    at lock/lock0lock.c:4796
#12 0x0853368f in btr_cur_ins_lock_and_undo (flags=0, cursor=0xb35e27e8, 
    entry=0xb724f068, thr=0xb72451e8, inherit=0xb35e25f4) at btr/btr0cur.c:873
#13 0x0853399a in btr_cur_optimistic_insert (flags=0, cursor=0xb35e27e8, 
    entry=0xb724f068, rec=0xb35e283c, big_rec=0xb35e282c, thr=0xb72451e8, 
    mtr=0xb35e2848) at btr/btr0cur.c:1036
#14 0x0858c0d3 in row_ins_index_entry_low (mode=2, index=0xb724a468, 
    entry=0xb724f068, ext_vec=0x0, n_ext_vec=0, thr=0xb72451e8)
    at row/row0ins.c:2093
#15 0x0858c3c8 in row_ins_index_entry (index=0xb724a468, entry=0xb724f068, 
    ext_vec=0x0, n_ext_vec=0, thr=0xb72451e8) at row/row0ins.c:2180
#16 0x0858c5ab in row_ins_index_entry_step (node=0xb7245098, thr=0xb72451e8)
    at row/row0ins.c:2259
#17 0x0858c76c in row_ins (node=0xb7245098, thr=0xb72451e8)
    at row/row0ins.c:2391
#18 0x0858c90e in row_ins_step (thr=0xb72451e8) at row/row0ins.c:2501
#19 0x0858e0a1 in row_insert_for_mysql (mysql_rec=0x8f86bb0 "ý", 
    prebuilt=0xb7250868) at row/row0mysql.c:1153
#20 0x08531207 in ha_innobase::write_row (this=0x8f86a60, record=0x8f86bb0 "ý")
    at handler/ha_innodb.cc:3495
#21 0x083f72b1 in handler::ha_write_row (this=0x8f86a60, buf=0x8f86bb0 "ý")
    at handler.cc:4567
#22 0x083640a4 in write_record (thd=0x8fa34c0, table=0x8f837b8, 
    info=0xb35e3048) at sql_insert.cc:1567
#23 0x08368843 in mysql_insert (thd=0x8fa34c0, table_list=0x8fad840, 
    fields=@0x8fa4af8, values_list=@0x8fa4b1c, update_fields=@0x8fa4b10, 
    update_values=@0x8fa4b04, duplic=DUP_ERROR, ignore=false)
    at sql_insert.cc:812
#24 0x082d350f in mysql_execute_command (thd=0x8fa34c0) at sql_parse.cc:3068
#25 0x082d9e08 in mysql_parse (thd=0x8fa34c0, 
    inBuf=0x8fad798 "INSERT INTO t1 VALUES (0)", length=25, 
    found_semicolon=0xb35e409c) at sql_parse.cc:5789
#26 0x082dab44 in dispatch_command (command=COM_QUERY, thd=0x8fa34c0, 
    packet=0x8fa5501 "INSERT INTO t1 VALUES (0)", packet_length=25)
    at sql_parse.cc:1200
#27 0x082dbf8f in do_command (thd=0x8fa34c0) at sql_parse.cc:857
#28 0x082c7f17 in handle_one_connection (arg=0x8fa34c0) at sql_connect.cc:1115
#29 0xb7eed4fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
[3 Nov 2008 18:07] Alex Tomic
On the affected system, I changed one of the heavily queried tables to MyISAM and it has stayed up for nearly 5 days, so I at least have a pretty good workaround. If i get a chance I will try apply the patch to one of our test systems to see if we can reproduce the problem.
[4 Nov 2008 9:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57755

2686 Kristofer Pettersson	2008-11-04
      Bug#38883 thd_security_context is not thread safe, crashes?
      
      Innodb monitor could cause a server crash because of invalid access to a
      shared variable in a concurrent environment.
            
      This patch adds a guard to protect against crashes but not against
      inconsistent values because of performance reasons.
      
      This patch is not complete without additional changes to InnoDB where all
      access to thd_security_context have a preceding call to
      innobase_mysql_prepare_print_arbitrary_thd
[4 Nov 2008 12:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57779

2686 Kristofer Pettersson	2008-11-04
      Bug#38883 thd_security_context is not thread safe, crashes?
            
      Innodb monitor could cause a server crash because of invalid access to a
      shared variable in a concurrent environment.
                  
      This patch adds a guard to protect against crashes but not against
      inconsistent values because of performance reasons.
            
      This patch is not complete without additional changes to InnoDB where all
      access to thd_security_context have a preceding call to
      innobase_mysql_prepare_print_arbitrary_thd
[6 Nov 2008 11:54] Marko Mäkelä
This is basically a duplicate of Bug #3596, which was partially fixed in MySQL 4.0.20.
[6 Nov 2008 12:34] Heikki Tuuri
We should have a MySQL handler function: return_info_on_thd_in_a_buffer(thd).
That would take a snapshot of the query string etc., under the protection of a mutex, and return it in a buffer. Then InnoDB could safely print that info in SHOW INNODB STATUS\G.
[6 Nov 2008 12:38] Heikki Tuuri
char *thd_security_context(THD *thd, char *buffer, unsigned int length,
                            unsigned int max_query_len)

Actually, I see Kristofer has implemented the function :).
[6 Nov 2008 12:41] Vasil Dimov
Marko will investigate calling innobase_mysql_prepare_print_arbitrary_thd() before thd_security_context() and other solutions to this bug.
[6 Nov 2008 12:50] Vasil Dimov
Grab!
[6 Nov 2008 12:52] Heikki Tuuri
Vasil, please put InnoDB code in the shape Kristofer wants. When InnoDB prints thd->query of OTHER threads, it must obey:

sql_class.h in 5.1:

"
Points to the query associated with this statement. It's const, but
    we need to declare it char * because all table handlers are written
    in C and need to point to it.

    Note that (A) if we set query = NULL, we must at the same time set
    query_length = 0, and protect the whole operation with the
    LOCK_thread_count mutex. And (B) we are ONLY allowed to set query to a
    non-NULL value if its previous value is NULL. We do not need to protect
    operation (B) with any mutex. To avoid crashes in races, if we do not
    know that thd->query cannot change at the moment, one should print
    thd->query like this:
      (1) reserve the LOCK_thread_count mutex;
      (2) check if thd->query is NULL;
      (3) if not NULL, then print at most thd->query_length characters from
      it. We will see the query_length field as either 0, or the right value
      for it.
    Assuming that the write and read of an n-bit memory field in an n-bit
    computer is atomic, we can avoid races in the above way.
    This printing is needed at least in SHOW PROCESSLIST and SHOW INNODB
    STATUS.
"
[6 Nov 2008 13:17] Heikki Tuuri
After some thinking, I may have found the bug and the fix.

Jeremy Cole had the InnoDB monitor running. It prints the output of SHOW INNODB STATUS\G to the .err log every 15 seconds.

But it does not call nobase_mysql_prepare_print_arbitrary_thd(). That is the bug.

There is no bug when InnoDB prints thd info inside the lock0lock.c deadlock detection. Threads that are deadlocked are not progressing. Then thd->query cannot change.

/**********************************************************************
Outputs to a file the output of the InnoDB Monitor. */

void
srv_printf_innodb_monitor(
/*======================*/
        FILE*   file,           /* in: output stream */
        ulint*  trx_start,      /* out: file position of the start of
                                the list of active transactions */
        ulint*  trx_end)        /* out: file position of the end of
                                the list of active transactions */
{
        double  time_elapsed;
        time_t  current_time;
        ulint   n_reserved;
[6 Nov 2008 13:25] Heikki Tuuri
Hmm... now I notice the innobase_mysql_prepare_print_arbitrary_thd() call is there. The bug not found :(.
[6 Nov 2008 13:33] Heikki Tuuri
I do not see how:

"
  if (thd->proc_info)
  {
    str.append(' ');
    str.append(thd->proc_info);
  }

  if (thd->query)
  {
    if (max_query_len < 1)
      len= thd->query_length;
    else
      len= min(thd->query_length, max_query_len);
    str.append('\n');
    str.append(thd->query, len);
  }
"

could crash in the InnoDB monitor, since the InnoDB monitor does acquire the appropriate MySQL mutex.

We need a better stack trace, inside gdb, to find out what is wrong. Does MySQL protect thd->query as documented in the source code?
[6 Nov 2008 13:34] Marko Mäkelä
Could it be that MySQL is somewhere assigning thd->query = NULL without holding the LOCK_thread_count mutex, as specified in the comment in sql/sql_class.h?
[6 Nov 2008 13:54] Marko Mäkelä
In reply to [3 Nov 17:07] Kristofer Pettersson,

I reviewed the InnoDB source code, and the only calls to thd_security_context() are via trx_print(). The only calls to trx_print() to other than the current thread's transaction object are in lock0lock.c: lock_deadlock_recursive() and lock_print_info_all_transactions().

When threads are waiting for a lock, their thd->query should not be changing. So, there should not be a bug in lock_deadlock_recursive(), and the assertion failure should be a bogus one. We don't want to acquire the mutex when checking for deadlocks, because it should not be needed.

According to my analysis, lock_print_info_all_transactions() is holding LOCK_thread_count when calling trx_print(). lock_print_info_summary() initially acquires the mutex.

The fix of caching trx->query to a local variable is kind of sweeping the bug under the carpet.
[6 Nov 2008 14:45] Davi Arnaut
As can be seen by the backtrace, the crash is on the String::append of proc_info and that's what the latest patch tries to fix -- proc_info may change at any time and without holding locks, but it should always point to constant strings. It also adds a assertion to ensure that LOCK_thread_count is being held due to constraints on THD::query.
[6 Nov 2008 20:35] Marko Mäkelä
Davi,
I didn't see line numbers in the stack traces. The assertion that http://lists.mysql.com/commits/57779 adds to thd_security_context() would fail in the trx_print() calls in lock_deadlock_recursive(), as Kristofer has reported. If assignments to thd->proc_info are not protected by LOCK_thread_count, that would explain this bug. And if proc_info is always pointing to a constant string when not a null pointer, it should be safe to cache it and use the cached value.

Because lock_deadlock_recursive() is holding the InnoDB kernel_mutex, it can't acquire LOCK_thread_count without breaking an assumption that Heikki made when he fixed Bug #3596. Acquiring LOCK_thread_count there would cause an ut_ad() assertion failure, or it could cause a deadlock.

InnoDB guarantees the absence of deadlocks by its notion of a "latching order". In existing code, the LOCK_thread_count has a higher rank than the kernel_mutex. If lock_deadlock_recursive() acquired the LOCK_thread_count mutex while holding kernel_mutex, it could deadlock with a thread that is doing SHOW ENGINE INNODB STATUS. The srv_printf_innodb_monitor() thread would have acquired LOCK_thread_count and it'd be waiting for kernel_mutex. At the same time, the thread executing lock_deadlock_recursive() would be holding the kernel_mutex and soon waiting for LOCK_thread_count.

If we acquired the LOCK_thread_count before acquiring the kernel_mutex and checking for deadlocks in lock_deadlock_recursive(), it would cause severe locking contention. It could essentially serialize the execution of InnoDB. Therefore, the safe_mutex_assert_owner(&LOCK_thread_count) added to thd_security_context() in http://lists.mysql.com/commits/57779 has to be removed.

Again, please note that thd->query cannot possibly change while the thd in question is waiting for a lock inside InnoDB. InnoDB takes care of implementing the innodb_lock_wait_timeout; the thread can't finish executing the statement while lock_deadlock_recursive() is invoking thd_security_context(thd) via trx_print().
[6 Nov 2008 22:04] Davi Arnaut
The problem is probably not in THD::query because it does not use the String::append(char const*) method that further down the road leads to the crash. This string method is used to the concatenate the host, ip, user or proc_info and my bet is that the trouble was on proc_info. If we had the binary, we could check the address to precisely verify which one.

Whatever InnoDB does to guarantee that the thread is paused is not important as the interface contract is that it's only safe to look at THD::query if the lock is being held. If InnoDB can't take the look, it should dump the text by itself. Again, it's not important whether the thread in question is sleeping or whatever, if it doesn't play by the interface rules, it might not blow up now but nothing guarantees that it will work in the future as developers will assume that the rules are being respected. If InnoDB knows what it is doing and takes up the risk, it should look by it self.

I think we have two options here. Either InnoDB implements its own function to dump the text and take up the risk or we document that the caller is responsible for locking and assumes the inherent risks.
[7 Nov 2008 9:44] Marko Mäkelä
If I remember correctly, the function thd_security_context() was copied from ha_innodb.cc to MySQL when the pluggable storage engine API was implemented in MySQL 5.1. A pluggable storage engine that does not define MYSQL_SERVER does not have access to all the data displayed by thd_security_context().

If possible, please update the documentation of the function to say that holding the LOCK_thread_count mutex is not necessary when the function is invoked on the currently running thread (current_thd) or a thread that is guaranteed to be waiting for a synchronization object. This should work because THD::query should only be changed by the currently running thread, not by other threads.

Users never reported crashes in lock_deadlock_recursive().
[7 Nov 2008 10:07] Davi Arnaut
You are still missing the point. LOCK_thread_count is the supported synchronization object for THD::query and what InnoDB does is use another object to serialize the access. From a pragmatic point of view this is fine but from a maintainability point of view this is a bad practice and is bond to break if the semantics of THD::query change slightly.
[7 Nov 2008 12:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/58172

2686 Kristofer Pettersson	2008-11-07
      Bug#38883 thd_security_context is not thread safe, crashes?
                  
      Innodb monitor could cause a server crash because of invalid access to a
      shared variable in a concurrent environment.
                        
      This patch adds a guard to protect against crashes but not against
      inconsistent values because of performance reasons.
[7 Nov 2008 15:52] Davi Arnaut
Probably not related to the reported crash, but there is a bug in thd_security_context related to calling THD::strmake in a thread that is not the current one.
[7 Nov 2008 21:51] Marko Mäkelä
I have some minor comments regarding the patch http://lists.mysql.com/commits/58172:

Please drop "current" from the documentation of the thd parameter of thd_security_context(), because thd can very well differ from current_thd:
+  @param thd current thread context

Furthermore, the function could cache thd->query to a local variable. While it should not be needed for correctness, it would likely result in less machine code and fewer memory references.

Regarding the API contract, it is a matter of definition. I agree that it'd be easier from the maintenance point of view to always acquire LOCK_thread_count.
[10 Nov 2008 10:53] Bugs System
Pushed into 6.0.9-alpha  (revid:kristofer.pettersson@sun.com-20081107121256-781d4a5ukq7im7te) (version source revid:kristofer.pettersson@sun.com-20081107140418-gisvpte6w6qj73ss) (pib:5)
[10 Nov 2008 11:36] Bugs System
Pushed into 5.1.31  (revid:kristofer.pettersson@sun.com-20081107121256-781d4a5ukq7im7te) (version source revid:kristofer.pettersson@sun.com-20081107135243-6sd3mzd3cm23l3vs) (pib:5)
[17 Nov 2008 13:30] Davi Arnaut
Second issue reported as Bug#40778
[17 Nov 2008 16:11] Paul Dubois
Noted in 5.1.31, 6.0.9 changelogs.

Use of InnoDB monitoring (SHOW ENGINE INNODB STATUS or one of the
InnoDB Monitor tables) could cause a server crash due to invalid
access to a shared variable in a concurrent environment.
[10 Jan 2009 21:10] bhushan uparkar
I picked up patches provided by Kristofer and applied on 5.1.30 release . But event with this patch I am getting the crash , here is the stack trace

thd: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x20000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x879409]
/usr/sbin/mysqld(handle_segfault+0x346) [0x5d6766]
/lib/libpthread.so.0 [0x2b24fda02410]
/lib/libc.so.6(strlen+0x30) [0x2b24fe57a5a0]
/usr/sbin/mysqld(String::append(char const*)+0x1c) [0x5d24cc]
/usr/sbin/mysqld(thd_security_context+0x15c) [0x5c826c]
/usr/sbin/mysqld(innobase_mysql_print_thd+0x1d) [0x74f43d]
/usr/sbin/mysqld(lock_print_info_all_transactions+0x7e) [0x7a550e]
/usr/sbin/mysqld(srv_printf_innodb_monitor+0x186) [0x7ead76]
/usr/sbin/mysqld(srv_lock_timeout_and_monitor_thread+0x321) [0x7eb5a1]
/lib/libpthread.so.0 [0x2b24fd9fbf1a]
/lib/libc.so.6(__clone+0x72) [0x2b24fe5d15d2]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
090110 20:06:23 mysqld_safe Number of processes running now: 0

===================
Here are details related to our platform,

OS : Debian , 
Linux version 2.6.18-6-amd64 (Debian 2.6.18.dfsg.1-18etch1) (waldi@debian.org) (gcc version 4.1.2 20061115 (prerelease) (Debian
 4.1.1-21)) #1 SMP Sun Feb 10 17:50:19 UTC 2008

MySql Version: 5.1.30 

Let me know if you need additional details on this.

Thank you,
Bhushan
[10 Jan 2009 23:23] Davi Arnaut
Please provide your ./configure options, mysqld binary and core file.
[10 Jan 2009 23:51] Davi Arnaut
BTW, does your setup use replication/binlog?
[12 Jan 2009 11:22] Shane Bester
bug #42045 is a duplicate of this
[19 Jan 2009 11:34] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090108105244-8opp3i85jw0uj5ib) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:10] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 16:16] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)
[2 Feb 2009 23:20] bhushan uparkar
The bug is still happening random times, and here is the stack trace for this
============
tack_bottom = (nil) thread_stack 0x20000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x879409]
/usr/sbin/mysqld(handle_segfault+0x346) [0x5d6766]
/lib/libpthread.so.0 [0x2b9907f13410]
/lib/libc.so.6(strlen+0x30) [0x2b9908a8b5a0]
/usr/sbin/mysqld(String::append(char const*)+0x1c) [0x5d24cc]
/usr/sbin/mysqld(thd_security_context+0x15c) [0x5c826c]
/usr/sbin/mysqld(innobase_mysql_print_thd+0x1d) [0x74f43d]
/usr/sbin/mysqld(lock_print_info_all_transactions+0x7e) [0x7a550e]
/usr/sbin/mysqld(srv_printf_innodb_monitor+0x186) [0x7ead76]
/usr/sbin/mysqld(srv_lock_timeout_and_monitor_thread+0x321) [0x7eb5a1]
/lib/libpthread.so.0 [0x2b9907f0cf1a]
/lib/libc.so.6(__clone+0x72) [0x2b9908ae25d2]

================

Here are options for configure ,
=================================
Configure command: ./configure '--build=x86_64-linux-gnu' '--host=x86_64-linux-gnu' '--prefix=/usr' '--exec-prefix=/usr' '--libexecdir=/usr/sbin' '--datadir=/usr/share' '--localstatedir=/var/lib/mysql' '--includedir=/usr/include' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-server-suffix=-Debian_7etch5' '--with-comment=Debian etch distribution' '--enable-shared' '--enable-static' '--enable-thread-safe-client' '--enable-local-infile' '--with-big-tables' '--with-raid' '--with-unix-socket-path=/var/run/mysqld/mysqld.sock' '--with-mysqld-user=mysql' '--with-libwrap' '--with-vio' '--without-openssl' '--with-yassl' '--without-docs' '--with-bench' '--without-readline' '--with-extra-charsets=all' '--with-innodb' '--with-isam' '--with-archive-storage-engine' '--with-csv-storage-engine' '--with-federated-storage-engine' '--without-embedded-server' '--with-ndbcluster' '--with-ndb-shm' '--without-ndb-sci' '--without-ndb-test' '--with-embedded-server' '--with-embedded-privilege-control' '--with-ndb-docs' 'CC=gcc' 'CFLAGS=-DBIG_JOINS=1 -O2' 'CXXFLAGS=-DBIG_JOINS=1 -felide-constructors -fno-rtti -O2' 'CXX=g++' 'build_alias=x86_64-linux-gnu' 'host_alias=x86_64-linux-gnu'

=============

I do not have the core file at my end. Let me know if you have question/comments.

Thanks,
Bhushan
[19 Mar 2009 4:16] Trent Lloyd
bhushan,

What version of MySQL are you using?
[19 Mar 2009 11:01] Davi Arnaut
The 5.1 incarnation of the code that caused this problem doesn't exist in 5.0 -- in 5.0 it's implemented within the InnoDB storage engine and doesn't seem to have the problem that was fixed by this patch.

Trent, please open a new bug report if something like this bug is happening on 5.0. InnoDB guys should look at it.
[13 Apr 2009 22:15] bhushan uparkar
I installed 5.1.33 on one of our db server, and found the fix for this bug is working on this release. Here is related error log from mysql server ,

================================

InnoDB: Warning: a long semaphore wait: ...

===============================

After logging the above message, server was still up and functioning properly. In the past after above message, an assertion was thrown causing db to crash. 

Thanks for fix !
Bhushan
[16 Nov 2009 23:05] Harrison Fisk
I have seen this crash in 5.1.38 on Solaris.  I have a core file as well.

Here is the backtrace:

(dbx) print server_version
server_version = "5.1.38-enterprise-gpl-advanced"
(dbx) where
  [1] collect_a_seq(0xb, 0x1005ccb8a, 0x183220, 0xffffffff7ddd418c, 0x100925128, 0x0), at 0xffffffff7ddd80d4 
=>[2] handle_segfault(sig = ???) (optimized), at 0x100281164 (line ~2552) in "mysqld.cc"
  [3] _thrp_continue(0xb, 0x0, 0xffffffff7c73c4b0, 0x100280ebc, 0x0, 0x0), at 0xffffffff7ddd418c 
  [4] __clock_nanosleep(0xffffffff7f304a00, 0xffffffff7f304a00, 0xffffffff7c73c4b0, 0x12, 0x0, 0x0), at 0xffffffff7ddc7d28 
  [5] _sleep(0x0, 0x0, 0xffffffff7c73c4b0, 0xffffffff7f304a00, 0x10026d08c, 0xffffffff7df3a000), at 0xffffffff7ddc7f20 
  [6] String::append(this = ???, s = ???, arg_length = ???) (optimized), at 0x10027ce80 (line ~458) in "sql_string.cc"
  [7] thd_security_context(thd = ???, buffer = ???, length = ???, max_query_len = ???) (optimized), at 0x10026d08c (line ~385) in "sql_class.cc"
  [8] innobase_mysql_print_thd(f = ???, input_thd = ???, max_query_len = ???) (optimized), at 0x1004023f8 (line ~793) in "ha_innodb.cc"
  [9] trx_print(f = ???, trx = ???, max_query_len = ???) (optimized), at 0x10048c274 (line ~1762) in "trx0trx.c"
  [10] lock_print_info_all_transactions(file = ???) (optimized), at 0x100443cc0 (line ~4294) in "lock0lock.c"
  [11] srv_printf_innodb_monitor(file = ???, trx_start = ???, trx_end = ???) (optimized), at 0x10047a98c (line ~1694) in "srv0srv.c"
  [12] innodb_show_status(hton = ???, thd = ???, stat_print = ???) (optimized), at 0x10040ab10 (line ~7221) in "ha_innodb.cc"
  [13] innobase_show_status(hton = ???, thd = ???, stat_print = ???, stat_type = ???) (optimized), at 0x10040ae60 (line ~7386) in "ha_innodb.cc"
  [14] ha_show_status(thd = ???, db_type = ???, stat = ???) (optimized), at 0x100367fbc (line ~4416) in "handler.cc"
  [15] mysql_execute_command(thd = ???) (optimized), at 0x100290aac (line ~2425) in "sql_parse.cc"
  [16] mysql_parse(thd = ???, inBuf = ???, length = ???, found_semicolon = ???) (optimized), at 0x100296c58 (line ~5931) in "sql_parse.cc"
  [17] dispatch_command(command = ???, thd = ???, packet = ???, packet_length = ???) (optimized), at 0x10028e390 (line ~1213) in "sql_parse.cc"
  [18] do_command(thd = ???) (optimized), at 0x10028dac8 (line ~854) in "sql_parse.cc"
  [19] handle_one_connection(arg = ???) (optimized), at 0x10028c648 (line ~1127) in "sql_connect.cc"
(dbx) frame 7
Current function is thd_security_context (optimized)
dbx: warning: File `sql_class.cc' has been modified more recently than `mysqld'
  385       str.append(thd->query, len);
[17 Nov 2009 12:42] Davi Arnaut
The latest crash is a regression introduced by Bug#38816 (pushed to 5.1.38). It changed the lock that protects THD::query from LOCK_thread_count to LOCK_thd_data, but didn't update the associated innodb functions (innobase_mysql_prepare_print_arbitrary_thd and innobase_mysql_end_print_arbitrary_thd).
[1 Dec 2009 10:00] Georgi Kodinov
This was a regression from 5.0. changing the target version
[1 Dec 2009 11:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/92252

2857 Gleb Shchepa	2009-12-01
      Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
      
      The bug 38816 changed the lock that protects THD::query from
      LOCK_thread_count to LOCK_thd_data, but didn't update the associated
      InnoDB functions.
      
      1. The innobase_mysql_prepare_print_arbitrary_thd and the
      innobase_mysql_end_print_arbitrary_thd InnoDB functions have been
      removed, since now we have a per-thread mutex: now we don't need to wrap
      several inter-thread access tries to THD::query with a single global
      LOCK_thread_count lock, so we can simplify the code.
      
      2. The innobase_mysql_print_thd function has been modified to lock
      LOCK_thd_data in direct way.
[1 Dec 2009 11:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/92256

3235 Gleb Shchepa	2009-12-01 [merge]
      Bug #38883 (reopened): thd_security_context is not thread safe, crashes?
      
      manual merge 5.0-->5.1, updating InnoDB plugin.
[2 Dec 2009 8:01] Bugs System
Pushed into 5.0.89 (revid:joro@sun.com-20091202075830-mzl79q7mc1v72pf1) (version source revid:gshchepa@mysql.com-20091201102444-yw166t3audrojo9s) (merge vers: 5.0.89) (pib:13)
[2 Dec 2009 8:05] Bugs System
Pushed into 5.1.42 (revid:joro@sun.com-20091202080033-mndu4sxwx19lz2zs) (version source revid:satya.bn@sun.com-20091201121857-xw3tu44qm3hiiowr) (merge vers: 5.1.42) (pib:13)
[3 Dec 2009 19:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/92763

3243 Gleb Shchepa	2009-12-03
      Bug #38883: thd_security_context is not thread safe, crashes?
      
      After-push minor code cleanup for WL 2360: unnecessary external
      reference to LOCK_thread_count has been removed from ha_innodb.cc.
[16 Dec 2009 2:08] Paul Dubois
Noted in 5.0.89 changelog.

Already fixed in 5.1.x.
[16 Dec 2009 8:36] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091216083311-xorsasf5kopjxshf) (version source revid:alik@sun.com-20091214191830-wznm8245ku8xo702) (merge vers: 6.0.14-alpha) (pib:14)
[16 Dec 2009 8:43] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091216082430-s0gtzibcgkv4pqul) (version source revid:gshchepa@mysql.com-20091204102018-a61t8ttaj9xe94wp) (merge vers: 5.5.0-beta) (pib:14)
[16 Dec 2009 8:49] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091216083231-rp8ecpnvkkbhtb27) (version source revid:alik@sun.com-20091212203859-fx4rx5uab47wwuzd) (merge vers: 5.6.0-beta) (pib:14)
[16 Dec 2009 15:43] Paul Dubois
Noted in 5.5.1 changelog.

Already fixed in 6.0.x.
[16 Dec 2009 21:06] Paul Dubois
The 5.1.42 fix was an *additional* fix. Changelog entry:

Use of InnoDB monitoring (SHOW ENGINE INNODB STATUS or one of the 
InnoDB Monitor tables) could cause a server crash due to invalid
access to a shared variable in a concurrent environment. This is a
further fix for a regression introduced in MySQL 5.1.38 to the
original fix in MySQL 5.1.31.
[19 Dec 2009 8:28] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091219082307-f3i4fn0tm8trb3c0) (version source revid:alik@sun.com-20091216180721-eoa754i79j4ssd3m) (merge vers: 6.0.14-alpha) (pib:15)
[19 Dec 2009 8:31] Bugs System
Pushed into 5.5.1-m2 (revid:alik@sun.com-20091219082021-f34nq4jytwamozz0) (version source revid:alexey.kopytov@sun.com-20091211164058-ycpe0f20d1c4h1gl) (merge vers: 5.5.0-beta) (pib:15)
[19 Dec 2009 8:35] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091219082213-nhjjgmphote4ntxj) (version source revid:alik@sun.com-20091216180221-a5ps59gajad3pip9) (pib:15)
[15 Jan 2010 8:59] Bugs System
Pushed into 5.1.43 (revid:joro@sun.com-20100115085139-qkh0i0fpohd9u9p5) (version source revid:gshchepa@mysql.com-20091203193809-fpekyz16qbfrs3j1) (merge vers: 5.1.42) (pib:16)
[17 Feb 2010 16:18] Paul Dubois
Noted in 5.0.87sp1 changelog.
[20 Feb 2010 17:10] Bugs System
Pushed into 5.0.91 (revid:build@mysql.com-20100220170835-5kr6ztsg25va7qzz) (version source revid:build@mysql.com-20100220170835-5kr6ztsg25va7qzz) (merge vers: 5.0.91) (pib:16)
[1 Mar 2010 8:43] Bugs System
Pushed into 5.1.45 (revid:joro@sun.com-20100301083827-xnimmrjg6bh33o1o) (version source revid:joro@sun.com-20100226131646-kpvzk740hxbtaexn) (merge vers: 5.1.45) (pib:16)
[2 Mar 2010 14:35] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100302142746-u1gxdf5yk2bjrq3e) (version source revid:alik@sun.com-20100301095421-4cz64ibem1h2quve) (merge vers: 6.0.14-alpha) (pib:16)
[2 Mar 2010 14:40] Bugs System
Pushed into 5.5.3-m2 (revid:alik@sun.com-20100302072233-t3uqgjzdukt1pyhe) (version source revid:alik@sun.com-20100301090215-63o2w2y16go8n53p) (merge vers: 5.5.3-m2) (pib:16)
[2 Mar 2010 14:45] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100302072432-k8xvfkgcggkwgi94) (version source revid:alik@sun.com-20100301094536-2zc4uqyy3os8san7) (pib:16)
[12 Mar 2010 14:20] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:34] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:50] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[5 May 2010 15:05] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 15:59] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[28 May 2010 6:06] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:35] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:02] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 19:21] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[15 Jun 2010 8:09] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:25] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[17 Jun 2010 12:11] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:58] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:39] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[22 Dec 2010 7:42] Meiji KIMURA
Duplicated Bug#53092.
[8 Jun 2011 11:14] Shane Bester
the fix for this introduced bug #60682 in 5.1.42.