Bug #69653 Use of uninitialized pthread_getspecific() key in debug builds
Submitted: 3 Jul 2013 5:34 Modified: 1 Oct 2013 15:41
Reporter: Alexey Kopytov Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Logging Severity:S3 (Non-critical)
Version:5.6 OS:Any
Assigned to: Marc ALFF CPU Architecture:Any

[3 Jul 2013 5:34] Alexey Kopytov
Description:
This is similar to http://bugs.mysql.com/bug.php?id=65946. The problem
in that bug was static initialization code in debug builds calling dbug
functions before the dbug subsystem is initialized. 

More specifically, most _db_* functions require a thread-specific value
to be initialized before they are called. That value is only initialized
in my_thread_global_init() called from main(), i.e. after static
initializations take place. So if a constructor for a static object
calls dbug functions, my_thread_var_dbug() will use uninitialized
THR_KEY_mysys to access a thread-specific value. In most cases that
returns NULL, so everything works. However, under some unclear
circumstances pthread_getspecific() with an uninitialized key returns a
non-NULL value and that leads to a crash on startup.

Now the way http://bugs.mysql.com/bug.php?id=65946 has been fixed is
that instead of applying the contributed patch (which IMO may need some
improvements, but is correct in general), the developer chose to avoid
static initializations calling dbug functions in that specific
case. Which means it's only a matter of time before someone else
introduces code which does the same and resurrects the problem. This is
what happened in the following revision:

---
revno: 4350
committer: Dmitry Lenev <Dmitry.Lenev@oracle.com>
branch nick: mysql-5.6-14569140
timestamp: Thu 2012-09-27 18:54:18 +0400
message:
  Fix for bug #14569140 "MYSQL 5.6.6-M9 HAS MORE MUTEX CONTENTION
  THAN MYSQL 5.1".
  
  MySQL 5.6 has shown somewhat worse performance than MySQL 5.1
  for SysBench read-only/InnoDB workload against several tables
  for low number of connections.
  
  This issue was caused by the contention in MDL subsystem
  introduced in 5.5. The particular source of contention was
  MDL_map::m_mutex which protects hash containing all MDL_lock
  objects in the system.
  
  This patch addresses the problem by partitioning this hash
  and corresponding mutex by key for MDL_lock objects, thus
  allowing connections accessing different tables/objects to
  work with different partitions and therefore avoid contention.
  
  New start-up parameter --metadata-locks-hash-instances is
  introduced which allows to specify number of such partitions
  (with 8 partitions as default value).
---

Which added m_partitions to the MDL_map class. Which is initialized in
the static context, because there's a static mdl_locks object of that
class. m_partitions is of the Dynamic_array<MDL_map_partition *> type,
which eventually calls init_dynamic_array2() and thus, _db_enter_() in
its constructor. That's where we crash if pthread_getspecific() returns
a non-NULL value for an uninitialized key.

How to repeat:
It's hard to repeat that, because as I wrote the circumstances to make
pthread_getspecific() return a non-NULL value on uninitialized key are
unclear. I currently observe a 100% reproducible crash in debug
XtraBackup builds after a introducing a completely unrelated change.

It's best to verify it by code analysis or with a debugger. This is on
5.6.12:

$ gdb --args bin/mysqld --gdb --basedir=.

(gdb) br my_thread_var_dbug
Breakpoint 1 at 0x1005a86b4: file my_thr_init.c, line 466.
(gdb) r

Breakpoint 1, my_thread_var_dbug () at my_thr_init.c:466
466	    my_pthread_getspecific(struct st_my_thread_var*,THR_KEY_mysys);

# So we are going to use THR_KEY_mysys, but it's not initialized yet:

(gdb) p my_thread_global_init_done 
$1 = 0 '\0'

Suggested fix:
Suggested fix:

Either merge the contributed patch for
http://bugs.mysql.com/bug.php?id=65946 or make sure no dbug functions
are called in static initialization.
[7 Jul 2013 11:44] Marc ALFF
This bug has been fixed in 5.7 already, see in particular:

http://bazaar.launchpad.net/~mysql/mysql-server/5.7/view/head:/mysys/my_thr_init.c#L471

extern void **my_thread_var_dbug()
{
  struct st_my_thread_var *tmp;
  /*
    Instead of enforcing DBUG_ASSERT(THR_KEY_mysys_initialized) here,
    which causes any DBUG_ENTER and related traces to fail when
    used in init / cleanup code, we are more tolerant:
    using DBUG_ENTER / DBUG_PRINT / DBUG_RETURN
    when the dbug instrumentation is not in place will do nothing.
  */
  if (! THR_KEY_mysys_initialized)
    return NULL;
  tmp= _my_thread_var();
  return tmp && tmp->init ? &tmp->dbug : 0;
}

The revision that fixed that is:

http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/5107.1.280

    Committer: Marc Alff
    Date: 2013-01-16 16:38:07 UTC
    mto: (4526.1.81 mysql-trunk-wl6469) (4423.1.10 wl6455-0218) (4764.1.131 mysql-trunk-wl6560) (5107.21.1 trunk6)
    mto: This revision was merged to the branch mainline in revision 5112.
    Revision ID: marc.alff@oracle.com-20130116163807-lbe97qhzvv62qkcb

Bug#16175473 TUNING, USE STATIC CALLS FOR THE PERFORMANCE SCHEMA INSTRUMENTATION

This change is a performance tuning improvement.

Before this fix, a line of code instrumented as:
  PSI_MUTEX_CALL(start_mutex_wait)(...);
would always be compiled using a indirect call, using a function pointer:
  PSI_server->start_mutex_wait(...);

This is expected when building code for dynamic plugins,
but inefficient when building code inside the server itself.

With this fix, the same line of code is now compiled as either
  PSI_server->start_mutex_wait(...);
or
  pfs_start_mutex_wait_v1(...);
depending of whether a dynamic or static call is needed.

The net result is that instrumentation calls for the server code itself
can now be compiled staticly.

The server code has been adjusted to use static calls for the instrumentation
interface, which affected the server initialization code.
As part of this change, helper functions are defined to wrap access to thread
local storage, so that detecting invalid TLS usage is easier.

---

The fix is not present in 5.6.
[1 Oct 2013 15:41] Paul DuBois
Noted in 5.6.15, 5.7.3 changelogs.

In debug builds, static initialization code could call DBUG functions
before the DBUG subsystem was initialized.
[4 Dec 2013 11:23] Laurynas Biveinis
5.6$ bzr log -r 5411 -n0
------------------------------------------------------------
revno: 5411 [merge]
committer: Marc Alff <marc.alff@oracle.com>
branch nick: mysql-5.6-push
timestamp: Wed 2013-08-28 11:02:34 +0200
message:
  Merge to mysql-5.6 (backport)
    ------------------------------------------------------------
    revno: 5285.1.2 [merge]
    committer: Marc Alff <marc.alff@oracle.com>
    branch nick: mysql-5.6-bug17063675
    timestamp: Mon 2013-08-26 11:55:13 +0200
    message:
      Local merge
    ------------------------------------------------------------
    revno: 5285.1.1
    committer: Marc Alff <marc.alff@oracle.com>
    branch nick: mysql-5.6-bug17063675
    timestamp: Sun 2013-07-07 16:16:00 +0200
    message:
      Bug#17063675 USE OF UNINITIALIZED PTHREAD_GETSPECIFIC() KEY IN DEBUG BUILDS
      
      Before this fix, code that uses DBUG_ENTER() and related macros,
      when executed very early in the server initialization, typically
      when using C++ constructors in static code, could end up executing
      my_pthread_getspecific() with a non initialized pthread_key.
      
      The behavior in this case is undefined, and can potentially cause a crash.
      
      This fix is a backport from MySQL 5.7,
      which uses a THR_KEY_mysys_initialized variable to keep track of 
      the initialization state.
      
      With this fix, calling DBUG_ENTER() and related macro is more robust,
      and does nothing when the debug component is not available.