Bug #21554 sp_cache.cc: violates C++ aliasing rules
Submitted: 9 Aug 2006 21:52 Modified: 4 Jun 2007 18:17
Reporter: Christian Hammers (Silver Quality Contributor) (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Compiling Severity:S3 (Non-critical)
Version:5.0.24 OS:Linux (Debian GNU/Linux Sid)
Assigned to: Marc ALFF CPU Architecture:Any
Tags: rt_q1_2007

[9 Aug 2006 21:52] Christian Hammers
Description:
As reported on http://bugs.debian.org/382253 by Martin Michlmayr <tbm@cyrius.com>
---------------------------------------------------------------------------------

Compiling the archive with a pre-release of GCC 4.2, I found a
compiler bug while building mysql-dfsg-5.0.  I forwarded a testcase
based on the mysql-dfsg-5.0 code (sp_cache.cc) and while there
definitely is a compiler bug that needs to be fixed, the GCC people
also said that the mysql code is invalid, that is, that it violates
the C++ aliasing rules.

See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28544 in which a GCC
developer says, "I can trivially fix this [GCC bug], but the code
isn't going to do what you want when i'm done, since it is an aliasing
violation :)"

How to repeat:
Build with GCC 4.2

Suggested fix:
don't know
[26 Aug 2006 11:53] Valeriy Kravchuk
Thank you for a problem report. Can you, please, send a testcase and point out where the violation is?
[26 Aug 2006 12:31] Christian Hammers
Hi

Regarding the test case, just take a look at the referenced GCC bug report, the file seems so small that you can probably use it as test case without modification. It it compiles with GCC-4.2 the bug is solved :)

Regarding the nature of the bug or how to fix it, I don't know C++ good enough.
Maybe ask <pinskia@gcc.gnu.org> who claimed it to be a C++ standards violation.
From the function and file names I guess its in sql/sp_cache.cc's sp_cache_invalidate.

bye,

-christian-
[22 Sep 2006 12:12] Valeriy Kravchuk
In 5.0.26-BK in sp_cache.cc we have:

void sp_cache_invalidate()
{
  DBUG_PRINT("info",("sp_cache: invalidating"));
  thread_safe_increment(Cversion, &Cversion_lock);
}

thread_safe_increment is defined in my_pthread.h as follows:

#define thread_safe_increment(V,L) atomic_inc((atomic_t*) &V)

Offending atomic_t and atomic_inc are defined in /usr/include/asm/atomic.h. So, it is not a bug in MySQL's code any more, in any case.
[6 Oct 2006 13:06] Sergei Golubchik
MySQL shouldn't use kernel include files if it breaks the compilation.
(it shouldn't use them at all, but that's another story, and another fix)

We have a test in configure whether using atomic.h is safe. Apparently it is not robust enough.
[6 Oct 2006 13:18] Valeriy Kravchuk
So, we have a verified bug here, in configure checks, when GCC 4.2 is used.
[19 Dec 2006 5:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/17141

ChangeSet@1.2320, 2006-12-18 22:52:49-07:00, malff@weblab.(none) +5 -0
  Bug#21554 (sp_cache.cc: violates C++ aliasing rules)
  
  This is a compile bug, reported by the development GCC team with GCC 4.2
  
  The issue is that on linux platforms, for which HAVE_ATOMIC_ADD is defined,
  casting a pointer from an 'unsigned long*' to 'atomic_t*' violates
  assumptions made by GCC when optimizing code, which are that pointers to
  different types do not point to the same memory.
  In case of atomic_t, which is found in linux platforms under the following
  header files (for example, these are the headers on my machine):
  - /usr/include/asm-i386/atomic.h
  - /usr/include/asm-x86_64/atomic.h
  the comments around the definition of atomic_t are very specific:
  "
   atomic_t should be 32 bit signed type
  [...]
   Make sure gcc doesn't try to be clever and move things around
   on us. We need to use _exactly_ the address the user gave us,
   not some alias that contains the same information.
  "
   typedef struct { volatile int counter; } atomic_t;
  
  For sp_cache.cc in MySQL, the code performs the following cast when using
  the variable Cversion:
  - static ulong volatile CVersion --> atomic_t
  
  While it is unknown if this cast really can cause a bug or not, assuming
  that "ulong volatile" and atomic_t, which are maintained in different
  spaces (MySQL and the Linux per architecture asm header files), really have:
  - the same storage qualifier (they don't : volatile, or volatile member),
  - the same sign (they don't),
  - the same size,
  - the same alignment,
  is poor and can be improved.
  
  With this fix, the code has been changed as follows:
  
  A new opaque type, thread_safe_counter_t, has been defined for use with
  the existing thread_safe_xxx helpers.
  When the thread_safe_xxx helpers are implemented using atomic_t (Linux),
  the cast to "atomic_t*" has been removed, which forces the caller to
  actually provide a thread_safe_counter_t variable, or fail to compile.
  This is a step towards:
  - removing the cast, which resolves the "C++ aliasing rule violation",
  - making the code more strongly typed, and easier to maintain.
  
  For example, the following invalid uses will fail to build, where the same
  constructs would previously be compiled and produce bugs at runtime:
    thread_safe_counter_t my_cnt;
    my_cnt = 0; // build break
    my_cnt ++; // build break, previously would ignore thread safety
    foo = my_cnt; // build break
  Instead, the caller is forced to use the thread safe APIs:
    thread_safe_counter_t my_cnt;
    thread_safe_counter_set(my_cnt, 0);
    thread_safe_increment(my_cnt); // safe, uses atomic_add
    foo = thread_safe_counter_read(my_cnt);
  
  Atomic functions were also used previously when building MySQL with
  #define SAFE_STATISTICS, for statistic_increment counters.
  
  It is unclear whether SAFE_STATISTICS is truly supported or not,
  since this compile flag is undocumented, and is not set by the autoconf
  configure script.
  Looking at the implementation of fill_status() in sql_show.cc however,
  it appears that the intent of SAFE_STATISTICS is to prevent threads to
  modify statistics counters located in thd->status_var while the function
  calc_sum_of_all_status() aggregates counters from all threads.
  This is achieved by holding the LOCK_status during the aggregation.
  
  The previous implementation of statistic_increment, for platforms providing
  atomic operations (HAVE_ATOMIC_ADD) was however not honoring the mutex,
  making calls to statistic_increment(thd->status_var.xxx, &LOCK_status)
  *unsafe* even in SAFE_STATISTICS builds.
  
  With this change, statistic_increment always honor the mutex.
  This makes statistics safe in SAFE_STATISTICS builds, and also prevents
  the need to introduce an opaque type similar to thread_safe_counter_t
  for every non critical counter.
  
  Building with SAFE_STATISTICS defined has been tested, and the test suite
  passes with this flag (on a platform with #define HAVE_ATOMIC_ADD).
  
  The following counters:
  - delayed_insert_writes,
  - delayed_rows_in_use,
  - delayed_insert_errors
  were previously implemented using thread_safe_increment() instead of
  statistic_increment(), and as a result are now using thread_safe_counter_t
  to comply with the original behavior (the counters are always safe).
[22 Feb 2007 23:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/20430

ChangeSet@1.2419, 2007-02-22 16:13:45-07:00, malff@weblab.(none) +18 -0
  Bug#21554 (sp_cache.cc: violates C++ aliasing rules)
  
  This is a compile bug, reported by the development GCC team with GCC 4.2
  
  The issue is that on linux platforms, for which HAVE_ATOMIC_ADD is defined,
  casting a pointer from an 'unsigned long*' to 'atomic_t*' violates
  assumptions made by GCC when optimizing code, which are that pointers to
  different types do not point to the same memory.
  In case of atomic_t, which is found in linux platforms under the following
  header files (for example, these are the headers on my machine):
  - /usr/include/asm-i386/atomic.h
  - /usr/include/asm-x86_64/atomic.h
  the comments around the definition of atomic_t are very specific:
  "
   atomic_t should be 32 bit signed type
  [...]
   Make sure gcc doesn't try to be clever and move things around
   on us. We need to use _exactly_ the address the user gave us,
   not some alias that contains the same information.
  "
   typedef struct { volatile int counter; } atomic_t;
  
  In MySQL, usage of atomic_t derives from the macro thread_safe_increment.
  
  For sp_cache.cc in MySQL, the code performs the following cast when using
  the variable Cversion:
  - static ulong volatile CVersion --> atomic_t
  
  There are also other issues with using atomic_t from the linux kernel headers
  directly in user code:
  - user code (MySQL) should not include <asm/atomic.h>
  - the code depends on a i386 on CONFIG_SMP,
  which creates the possibility of build issues when a build machine and a
  production machine have mismatched CONFIG_SMP configurations.
  
  There are several issues related to thread_safe_increment that existed before
  this fix:
  
  1)
  The same locking mechanism should always be used when
  guarding a given data, mixing mutexes and atomic functions is not safe.
  
  In particular, the following counters were not consistently protected:
  - sql/ha_berkeley.cc, share->rows is not safe (should be protected by
  share->mutex)
  - mysys/my_fopen.c, my_stream_opened (THR_LOCK_open)
  - mysys/my_open.c, my_file_opened (THR_LOCK_open)
  
  Using sometime the mutex listed and sometime thread_safe_increment
  would actually cause the counter to not be safe on linux.
  
  2)
  Some counters, which are really statistical in nature, would use
  thread_safe_increment instead of statistic_increment,
  making the production builds (SAFE_STATISTICS not defined) slower.
  
  In particular, the following counters :
  - mysys/mf_tempfile.c, my_tmp_file_created
  - sql/sql_insert.c, delayed_rows_in_use
  - sql/sql_insert.c, delayed_insert_writes
  - sql/sql_insert.c, delayed_insert_errors
  would always use a lock.
  
  3)
  Every counter involving thd->status_var, was using locks (in SAFE_STATISTICS
  builds), even when not necessary, since the counter is owned by the running
  thread.
  
  With this fix, usage of thread_safe_increment has been revised.
  
  1) has been fixed by always use explicitly the mutex, and avoid relying
  on thread_safe_increment.
  
  2) has been fixed by replacing some calls to thread_safe_increment
  by calls to statistic_increment,
  
  3) has been fixed by using status_var_increment instead of
     statistic_increment.
  
  After fixing these 3 existing issues, re-evaluating the real usage pattern
  of atomic_t shows that, in a production build (SAFE_STATISTICS not defined),
  there is only one counter in the code that still uses atomic_t api,
  namely Cversion in sql/sp_cache.cc.
  
  Because of the build problem introduced with CONFIG_SMP, the
  thread_safe_increment api has been changed to not rely on atomic_t at all.
  Once atomic apis that are safe to use in user code are available, and safe
  to use without special build considerations (CONFIG_SMP), using atomic helpers
  can be implemented again, so this change is temporary only.
  
  Considering that the cleanup implemented actually avoid locking counters
  when it's not necessary, this change should actually improve performances,
  both in SAFE_STATISTIC builds (lots of locks are not needed) and non
  SAFE_STATISTICS builds (locking of insert delayed related counters and
  my_tmp_file_created not needed).
  
  No automated tests are provided, due to the nature of the problem.
[7 Mar 2007 20:56] Konstantin Osipov
Not setting to "Approved" as Serg set it back to In Progress.
Serg, do you agree with my code review?
[7 Mar 2007 20:56] Konstantin Osipov
Code review sent by email.
[21 Mar 2007 0:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/22422

ChangeSet@1.2472, 2007-03-20 18:21:53-06:00, malff@weblab.(none) +13 -0
  Bug#21554 (sp_cache.cc: violates C++ aliasing rules)
  
  The problem reported is a compile bug,
  reported by the development GCC team with GCC 4.2.
  
  The original issue can no longer be reproduced in MySQL 5.1,
  since the configure script no longer define HAVE_ATOMIC_ADD,
  which caused the Linux atomic functions to be used (and cause a problem
  with an invalid cast).
  
  This patch implements some code cleanup for 5.1 only, which was identified
  during the investigation of this issue.
  
  With this patch, statistics maintained in THD::status_var are by definition
  owned by the running thread, and do not need to be protected against race
  conditions. These statistics are maintained by the status_var_* helpers,
  which do not require any lock.
  
  At this point, implementing thread_safe_statistics using my_atomic_add apis
  can not be implemented, since 64 bits atomic operations are not implemented.
  64 bits counters are needed to support current ulong counters on 64 bits
  platforms.
[21 Mar 2007 21:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/22543

ChangeSet@1.2472, 2007-03-21 15:49:03-06:00, malff@weblab.(none) +13 -0
  Bug#21554 (sp_cache.cc: violates C++ aliasing rules)
  
  The problem reported is a compile bug,
  reported by the development GCC team with GCC 4.2.
  
  The original issue can no longer be reproduced in MySQL 5.1,
  since the configure script no longer define HAVE_ATOMIC_ADD,
  which caused the Linux atomic functions to be used (and cause a problem
  with an invalid cast).
  
  This patch implements some code cleanup for 5.1 only, which was identified
  during the investigation of this issue.
  
  With this patch, statistics maintained in THD::status_var are by definition
  owned by the running thread, and do not need to be protected against race
  conditions. These statistics are maintained by the status_var_* helpers,
  which do not require any lock.
[29 Mar 2007 14:47] Sergei Golubchik
ok to push, but please solve a technical problem of a merge with wl#2936 first.
[9 Apr 2007 22:00] Marc ALFF
Documenting a work around while waiting for this fix to be merged into the
main code base:

When building from the source code,
after the ./configure step,
manually edit the file config.h generated by the configure script,
and :

- comment the "#define HAVE_ATOMIC_ADD" line
- comment the "#define HAVE_ATOMIC_SUB" line (for consistency)

then proceed with the build (make clean; make).

Doing this is safe, the server will implement atomic operations using
a mutex, as is the case today for non-linux platforms.
[22 May 2007 20:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/27169

ChangeSet@1.2530, 2007-05-22 13:41:40-06:00, malff@weblab.(none) +8 -0
  Bug#21554 (sp_cache.cc: violates C++ aliasing rules)
  
  The problem reported is a compile bug,
  reported by the development GCC team with GCC 4.2.
  
  The original issue can no longer be reproduced in MySQL 5.1,
  since the configure script no longer define HAVE_ATOMIC_ADD,
  which caused the Linux atomic functions to be used (and cause a problem
  with an invalid cast).
  
  This patch implements some code cleanup for 5.1 only, which was identified
  during the investigation of this issue.
  
  With this patch, statistics maintained in THD::status_var are by definition
  owned by the running thread, and do not need to be protected against race
  conditions. These statistics are maintained by the status_var_* helpers,
  which do not require any lock.
[1 Jun 2007 19:25] Bugs System
Pushed into 5.1.20-beta
[4 Jun 2007 18:17] Paul DuBois
No changelog entry needed.