Bug #76351 gen_lex_hash fails with mem.leak errors if compiled with clang-3.5 + ASAN
Submitted: 17 Mar 2015 14:11 Modified: 14 May 2015 15:21
Reporter: Gleb Shchepa
Status: Closed
Category: MySQL Server: Parser Severity: S3 (Non-critical)
Version: 5.5+ OS: Any
Assigned to: CPU Architecture: Any

[17 Mar 2015 14:11] Gleb Shchepa
Description:
gen_lex_hash is a compile-time utility that generates keyword hashes for the MySQL parser (specifically, for its lexical scanner).
It has never freed the memory it allocates: it runs only once at compile time, and the OS reclaims the whole process heap on exit quickly and without any issues.

However, since 5.5 we support builds with the AddressSanitizer (ASAN) option.
Also, since the clang compiler emulates GCC's interfaces, MySQL can be built with clang.
AddressSanitizer (ASAN) in clang-3.6+ has experimental support for a memory leak detector (LeakSanitizer).
That leak detector aborts execution of the gen_lex_hash utility and outputs errors (for mysql-trunk):

=================================================================                
==19872==ERROR: LeakSanitizer: detected memory leaks                             
                                                                                 
Direct leak of 10776 byte(s) in 21 object(s) allocated from:                     
    #0 0x4beba5 in realloc (sql/gen_lex_hash+0x4beba5)                           
    #1 0x4e0ea1 in insert_into_hash(hash_lex_struct*, char const*, int, int) sql/gen_lex_hash.cc:195:30
    #2 0x4e1b5e in hash_map_info::insert_symbols(int) sql/gen_lex_hash.cc:218:5  
    #3 0x4e45ef in hash_map_info::print_hash_map(char const*, unsigned int) sql/gen_lex_hash.cc:278:3
    #4 0x4e5f47 in main sql/gen_lex_hash.cc:345:7                                
    #5 0x7f2179264ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
                                                                                 
Direct leak of 10776 byte(s) in 21 object(s) allocated from:                     
    #0 0x4beba5 in realloc (sql/gen_lex_hash+0x4beba5)                           
    #1 0x4e0ea1 in insert_into_hash(hash_lex_struct*, char const*, int, int) sql/gen_lex_hash.cc:195:30
    #2 0x4e1b5e in hash_map_info::insert_symbols(int) sql/gen_lex_hash.cc:218:5  
    #3 0x4e45ef in hash_map_info::print_hash_map(char const*, unsigned int) sql/gen_lex_hash.cc:278:3
    #4 0x4e6070 in main sql/gen_lex_hash.cc:351:7                                
    #5 0x7f2179264ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
                                                                                 
Direct leak of 288 byte(s) in 1 object(s) allocated from:                        
    #0 0x4beba5 in realloc (sql/gen_lex_hash+0x4beba5)                           
    #1 0x4e0ea1 in insert_into_hash(hash_lex_struct*, char const*, int, int) sql/gen_lex_hash.cc:195:30
    #2 0x4e1b5e in hash_map_info::insert_symbols(int) sql/gen_lex_hash.cc:218:5  
    #3 0x4e45ef in hash_map_info::print_hash_map(char const*, unsigned int) sql/gen_lex_hash.cc:278:3
    #4 0x4e6199 in main sql/gen_lex_hash.cc:358:7                                
    #5 0x7f2179264ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
                                                                                 
Direct leak of 24 byte(s) in 1 object(s) allocated from:                         
    #0 0x4be862 in malloc (sql/gen_lex_hash+0x4be862)                            
    #1 0x4dfe5c in insert_into_hash(hash_lex_struct*, char const*, int, int) sql/gen_lex_hash.cc:167:30
    #2 0x4e1b5e in hash_map_info::insert_symbols(int) sql/gen_lex_hash.cc:218:5  
    #3 0x4e45ef in hash_map_info::print_hash_map(char const*, unsigned int) sql/gen_lex_hash.cc:278:3
    #4 0x4e6199 in main sql/gen_lex_hash.cc:358:7                                
    #5 0x7f2179264ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
 
etc.

How to repeat:
Install clang-3.6, then run:
export CC=clang-3.6
export CXX=clang++-3.6
export LD=$CXX
export ASAN_SYMBOLIZER_PATH=$(which llvm-symbolizer-3.6)
cmake . -DWITH_DEBUG=1 -DWITH_ASAN=1
make

Suggested fix:
For mysql-trunk:

diff --git a/sql/gen_lex_hash.cc b/sql/gen_lex_hash.cc
index aade577..c1f52df 100644
--- a/sql/gen_lex_hash.cc
+++ b/sql/gen_lex_hash.cc
@@ -101,6 +101,16 @@ struct hash_lex_struct
     int iresult;
   };
   int ithis;
+
+  void destroy()
+  {
+    if (first_char <= 0)
+      return;
+    for (int i= 0, size= static_cast<uchar>(last_char) - first_char + 1;
+         i < size; i++)
+      char_tails[i].destroy();
+    free(char_tails);
+  }
 };
 
 
@@ -119,6 +129,8 @@ public:
 
   ~hash_map_info()
   {
+    for (int i= 0; i < max_len; i++)
+      root_by_len[i].destroy();
     free(root_by_len);
     free(hash_map);
   }
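
The idea behind destroy() in the patch above is a post-order walk: each inner node releases its children's arrays before freeing its own char_tails array. Below is a minimal standalone sketch of that pattern, not the actual gen_lex_hash code. Node is a simplified stand-in for hash_lex_struct, the meaning of first_char (0 for empty, -1 for a leaf, positive for an inner node) is an assumption read off the patch's `first_char <= 0` check, and make_inner() and build_and_destroy_demo() are hypothetical helpers that exist only for this illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Simplified stand-in for hash_lex_struct. Assumed convention:
//   first_char == 0  : empty node
//   first_char == -1 : leaf, the union holds iresult
//   first_char  > 0  : inner node, char_tails covers [first_char, last_char]
struct Node {
  int first_char;
  unsigned char last_char;
  union {
    Node *char_tails;
    int iresult;
  };

  // Post-order teardown: free the children's arrays before our own.
  // Returns the number of arrays released (for illustration only).
  int destroy() {
    if (first_char <= 0)
      return 0;                       // empty or leaf: nothing on the heap
    int freed = 0;
    const int size = last_char - first_char + 1;
    for (int i = 0; i < size; i++)
      freed += char_tails[i].destroy();
    free(char_tails);
    char_tails = nullptr;
    first_char = 0;                   // mark empty so a second call is a no-op
    return freed + 1;
  }
};

// Hypothetical helper: turn `n` into an inner node covering [lo, hi].
// Children start zeroed, i.e. empty. Returns false if allocation fails.
bool make_inner(Node &n, unsigned char lo, unsigned char hi) {
  assert(lo > 0 && lo <= hi);
  const int size = hi - lo + 1;
  Node *tails = static_cast<Node *>(calloc(size, sizeof(Node)));
  if (!tails)
    return false;
  n.first_char = lo;
  n.last_char = hi;
  n.char_tails = tails;
  return true;
}

// Builds root -> ['a'..'c'] with 'a' as a leaf and 'b' -> ['x'..'y'],
// then tears the whole tree down. Returns the number of arrays freed,
// or -1 if an allocation failed.
int build_and_destroy_demo() {
  Node root = {};
  if (!make_inner(root, 'a', 'c'))
    return -1;
  if (!make_inner(root.char_tails[1], 'x', 'y')) {
    root.destroy();
    return -1;
  }
  root.char_tails[0].first_char = -1; // leaf: owns no heap memory
  root.char_tails[0].iresult = 42;
  return root.destroy();
}
```

Unlike the patch, this sketch resets the node to the empty state after freeing, so calling destroy() twice is harmless. That is a design choice of the sketch only; in the patch the destructor runs once, so the reset is not needed there.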
[14 May 2015 15:21] Paul DuBois
Noted in 5.7.8, 5.8.0 changelogs.

Compiling using Clang with AddressSanitizer (ASAN) enabled caused the
gen_lex_hash utility to abort with memory leak check failures.
[27 Aug 2015 4:21] Erlend Dahl
Bug#74540 Memory leaks in gen_lex_hash.cc

was marked as a duplicate.