Bug #93165 Memory leak in sync_latch_meta_init() after mysqld shutdown detected by ASan
Submitted: 12 Nov 2018 15:11
Modified: 26 Nov 2018 21:02
Reporter: Yura Sorokin (OCA)
Status: Verified
Category: MySQL Server: InnoDB storage engine
Severity: S3 (Non-critical)
Version: 5.7.23, 5.7.24
OS: Any
Assigned to:
CPU Architecture: Any

[12 Nov 2018 15:11] Yura Sorokin
Description:
The following memory leak is detected by ASan during mysqld shutdown:

Direct leak of 144 byte(s) in 1 object(s) allocated from:
    #0 0x7ff095aa7b50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x5585f6447cb8 in ut_allocator<unsigned char>::allocate(unsigned long, unsigned char const*, char const*, bool, bool) (/home/yura/addon/mysql-build-5.7-asan_scope/sql/mysqld-debug+0x29d8cb8)
    #2 0x5585f688b693 in sync_latch_meta_init /mnt/hgfs/repos/mysql-server/storage/innobase/sync/sync0debug.cc:1510
    #3 0x5585f68a5102 in sync_check_init() /mnt/hgfs/repos/mysql-server/storage/innobase/sync/sync0debug.cc:1800
    #4 0x5585f6824ea0 in srv_general_init() /mnt/hgfs/repos/mysql-server/storage/innobase/srv/srv0srv.cc:1085
    #5 0x5585f6827b39 in srv_boot() /mnt/hgfs/repos/mysql-server/storage/innobase/srv/srv0srv.cc:1126
    #6 0x5585f68594d2 in innobase_start_or_create_for_mysql() /mnt/hgfs/repos/mysql-server/storage/innobase/srv/srv0start.cc:1715
    #7 0x5585f6409fb7 in innobase_init /mnt/hgfs/repos/mysql-server/storage/innobase/handler/ha_innodb.cc:4056
    #8 0x5585f47ae36f in ha_initialize_handlerton(st_plugin_int*) /mnt/hgfs/repos/mysql-server/sql/handler.cc:840
    #9 0x5585f5b20c50 in plugin_initialize /mnt/hgfs/repos/mysql-server/sql/sql_plugin.cc:1225
    #10 0x5585f5b25eee in plugin_register_builtin_and_init_core_se(int*, char**) /mnt/hgfs/repos/mysql-server/sql/sql_plugin.cc:1588
    #11 0x5585f464ea67 in init_server_components /mnt/hgfs/repos/mysql-server/sql/mysqld.cc:4074
    #12 0x5585f46520cb in mysqld_main(int, char**) /mnt/hgfs/repos/mysql-server/sql/mysqld.cc:4768
    #13 0x5585f46390c2 in main /mnt/hgfs/repos/mysql-server/sql/main.cc:25
    #14 0x7ff093968b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

How to repeat:
Build MySQL Server with Address Sanitizer enabled on Ubuntu 18.04 with default GCC (7.3.0)

cmake ... -DWITH_ASAN=ON -DWITH_ASAN_SCOPE=ON

Run
./mtr --debug-server innodb.log_file_name

Output
mysqltest: At line 39: command "$mysqld" failed with wrong error: 42

Then look at the contents of 'tmp/my_restart.err'.

Suggested fix:
'sync_latch_meta_destroy()' should also be called in some error return paths.
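
A minimal sketch of that idea, not the actual InnoDB code: a scope guard that runs a cleanup function on every return path, including early error returns. The names 'fake_latch_meta_init()', 'fake_latch_meta_destroy()', 'start_engine()' and 'shutdown_engine()' are hypothetical stand-ins for the real init/destroy pair and the startup/shutdown code. A guard like this only helps on paths that return; a direct call to 'exit()' does not unwind the stack, so such paths would still need an explicit cleanup call.

```cpp
#include <utility>

// Hypothetical stand-ins for sync_latch_meta_init()/sync_latch_meta_destroy().
// The counter models "memory currently allocated for latch metadata".
static int g_live_latch_meta = 0;
static void fake_latch_meta_init() { ++g_live_latch_meta; }
static void fake_latch_meta_destroy() { --g_live_latch_meta; }

// Runs the stored callable on scope exit unless dismiss() was called.
template <typename F>
class ScopeGuard {
 public:
  explicit ScopeGuard(F f) : f_(std::move(f)), active_(true) {}
  ~ScopeGuard() {
    if (active_) f_();
  }
  void dismiss() { active_ = false; }
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;

 private:
  F f_;
  bool active_;
};

// Hypothetical startup routine; inject_error simulates a failed startup step.
static bool start_engine(bool inject_error) {
  fake_latch_meta_init();
  ScopeGuard<void (*)()> cleanup(fake_latch_meta_destroy);
  if (inject_error) {
    return false;  // early error return: the guard frees the metadata
  }
  cleanup.dismiss();  // success: ownership passes to shutdown_engine()
  return true;
}

// Hypothetical regular shutdown path.
static void shutdown_engine() { fake_latch_meta_destroy(); }
```

With this shape, a failed 'start_engine(true)' leaves nothing allocated, while a successful start keeps the metadata alive until 'shutdown_engine()' runs.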
[12 Nov 2018 15:13] Yura Sorokin
Observed in MySQL Server 5.7.23. 8.0 is most probably affected as well.
[13 Nov 2018 9:12] MySQL Verification Team
Hello Yura,

Thank you for the report and contribution.
I'm seeing this issue on 5.7.23 but no longer on 5.7.24. Could you please confirm whether you are seeing it on 5.7.24? If so, could you please provide the exact cmake options used for the build? I will attach the build and test results from my environment shortly for your reference.

regards,
Umesh
[13 Nov 2018 9:12] MySQL Verification Team
test results

Attachment: 93164_93165.results (application/octet-stream, text), 10.71 KiB.

[13 Nov 2018 14:59] MySQL Verification Team
Thank you for the feedback!

regards,
Umesh
[26 Nov 2018 15:53] Tor Didriksen
Posted by developer:
 
The test explicitly injects internal fatal InnoDB errors, which make InnoDB terminate the server (with e.g. exit(3)).
There is no point in trying to do a clean shutdown in such cases.
[26 Nov 2018 21:02] Yura Sorokin
Sorry, Tor, I disagree here.

The point here is to still be able to detect memory leaks other than those you call "expected".

In commit https://github.com/mysql/mysql-server/commit/e93e8db42d89154b37f63772ce68c1efda637609
you made 14 MTR test cases ignore ALL memory problems detected by ASan, not only those which you consider 'OK' when you terminate the process with a call to 'exit()'. In other words, new memory leaks introduced in FUTURE commits may go undetected because of those changes. Address Sanitizer is a very powerful tool, and its coverage should keep expanding rather than shrinking.

Therefore, I still believe that it's better to fix these errors with proper resource cleanup (calling 'sync_latch_meta_destroy()') even in error paths that terminate the process.
However, if this fix introduces a significant delay in process exit, the cleanup can be limited to ASan builds:

#if defined(__SANITIZE_ADDRESS__)
sync_latch_meta_destroy();
#endif
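
A slightly more portable sketch of the same guard, as an illustration only: GCC defines '__SANITIZE_ADDRESS__' under '-fsanitize=address', while Clang documents '__has_feature(address_sanitizer)' for the same check. 'BUILT_WITH_ASAN', 'demo_latch_meta_init()', 'demo_latch_meta_destroy()' and 'prepare_fatal_exit()' are hypothetical names introduced here, not part of the MySQL source.

```cpp
// Detect AddressSanitizer at compile time.
// GCC defines __SANITIZE_ADDRESS__; Clang documents __has_feature(address_sanitizer).
#if defined(__SANITIZE_ADDRESS__)
#define BUILT_WITH_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define BUILT_WITH_ASAN 1
#endif
#endif
#ifndef BUILT_WITH_ASAN
#define BUILT_WITH_ASAN 0
#endif

// Hypothetical stand-ins for sync_latch_meta_init()/sync_latch_meta_destroy().
static int g_demo_latch_meta_live = 0;
static void demo_latch_meta_init() { ++g_demo_latch_meta_live; }
static void demo_latch_meta_destroy() { --g_demo_latch_meta_live; }

// Hypothetical hook to call right before exit() on a fatal-error path.
// Under ASan it releases the metadata so the leak checker stays quiet;
// in regular builds it does nothing, so process exit is not slowed down.
static void prepare_fatal_exit() {
#if BUILT_WITH_ASAN
  demo_latch_meta_destroy();
#endif
}
```

The trade-off is that non-ASan builds keep the fast exit, while ASan builds report only leaks that are not covered by this explicit cleanup.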