Bug #78995 Binlog_sender::read_event may access freed event buffer
Submitted: 28 Oct 2015 8:19 Modified: 23 May 2016 14:57
Reporter: Laurynas Biveinis (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7.9 OS:Any
Assigned to: CPU Architecture:Any
Tags: binlog dump, replication

[28 Oct 2015 8:19] Laurynas Biveinis
Description:
This shows up as an intermittent ASan error on rpl.rpl_checksum_cache:

==1930==ERROR: AddressSanitizer: heap-use-after-free on address 0x625000e24125 at pc 0x0001012d5f6a bp 0x700000cda190 sp 0x700000cda188
READ of size 1 at 0x625000e24125 thread T30
    #0 0x1012d5f69 in Binlog_sender::send_events(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:394
    #1 0x1012d47e2 in Binlog_sender::send_binlog(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:317
    #2 0x1012d3dd3 in Binlog_sender::run() rpl_binlog_sender.cc:225
    #3 0x1012cf886 in mysql_binlog_send(THD*, char*, unsigned long long, Gtid_set*, unsigned int) rpl_master.cc:415
    #4 0x1012cf472 in com_binlog_dump(THD*, char*, unsigned long) rpl_master.cc:342
    #5 0x100f6c18e in dispatch_command(THD*, COM_DATA const*, enum_server_command) sql_parse.cc:1501
    #6 0x100f6e725 in do_command(THD*) sql_parse.cc:852
    #7 0x1011a5dfb in handle_connection connection_handler_per_thread.cc:295
    #8 0x101aa5f70 in pfs_spawn_thread pfs.cc:2192
    #9 0x7fff996879b0 in _pthread_body (/usr/lib/system/libsystem_pthread.dylib+0x39b0)
    #10 0x7fff9968792d in _pthread_start (/usr/lib/system/libsystem_pthread.dylib+0x392d)
    #11 0x7fff99685384 in thread_start (/usr/lib/system/libsystem_pthread.dylib+0x1384)

0x625000e24125 is located 37 bytes inside of 8248-byte region [0x625000e24100,0x625000e26138)
freed by thread T30 here:
    #0 0x104ab2799 in wrap_free (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x43799)
    #1 0x101399699 in my_raw_free my_malloc.c:290
    #2 0x101399152 in my_realloc my_malloc.c:101
    #3 0x1011a00b1 in String::mem_realloc(unsigned long, bool) sql_string.cc:115
    #4 0x1011a212d in String::append(st_io_cache*, unsigned long) sql_string.cc:573
    #5 0x101222169 in Log_event::read_log_event(st_io_cache*, String*, st_mysql_mutex*, binary_log::enum_binlog_checksum_alg, char const*, bool*) log_event.cc:1183
    #6 0x1012d76fc in Binlog_sender::read_event(st_io_cache*, binary_log::enum_binlog_checksum_alg, unsigned char**, unsigned int*) rpl_binlog_sender.cc:1073
    #7 0x1012d5817 in Binlog_sender::send_events(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:390
    #8 0x1012d47e2 in Binlog_sender::send_binlog(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:317
    #9 0x1012d3dd3 in Binlog_sender::run() rpl_binlog_sender.cc:225
    #10 0x1012cf886 in mysql_binlog_send(THD*, char*, unsigned long long, Gtid_set*, unsigned int) rpl_master.cc:415
    #11 0x1012cf472 in com_binlog_dump(THD*, char*, unsigned long) rpl_master.cc:342
    #12 0x100f6c18e in dispatch_command(THD*, COM_DATA const*, enum_server_command) sql_parse.cc:1501
    #13 0x100f6e725 in do_command(THD*) sql_parse.cc:852
    #14 0x1011a5dfb in handle_connection connection_handler_per_thread.cc:295
    #15 0x101aa5f70 in pfs_spawn_thread pfs.cc:2192
    #16 0x7fff996879b0 in _pthread_body (/usr/lib/system/libsystem_pthread.dylib+0x39b0)
    #17 0x7fff9968792d in _pthread_start (/usr/lib/system/libsystem_pthread.dylib+0x392d)
    #18 0x7fff99685384 in thread_start (/usr/lib/system/libsystem_pthread.dylib+0x1384)

Looking at the code,

inline int Binlog_sender::read_event(IO_CACHE *log_cache, enum_binlog_checksum_alg checksum_alg,
                                     uchar **event_ptr, uint32 *event_len)
{
...
  *event_ptr= (uchar *)m_packet.ptr() + event_offset;
...
  /*
    packet is big enough to read the event, since we have reallocated based
    on the length stated in the event header.
  */
  if ((error= Log_event::read_log_event(log_cache, &m_packet, NULL, checksum_alg)))
    goto read_error;
...

event_ptr is saved but then m_packet can be reallocated, directly contradicting the comment above. I haven't analysed why this reallocation happens.

How to repeat:
cmake .. -DBUILD_CONFIG=mysql_release -DWITH_DEBUG=ON -DWITH_ASAN=ON -DWITH_BOOST=..
...
Bump --parallel --repeat if needed. So far I only saw it on 'row' variant of the testcase.
./mtr --debug-server rpl_checksum_cache --parallel=3 --repeat=50
...
rpl.rpl_checksum_cache 'row'             w2 [ fail ]
        Test ended at 2015-10-28 10:09:37

CURRENT_TEST: rpl.rpl_checksum_cache
safe_process[2425]: Child process: 2426, killed by signal: 9

Server [mysqld.1 - pid: 1926, winpid: 1926, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
2015-10-28T08:09:14.861583Z 127 [Note] Start binlog_dump to master_thread_id(127) slave_server(2), pos(, 4)
2015-10-28T08:09:15.369732Z 131 [Note] While initializing dump thread for slave with UUID <9e27cfc0-7d4a-11e5-9bb7-f824c9833581>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread.
2015-10-28T08:09:15.369939Z 131 [Note] Start binlog_dump to master_thread_id(131) slave_server(2), pos(master-bin.000002, 4)
=================================================================
==1930==ERROR: AddressSanitizer: heap-use-after-free on address 0x625000e24125 at pc 0x0001012d5f6a bp 0x700000cda190 sp 0x700000cda188
READ of size 1 at 0x625000e24125 thread T30
    #0 0x1012d5f69 in Binlog_sender::send_events(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:394
    #1 0x1012d47e2 in Binlog_sender::send_binlog(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:317
    #2 0x1012d3dd3 in Binlog_sender::run() rpl_binlog_sender.cc:225
    #3 0x1012cf886 in mysql_binlog_send(THD*, char*, unsigned long long, Gtid_set*, unsigned int) rpl_master.cc:415
    #4 0x1012cf472 in com_binlog_dump(THD*, char*, unsigned long) rpl_master.cc:342
    #5 0x100f6c18e in dispatch_command(THD*, COM_DATA const*, enum_server_command) sql_parse.cc:1501
    #6 0x100f6e725 in do_command(THD*) sql_parse.cc:852
    #7 0x1011a5dfb in handle_connection connection_handler_per_thread.cc:295
    #8 0x101aa5f70 in pfs_spawn_thread pfs.cc:2192
    #9 0x7fff996879b0 in _pthread_body (/usr/lib/system/libsystem_pthread.dylib+0x39b0)
    #10 0x7fff9968792d in _pthread_start (/usr/lib/system/libsystem_pthread.dylib+0x392d)
    #11 0x7fff99685384 in thread_start (/usr/lib/system/libsystem_pthread.dylib+0x1384)

0x625000e24125 is located 37 bytes inside of 8248-byte region [0x625000e24100,0x625000e26138)
freed by thread T30 here:
    #0 0x104ab2799 in wrap_free (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x43799)
    #1 0x101399699 in my_raw_free my_malloc.c:290
    #2 0x101399152 in my_realloc my_malloc.c:101
    #3 0x1011a00b1 in String::mem_realloc(unsigned long, bool) sql_string.cc:115
    #4 0x1011a212d in String::append(st_io_cache*, unsigned long) sql_string.cc:573
    #5 0x101222169 in Log_event::read_log_event(st_io_cache*, String*, st_mysql_mutex*, binary_log::enum_binlog_checksum_alg, char const*, bool*) log_event.cc:1183
    #6 0x1012d76fc in Binlog_sender::read_event(st_io_cache*, binary_log::enum_binlog_checksum_alg, unsigned char**, unsigned int*) rpl_binlog_sender.cc:1073
    #7 0x1012d5817 in Binlog_sender::send_events(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:390
    #8 0x1012d47e2 in Binlog_sender::send_binlog(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:317
    #9 0x1012d3dd3 in Binlog_sender::run() rpl_binlog_sender.cc:225
    #10 0x1012cf886 in mysql_binlog_send(THD*, char*, unsigned long long, Gtid_set*, unsigned int) rpl_master.cc:415
    #11 0x1012cf472 in com_binlog_dump(THD*, char*, unsigned long) rpl_master.cc:342
    #12 0x100f6c18e in dispatch_command(THD*, COM_DATA const*, enum_server_command) sql_parse.cc:1501
    #13 0x100f6e725 in do_command(THD*) sql_parse.cc:852
    #14 0x1011a5dfb in handle_connection connection_handler_per_thread.cc:295
    #15 0x101aa5f70 in pfs_spawn_thread pfs.cc:2192
    #16 0x7fff996879b0 in _pthread_body (/usr/lib/system/libsystem_pthread.dylib+0x39b0)
    #17 0x7fff9968792d in _pthread_start (/usr/lib/system/libsystem_pthread.dylib+0x392d)
    #18 0x7fff99685384 in thread_start (/usr/lib/system/libsystem_pthread.dylib+0x1384)

previously allocated by thread T30 here:
    #0 0x104ab25d0 in wrap_malloc (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x435d0)
    #1 0x101398d8b in my_raw_malloc my_malloc.c:191
    #2 0x101398b5b in my_malloc my_malloc.c:54
    #3 0x1013990af in my_realloc my_malloc.c:88
    #4 0x1011a00b1 in String::mem_realloc(unsigned long, bool) sql_string.cc:115
    #5 0x1012d9211 in Binlog_sender::grow_packet(unsigned long) rpl_binlog_sender.cc:1243
    #6 0x1012d8736 in Binlog_sender::reset_transmit_packet(unsigned short, unsigned long) rpl_binlog_sender.cc:913
    #7 0x1012d7623 in Binlog_sender::read_event(st_io_cache*, binary_log::enum_binlog_checksum_alg, unsigned char**, unsigned int*) rpl_binlog_sender.cc:1056
    #8 0x1012d5817 in Binlog_sender::send_events(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:390
    #9 0x1012d47e2 in Binlog_sender::send_binlog(st_io_cache*, unsigned long long) rpl_binlog_sender.cc:317
    #10 0x1012d3dd3 in Binlog_sender::run() rpl_binlog_sender.cc:225
    #11 0x1012cf886 in mysql_binlog_send(THD*, char*, unsigned long long, Gtid_set*, unsigned int) rpl_master.cc:415
    #12 0x1012cf472 in com_binlog_dump(THD*, char*, unsigned long) rpl_master.cc:342
    #13 0x100f6c18e in dispatch_command(THD*, COM_DATA const*, enum_server_command) sql_parse.cc:1501
    #14 0x100f6e725 in do_command(THD*) sql_parse.cc:852
    #15 0x1011a5dfb in handle_connection connection_handler_per_thread.cc:295
    #16 0x101aa5f70 in pfs_spawn_thread pfs.cc:2192
    #17 0x7fff996879b0 in _pthread_body (/usr/lib/system/libsystem_pthread.dylib+0x39b0)
    #18 0x7fff9968792d in _pthread_start (/usr/lib/system/libsystem_pthread.dylib+0x392d)
    #19 0x7fff99685384 in thread_start (/usr/lib/system/libsystem_pthread.dylib+0x1384)

Thread T30 created by T0 here:
    #0 0x104aa78d9 in wrap_pthread_create (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.0/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x388d9)
    #1 0x101aa62d8 in pfs_spawn_thread_v1 pfs.cc:2243
    #2 0x1011a6914 in Per_thread_connection_handler::add_connection(Channel_info*) connection_handler_per_thread.cc:397
    #3 0x10051a4f6 in Connection_handler_manager::process_new_connection(Channel_info*) connection_handler_manager.cc:246
    #4 0x1011e635c in Connection_acceptor<Mysqld_socket_listener>::connection_event_loop() connection_acceptor.h:68
    #5 0x1011d7c5c in mysqld_main(int, char**) mysqld.cc:4895
    #6 0x7fff8f2de5ac in start (/usr/lib/system/libdyld.dylib+0x35ac)
    #7 0x7  (<unknown module>)

SUMMARY: AddressSanitizer: heap-use-after-free rpl_binlog_sender.cc:394 Binlog_sender::send_events(st_io_cache*, unsigned long long)
Shadow bytes around the buggy address:
  0x1c4a001c47d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4a001c47e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4a001c47f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4a001c4800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1c4a001c4810: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x1c4a001c4820: fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd
  0x1c4a001c4830: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4a001c4840: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4a001c4850: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4a001c4860: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x1c4a001c4870: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1930==ABORTING
08:09:30 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=6
max_threads=151
thread_count=6
connection_count=6
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61129 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x628000158100
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 700000cdcdc0 thread_stack 0x40000
0   mysqld-debug                        0x00000001013a407e my_print_stacktrace + 222
1   mysqld-debug                        0x00000001011f1374 handle_fatal_signal + 1412
2   libsystem_platform.dylib            0x00007fff96e0152a _sigtramp + 26
3   ???                                 0x00007fff657ed482 0x0 + 140734896198786
4   libsystem_c.dylib                   0x00007fff999ea37b abort + 129
5   libclang_rt.asan_osx_dynamic.dylib  0x0000000104ac8676 _ZN11__sanitizer5AbortEv + 6
6   libclang_rt.asan_osx_dynamic.dylib  0x0000000104abadb7 __asan_report_load1 + 39
7   mysqld-debug                        0x00000001012d5f6a _ZN13Binlog_sender11send_eventsEP11st_io_cachey + 2634
8   mysqld-debug                        0x00000001012d47e3 _ZN13Binlog_sender11send_binlogEP11st_io_cachey + 691
9   mysqld-debug                        0x00000001012d3dd4 _ZN13Binlog_sender3runEv + 580
10  mysqld-debug                        0x00000001012cf887 _Z17mysql_binlog_sendP3THDPcyP8Gtid_setj + 263
11  mysqld-debug                        0x00000001012cf473 _Z15com_binlog_dumpP3THDPcm + 867
12  mysqld-debug                        0x0000000100f6c18f _Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command + 4895
13  mysqld-debug                        0x0000000100f6e726 _Z10do_commandP3THD + 1430
14  mysqld-debug                        0x00000001011a5dfc handle_connection + 524
15  mysqld-debug                        0x0000000101aa5f71 pfs_spawn_thread + 529
16  libsystem_pthread.dylib             0x00007fff996879b1 _pthread_body + 131
17  libsystem_pthread.dylib             0x00007fff9968792e _pthread_body + 0
18  libsystem_pthread.dylib             0x00007fff99685385 thread_start + 13

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 131
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
safe_process[1926]: Child process: 1930, killed by signal: 6
----------SERVER LOG END-------------
[28 Oct 2015 9:03] Umesh Shastry
Hello Laurynas,

Thank you for the report.

Thanks,
Umesh
[23 May 2016 14:57] David Moss
Posted by developer:
 
Thanks for your feedback. This has been fixed in upcoming versions and the following was added to the 5.7.13 changelog:
An intermittent ASan error was being reported on the rpl.rpl_checksum_cache test.  The error reported was related to the binary log sender doing a heap-use-after-free on a given memory address.