Bug #78999 Injected query log events do not set explicit_defaults_ts, read it on applying
Submitted: 28 Oct 2015 10:58 Modified: 18 Mar 2016 18:21
Reporter: Laurynas Biveinis (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.7.9 OS:Any
Assigned to: CPU Architecture:Any
Tags: GTID, MTS, replication, valgrind

[28 Oct 2015 10:58] Laurynas Biveinis
Description:
As a fix for bug 72794 (and bug 73066?), slave may inject BEGIN and ROLLBACK to rollback workers (coord_handle_partial_binlogged_transaction in rpl_slave.cc). These events are created using constructors which do not initialise binary_log::Query_event::explicit_defaults_ts. But these events will be applied through Query_log_event::do_apply_event, which will read this:

  if (explicit_defaults_ts != TERNARY_UNSET)
    thd->variables.explicit_defaults_for_timestamp=
      explicit_defaults_ts == TERNARY_OFF? 0 : 1;

How to repeat:
cmake ... -DBUILD_CONFIG=mysql_release -DWITH_DEBUG=ON -DWITH_VALGRIND=ON -DWITH_SSL=system -DWITH_BOOST=...
...
./mtr --parallel=3 --debug-server --valgrind-mysqld rpl_mts_gtids_restart_slave_io_lost_trx rpl_mts_slave_hang_with_partial_trx --mysqld=--enforce-gtid-consistency --mysqld=--log-slave-updates --mysqld=--gtid-mode=on
...
rpl.rpl_mts_gtids_restart_slave_io_lost_trx 'stmt' w1 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2015-10-28 12:54:05
line
==19858== Thread 28:
==19858== Conditional jump or move depends on uninitialised value(s)
==19858==    at 0x17EB013: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned long) (log_event.cc:4397)
==19858==    by 0x17EABEE: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:4312)
==19858==    by 0x180F05D: Log_event::do_apply_event_worker(Slave_worker*) (log_event.cc:756)
==19858==    by 0x18887FA: Slave_worker::slave_worker_exec_event(Log_event*) (rpl_rli_pdb.cc:1892)
==19858==    by 0x188A98E: slave_worker_exec_job_group(Slave_worker*, Relay_log_info*) (rpl_rli_pdb.cc:2592)
==19858==    by 0x1860B96: handle_slave_worker (rpl_slave.cc:5932)
==19858==    by 0x1D738C4: pfs_spawn_thread (pfs.cc:2192)
==19858==    by 0x5D376A9: start_thread (pthread_create.c:333)
==19858==    by 0x68F5EEC: clone (clone.S:109)
...
rpl.rpl_mts_slave_hang_with_partial_trx 'row' w3 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2015-10-28 12:54:06
line
==19859== Thread 29:
==19859== Conditional jump or move depends on uninitialised value(s)
==19859==    at 0x17EB013: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned long) (log_event.cc:4397)
==19859==    by 0x17EABEE: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:4312)
==19859==    by 0x180F05D: Log_event::do_apply_event_worker(Slave_worker*) (log_event.cc:756)
==19859==    by 0x18887FA: Slave_worker::slave_worker_exec_event(Log_event*) (rpl_rli_pdb.cc:1892)
==19859==    by 0x188A98E: slave_worker_exec_job_group(Slave_worker*, Relay_log_info*) (rpl_rli_pdb.cc:2592)
==19859==    by 0x1860B96: handle_slave_worker (rpl_slave.cc:5932)
==19859==    by 0x1D738C4: pfs_spawn_thread (pfs.cc:2192)
==19859==    by 0x5D376A9: start_thread (pthread_create.c:333)
==19859==    by 0x68F5EEC: clone (clone.S:109)

Suggested fix:
The simplest fix seems to be adding explicit_default_ts(TERNARY_UNSET) to binary_log::Query_event::Query_event(const char *query_arg, ...)
[28 Oct 2015 11:24] MySQL Verification Team
Hello Laurynas,

Thank you for the report.

Thanks,
Umesh
[28 Oct 2015 11:26] MySQL Verification Team
// 5.7.9

[umshastr@hod03]/export/umesh/server/source/bugs/mysql-5.7.9/mysql-test: ./mtr --parallel=3 --debug-server --valgrind-mysqld rpl_mts_gtids_restart_slave_io_lost_trx rpl_mts_slave_hang_with_partial_trx --mysqld=--enforce-gtid-consistency --mysqld=--log-slave-updates --mysqld=--gtid-mode=on
.
.
rpl.rpl_mts_slave_hang_with_partial_trx 'row' w1 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2015-10-28 12:23:15
line
==13392== Thread 29:
==13392== Conditional jump or move depends on uninitialised value(s)
==13392==    at 0x16F7620: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned long) (log_event.cc:4397)
==13392==    by 0x16F7238: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:4312)
==13392==    by 0x17197C7: Log_event::do_apply_event_worker(Slave_worker*) (log_event.cc:756)
==13392==    by 0x178DA3F: Slave_worker::slave_worker_exec_event(Log_event*) (rpl_rli_pdb.cc:1892)
==13392==    by 0x178FA10: slave_worker_exec_job_group(Slave_worker*, Relay_log_info*) (rpl_rli_pdb.cc:2592)
==13392==    by 0x1767501: handle_slave_worker (rpl_slave.cc:5932)
==13392==    by 0x1C4BF0B: pfs_spawn_thread (pfs.cc:2192)
==13392==    by 0x4E3CDF4: start_thread (in /usr/lib64/libpthread-2.17.so)
==13392==    by 0x6A0B60C: clone (in /usr/lib64/libc-2.17.so)
^ Found warnings in /export/umesh/server/source/bugs/mysql-5.7.9/mysql-test/var/1/log/mysqld.2.err
ok

 - the logfile can be found in '/export/umesh/server/source/bugs/mysql-5.7.9/mysql-test/var/log/rpl.rpl_mts_slave_hang_with_partial_trx-row/rpl_mts_slave_hang_with_partial_trx.log'
worker[1] Valgrind report from /export/umesh/server/source/bugs/mysql-5.7.9/mysql-test/var/1/log/mysqld.2.err after tests:
 rpl.rpl_mts_slave_hang_with_partial_trx
--------------------------------------------------------------------------
HEAP SUMMARY:
    in use at exit: 64 bytes in 2 blocks
  total heap usage: 118,629 allocs, 118,627 frees, 402,354,399 bytes allocated

Thread 1:
32 bytes in 1 blocks are still reachable in loss record 1 of 2
   at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x596C2D2: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x59E4B2E: sk_new (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x56DE879: ??? (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E0828: SSL_COMP_get_compression_methods (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E61C8: SSL_library_init (in /usr/lib64/libssl.so.1.0.1e)
   by 0x1CF8801: ssl_start (viosslfactories.c:367)
   by 0xE1187E: init_ssl() (mysqld.cc:3323)
   by 0xE14630: mysqld_main(int, char**) (mysqld.cc:4725)
   by 0xE0BE9F: main (main.cc:25)

32 bytes in 1 blocks are still reachable in loss record 2 of 2
   at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x596C2D2: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x59E4B4C: sk_new (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x56DE879: ??? (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E0828: SSL_COMP_get_compression_methods (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E61C8: SSL_library_init (in /usr/lib64/libssl.so.1.0.1e)
   by 0x1CF8801: ssl_start (viosslfactories.c:367)
   by 0xE1187E: init_ssl() (mysqld.cc:3323)
   by 0xE14630: mysqld_main(int, char**) (mysqld.cc:4725)
   by 0xE0BE9F: main (main.cc:25)

LEAK SUMMARY:
   definitely lost: 0 bytes in 0 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 64 bytes in 2 blocks
        suppressed: 0 bytes in 0 blocks

For counts of detected and suppressed errors, rerun with: -v
Use --track-origins=yes to see where uninitialised values come from
ERROR SUMMARY: 86 errors from 41 contexts (suppressed: 2 from 2)
safe_process[13389]: Child process: 13392, exit: 42

rpl.rpl_mts_gtids_restart_slave_io_lost_trx 'stmt' w3 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2015-10-28 12:23:21
line
==13391== Thread 28:
==13391== Conditional jump or move depends on uninitialised value(s)
==13391==    at 0x16F7620: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned long) (log_event.cc:4397)
==13391==    by 0x16F7238: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:4312)
==13391==    by 0x17197C7: Log_event::do_apply_event_worker(Slave_worker*) (log_event.cc:756)
==13391==    by 0x178DA3F: Slave_worker::slave_worker_exec_event(Log_event*) (rpl_rli_pdb.cc:1892)
==13391==    by 0x178FA10: slave_worker_exec_job_group(Slave_worker*, Relay_log_info*) (rpl_rli_pdb.cc:2592)
==13391==    by 0x1767501: handle_slave_worker (rpl_slave.cc:5932)
==13391==    by 0x1C4BF0B: pfs_spawn_thread (pfs.cc:2192)
==13391==    by 0x4E3CDF4: start_thread (in /usr/lib64/libpthread-2.17.so)
==13391==    by 0x6A0B60C: clone (in /usr/lib64/libc-2.17.so)
^ Found warnings in /export/umesh/server/source/bugs/mysql-5.7.9/mysql-test/var/3/log/mysqld.2.err
ok

 - the logfile can be found in '/export/umesh/server/source/bugs/mysql-5.7.9/mysql-test/var/log/rpl.rpl_mts_gtids_restart_slave_io_lost_trx-stmt/rpl_mts_gtids_restart_slave_io_lost_trx.log'
worker[3] Valgrind report from /export/umesh/server/source/bugs/mysql-5.7.9/mysql-test/var/3/log/mysqld.1.err after tests:
 rpl.rpl_mts_gtids_restart_slave_io_lost_trx
--------------------------------------------------------------------------
HEAP SUMMARY:
    in use at exit: 64 bytes in 2 blocks
  total heap usage: 112,548 allocs, 112,546 frees, 405,050,273 bytes allocated

Thread 1:
32 bytes in 1 blocks are still reachable in loss record 1 of 2
   at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x596C2D2: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x59E4B2E: sk_new (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x56DE879: ??? (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E0828: SSL_COMP_get_compression_methods (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E61C8: SSL_library_init (in /usr/lib64/libssl.so.1.0.1e)
   by 0x1CF8801: ssl_start (viosslfactories.c:367)
   by 0xE1187E: init_ssl() (mysqld.cc:3323)
   by 0xE14630: mysqld_main(int, char**) (mysqld.cc:4725)
   by 0xE0BE9F: main (main.cc:25)

32 bytes in 1 blocks are still reachable in loss record 2 of 2
   at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x596C2D2: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x59E4B4C: sk_new (in /usr/lib64/libcrypto.so.1.0.1e)
   by 0x56DE879: ??? (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E0828: SSL_COMP_get_compression_methods (in /usr/lib64/libssl.so.1.0.1e)
   by 0x56E61C8: SSL_library_init (in /usr/lib64/libssl.so.1.0.1e)
   by 0x1CF8801: ssl_start (viosslfactories.c:367)
   by 0xE1187E: init_ssl() (mysqld.cc:3323)
   by 0xE14630: mysqld_main(int, char**) (mysqld.cc:4725)
   by 0xE0BE9F: main (main.cc:25)

LEAK SUMMARY:
   definitely lost: 0 bytes in 0 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 64 bytes in 2 blocks
        suppressed: 0 bytes in 0 blocks

For counts of detected and suppressed errors, rerun with: -v
Use --track-origins=yes to see where uninitialised values come from
ERROR SUMMARY: 83 errors from 38 contexts (suppressed: 2 from 2)
safe_process[13385]: Child process: 13388, exit: 42

valgrind_report                          w0 [ fail ]
        Test ended at 2015-10-28 12:23:21

Valgrind reported failures at shutdown, see above

--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 73 seconds executing testcases

Completed: Failed 3/3 tests, 0.00% were successful.
[18 Mar 2016 18:21] Daniel So
Added the following entry to the MySQL 5.7.13 and 5.8.0 changelogs:

"The variable explicit_defaults_ts was not initialized during the construction of a Query_event object. That caused Valgrind warnings for dependency on an uninitialized variable. This fix makes sure the variable is initialized. 

References: This bug is a regression of Bug #18885916, Bug #72794."