MySQL Bugs: #80219: rpl_stm_mixed_crash

Bug #80219	rpl_stm_mixed_crash_safe mtr test fails
Submitted:	1 Feb 2016 8:26	Modified:	18 Apr 2018 8:38
Reporter:	Vlad Lesin	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: Tests	Severity:	S3 (Non-critical)
Version:	5.6.28	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
rpl_stm_mixed_crash_safe mtr test fails with the following output:

CURRENT_TEST: rpl.rpl_stm_mixed_crash_safe
mysqltest: In included file "./extra/rpl_tests/rpl_crash_safe.test":
included from ./extra/rpl_tests/rpl_crash_safe.test at line 171:
At line 171: Server failed to restart

The result from queries just before the failure was:
< snip >
-e-e-e-e-e-e-e-e-e-e-e- >> T-trig << -e-e-e-e-e-e-e-e-e-e-e-
-b-b-b-b-b-b-b-b-b-b-b- >> C << -b-b-b-b-b-b-b-b-b-b-b-
COMMIT;
include/show_binlog_events.inc
Log_name        Pos     Event_type      Server_id       End_log_pos     Info
master-bin.000001       #       Query   #       #       BEGIN
master-bin.000001       #       Query   #       #       use `test`; INSERT INTO tt_1(trans_id, stmt_id) VALUES (41, 2)
master-bin.000001       #       Query   #       #       use `test`; INSERT INTO tt_5(trans_id, stmt_id) VALUES (41, 4)
master-bin.000001       #       Xid     #       #       COMMIT /* XID */
-e-e-e-e-e-e-e-e-e-e-e- >> C << -e-e-e-e-e-e-e-e-e-e-e-
-b-b-b-b-b-b-b-b-b-b-b- >> B T T-trig C << -b-b-b-b-b-b-b-b-b-b-b-
include/show_binlog_events.inc
Log_name        Pos     Event_type      Server_id       End_log_pos     Info
master-bin.000001       #       Query   #       #       BEGIN
master-bin.000001       #       Query   #       #       use `test`; INSERT INTO tt_1(trans_id, stmt_id) VALUES (41, 2)
master-bin.000001       #       Query   #       #       use `test`; INSERT INTO tt_5(trans_id, stmt_id) VALUES (41, 4)
master-bin.000001       #       Xid     #       #       COMMIT /* XID */
-e-e-e-e-e-e-e-e-e-e-e- >> B T T-trig C << -e-e-e-e-e-e-e-e-e-e-e-

include/rpl_reconnect.inc

More results from queries before failure can be found in /mysql-server/mysql-test/var-rpl_gtid-debug/1/log/rpl_stm_mixed_crash_safe.log

 - saving '/mysql-server/mysql-test/var-rpl_gtid-debug/1/log/rpl.rpl_stm_mixed_crash_safe-stmt/' to '/mysql-server/mysql-test/var-rpl_gtid-debug/log/rpl.rpl_stm_mixed_crash_safe-stmt/'

How to repeat:
Run this test with --parallel=n and --repeat=50 or more mtr options. The test will fail after random amount of iterations.

I could not repeat with current source running --parallel=4 --repeat=50 --big-test for 4 times. Any clue or additional instructions?. Thanks.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

I have seen an instance of this with Percona Server, here's the slave error log indicating transaction system page corruption:

rpl.rpl_stm_mixed_crash_safe 'stmt'      w2 [ fail ]
        Test ended at 2016-08-30 09:10:13

CURRENT_TEST: rpl.rpl_stm_mixed_crash_safe
safe_process[23364]: Child process: 23365, killed by signal: 9

Server [mysqld.2 - pid: 25227, winpid: 25227, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
...
2016-08-30 16:10:12 25116 [Note] Slave SQL thread initialized, starting replication in log 'master-bin.000001' at position 17801, relay log './slave-relay-bin.000033' position: 642
2016-08-30 16:10:12 25116 [Note] Crashing crash_before_update_pos.
SIGKILL myself
safe_process[25115]: Child process: 25116, killed by signal: 9
2016-08-30 16:10:13 0 [Warning] The syntax '--master-retry-count' is deprecated and will be removed in a future release. Please use 'CHANGE MASTER TO master_retry_count = <num>' instead.
2016-08-30 16:10:13 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2016-08-30 16:10:13 0 [Note] /mnt/workspace/percona-server-5.6-trunk/BUILD_TYPE/debug/Host/ubuntu-xenial-32bit/build/sql/mysqld-debug (mysqld 5.6.32-78.0-debug-log) starting as process 25228 ...
2016-08-30 16:10:13 25228 [Warning] The option innodb (skip-innodb) is deprecated and will be removed in a future release
2016-08-30 16:10:13 25228 [Note] Plugin 'FEDERATED' is disabled.
2016-08-30 16:10:13 b709a700 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
2016-08-30 16:10:13 25228 [Note] InnoDB: Using atomics to ref count buffer pool pages
2016-08-30 16:10:13 b709a700 InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
2016-08-30 16:10:13 b709a700 InnoDB: !!!!!!!! UNIV_SYNC_DEBUG switched on !!!!!!!!!
2016-08-30 16:10:13 25228 [Note] InnoDB: The InnoDB memory heap is disabled
2016-08-30 16:10:13 25228 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-08-30 16:10:13 25228 [Note] InnoDB: Memory barrier is not used
2016-08-30 16:10:13 25228 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-08-30 16:10:13 25228 [Note] InnoDB: Using Linux native AIO
2016-08-30 16:10:13 25228 [Note] InnoDB: Not using CPU crc32 instructions
2016-08-30 16:10:13 25228 [Note] InnoDB: Initializing buffer pool, size = 256.0M
2016-08-30 16:10:13 25228 [Note] InnoDB: Completed initialization of buffer pool
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 5.
InnoDB: You may have to recover from a backup.
2016-08-30 16:10:13 b709a700 InnoDB: Page dump in ascii and hex (16384 bytes):
 len 16384; hex 
...
InnoDB: End of page dump
2016-08-30 16:10:13 b709a700 InnoDB: uncompressed page, stored checksum in field1 16426813, calculated checksums for field1: crc32 2085730914, innodb 1819894837, none 3735928559, stored checksum in field2 3430790156, calculated checksums for field2: crc32 2085730914, innodb 233944407, none 3735928559, page LSN 0 1765232, low 4 bytes of LSN at page end 1760468, page number (if stored to page already) 5, space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page may be a transaction system page
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 5.
InnoDB: You may have to recover from a backup.
InnoDB: It is also possible that your operating
InnoDB: system has corrupted its own file cache
InnoDB: and rebooting your computer removes the
InnoDB: error.
InnoDB: If the corrupt page is an index page
InnoDB: you can also try to fix the corruption
InnoDB: by dumping, dropping, and reimporting
InnoDB: the corrupt table. You can use CHECK
InnoDB: TABLE to scan your table for corruption.
InnoDB: See also http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
InnoDB: Ending processing because of a corrupt database page.
2016-08-30 16:10:13 b709a700  InnoDB: Assertion failure in thread 3070863104 in file buf0buf.cc line 4480
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
13:10:13 UTC - mysqld got signal 6 ;

0:5 page dump from the error log

Attachment: 0.5.txt (text/plain), 48.12 KiB.

I couldn't repeat with most recent source server. Thanks.