Bug #108952 | data inconsistent since replica crash recovery after undo tablespace truncation | ||
---|---|---|---|
Submitted: | 1 Nov 2022 12:13 | Modified: | 4 Nov 2022 15:20 |
Reporter: | John White | Email Updates: | |
Status: | Unsupported | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 8.0.28 | OS: | CentOS (7.2) |
Assigned to: | CPU Architecture: | x86 | |
Tags: | crash recovery, undo truncate |
[1 Nov 2022 12:13]
John White
[1 Nov 2022 14:30]
John White
Connection 1: > SET GLOBAL super_read_only = ON; > ROLLBACK; fix reproduce step, set super_read_only then rollback, wait for undo truncation.
[2 Nov 2022 12:53]
MySQL Verification Team
Hi Mr. White, Thank you for your bug report. However, we do not consider reports where one of the step is killing MySQL server unconditionally. We can accept that too, providing that all of your ACID settings are set to maximum, where syncing the drives is set at the most severe level and that all these files are located to non-cached partitions. Also, your workarounds are quite acceptable for installations without these conditions. Unsupported.
[3 Nov 2022 10:51]
John White
Maybe above reproducing steps mislead the primal problem. Our settings of MySQL as following, sync_binlog & innodb_flush_log_at_trx_commit both set to 1; MySQL was killed by OOM killer at the first time I encountered the case, the key to the question is undo_{space_num}_trunc.log left after undo truncated when set state and id failed in DD. Maybe this error can be catched and processed, instead of regarding trunc.log as unfinished undo truncation when MySQL shutdown & boot abnormally. [mysqld] port = 3306 socket = /data/mysql/data/mysql.sock back_log = 300 max_connections = 3500 max_connect_errors = 9999 transaction_isolation = READ-COMMITTED max_allowed_packet = 1024M open_files_limit = 10000 character_set_server = latin1 performance_schema = ON performance_schema_instrument = 'memory/%=COUNTED' performance_schema_digests_size = 40000 performance_schema_max_table_instances = 40000 performance_schema_max_sql_text_length = 4096 performance_schema_max_digest_length = 4096 performance_schema_max_table_handles = 10000 log_bin = binlog binlog_format = row log_slave_updates = on sync_binlog = 1 slow_query_log long_query_time = 2 slow_query_log_file = mysql-slow.log log_slow_slave_statements binlog_cache_size = 2M sort_buffer_size = 2M join_buffer_size = 2M key_buffer_size = 32M read_buffer_size = 2M read_rnd_buffer_size = 16M bulk_insert_buffer_size = 64M thread_stack = 192K innodb_buffer_pool_size = 80G innodb_flush_log_at_trx_commit = 1 innodb_log_buffer_size = 16M innodb_log_file_size = 16M innodb_log_files_in_group = 3 innodb_max_dirty_pages_pct = 60 innodb_stats_on_metadata = 0 innodb_flush_method = O_DIRECT innodb_flush_neighbors = 0 innodb_lock_wait_timeout = 8 thread_cache_size = 256 innodb_write_io_threads = 16 innodb_read_io_threads = 8 innodb_io_capacity = 200 innodb_thread_concurrency = 8 innodb_purge_threads = 1 innodb_buffer_pool_instances = 8 innodb_autoinc_lock_mode = 2 innodb_page_cleaners = 4 innodb_old_blocks_time = 1000 innodb_buffer_pool_dump_at_shutdown = 0 innodb_buffer_pool_load_at_startup = 0 innodb_print_all_deadlocks = 1 innodb_log_checksums = on innodb_checksum_algorithm = crc32 innodb_file_per_table = 1 innodb_adaptive_hash_index = 0 datadir = /data/mysql/data innodb_log_group_home_dir = /data/mysql/data innodb_data_home_dir = /data/mysql/data innodb_data_file_path = ibdata1:16M:autoextend general_log_file = mysql.log log_error = mysql.err pid_file = mysql.pid table_definition_cache = 10000 table_open_cache = 10000 innodb_open_files = 10000 table_open_cache_instances = 64 gtid_mode = ON enforce_gtid_consistency = 1 binlog_gtid_simple_recovery = 1 binlog_rows_query_log_events = 1 server_id = 1 relay_log_space_limit = 10G skip_slave_start relay_log_recovery = ON master_info_repository = TABLE relay_log_info_repository = TABLE slave_parallel_type = logical_clock slave_parallel_workers = 8 slave_preserve_commit_order = 0 slave_net_timeout = 30 interactive_timeout = 1800 wait_timeout = 1800 log_error_verbosity = 3 log_timestamps = SYSTEM loose_rpl_semi_sync_master_wait_point = AFTER_SYNC
[3 Nov 2022 14:04]
MySQL Verification Team
Hi Mr. White, Thank you for your feedback. It is clear for your feedback that you have not set all variables to 100 % ACID compliance, nor have you managed to prevent your disk and its partitions from being cached by Linux.
[4 Nov 2022 3:51]
John White
Thanks for your advice, may I get a guidance how to set all variables to 100% ACID compliance and prevent disk and its partitions from being cached by Linux? If Linux didn't reboot, cached by Linux and disk still there. According to https://dev.mysql.com/doc/refman/8.0/en/replication-solutions-unexpected-replica-halt.html IMHO, sync_binlog & innodb_flush_log_at_trx_commit set to 1 is enough to protect data from crash, innodb_flush_method set to O_DIRECT could bypass Linux cache, sync_relay_log was the default 10000 but relay_log_recovery = ON, I think it's safe If any other settings could prevent MySQL, may I get a list and check them?
[4 Nov 2022 12:45]
MySQL Verification Team
Hi Mr. White, Unconditional killing of MySQL has the same consequences as the crash of the OS. Hence, first thing to make sure is that cacheing of disks, partitions and filesystem is off. Regarding all other ACID settings, read our Reference Manual, very carefully. Type of partition and filesystem syncing should be set carefully and precisely and should be checked whether is working on your hardware and OS.
[4 Nov 2022 15:20]
John White
Thanks for your advice, I had consulted our OS experts for the settings, and verified the environment of disk & OS works, we deployed thousands of MySQL on the same standard environment for at least 5 years, it have been always safe even encounter electric power failure. There still one question confuses me, why didn't empty undo_1_trunc.log remove after undo truncation? Would this log lead undo_001 cleared again when MySQL restart? As I traced MySQL startup, I saw following stacktrace, if magic_no != undo::s_magic, undo_001 will be recreated, since at this time, undo_1_trunc.log is 16KiB empty file, no magic number written due to latest undo truncation failure as above description, all the time OS & disk works fine, I'm sure the reason undo_1_trunc.log still be there is not a OS or disk failure, because I traced that setting super_read_only bring on the problem, thanks for reading my trivial description: undo::is_active_truncate_log_present(space_num) srv_undo_tablespace_fixup(const char * space_name, const char * file_name, space_id_t space_id) (/data/compile/mysql-8.0.28/storage/innobase/srv/srv0start.cc:807) Validate_files::check(Validate_files * const this, const Validate_files::Const_iter & begin, const Validate_files::Const_iter & end, size_t thread_id) (/data/compile/mysql-8.0.28/storage/innobase/handler/ha_innodb.cc:3406) Validate_files::validate(Validate_files * const this, const Validate_files::DD_tablespaces & tablespaces) (/data/compile/mysql-8.0.28/storage/innobase/handler/ha_innodb.cc:3644) boot_tablespaces(THD * thd) (/data/compile/mysql-8.0.28/storage/innobase/handler/ha_innodb.cc:3718) innobase_dict_recover(dict_recovery_mode_t dict_recovery_mode, uint version) (/data/compile/mysql-8.0.28/storage/innobase/handler/ha_innodb.cc:3933) (anonymous namespace)::DDSE_dict_recover(THD * thd, dict_recovery_mode_t dict_recovery_mode, uint version) (/data/compile/mysql-8.0.28/sql/dd/impl/bootstrap/bootstrapper.cc:87) dd::bootstrap::restart(THD * thd) (/data/compile/mysql-8.0.28/sql/dd/impl/bootstrap/bootstrapper.cc:933) dd::upgrade_57::restart_dictionary(THD * thd) (/data/compile/mysql-8.0.28/sql/dd/upgrade_57/upgrade.cc:797) dd::upgrade_57::do_pre_checks_and_initialize_dd(THD * thd) (/data/compile/mysql-8.0.28/sql/dd/upgrade_57/upgrade.cc:957) bootstrap::handle_bootstrap(void * arg) (/data/compile/mysql-8.0.28/sql/bootstrap.cc:327) pfs_spawn_thread(void * arg) (/data/compile/mysql-8.0.28/storage/perfschema/pfs.cc:2947)
[4 Nov 2022 16:34]
huahua xu
Hi, John White You are right that undo tablespace contains data need to be rollbacked would be truncated while booting tablespaces during InnoDB recovery, and result in inconsistent or missing data. I maybe reproduce your problem and hope it works for you: step1: produce a xa transaction, in session1 mysql> select * from tmp.test; +----+-----------+ | id | newColumn | +----+-----------+ | 1 | 100 | | 2 | 100 | +----+-----------+ 2 rows in set (0.00 sec) mysql> xa start 'test'; Query OK, 0 rows affected (0.00 sec) mysql> update test set newColumn = 200; Query OK, 2 rows affected (0.00 sec) Rows matched: 2 Changed: 2 Warnings: 0 mysql> xa end 'test'; Query OK, 0 rows affected (0.00 sec) mysql> xa prepare 'test'; Query OK, 0 rows affected (0.00 sec) step2: flush the dirty pages of table `test`, in session2 mysql> use tmp; Database changed mysql> flush table test; Query OK, 0 rows affected (0.00 sec) step3: kill the mysqld instance step4: I create tow empty files directly under the data directory, and they are undo_1_trunc.log and undo_2_trunc.log(It is very difficult to know which tablespace file the xa transaction's undo log is in) step5: reboot the mysqld instance step6: verify the status of the xa transaction mysql> xa recover; Empty set (0.00 sec) mysql> select * from tmp.test; +----+-----------+ | id | newColumn | +----+-----------+ | 1 | 200 | | 2 | 200 | +----+-----------+ 2 rows in set (0.03 sec) I don't think it is a correct result about the status of the xa transaction, and the data of undo log is missed.