Bug #116934 Recovery should abort immediately if it meets a corrupt log
Submitted: 11 Dec 2024 2:04
Modified: 11 Dec 2024 5:30
Reporter: mengchu shi (OCA)
Status: Verified
Category: MySQL Server: InnoDB storage engine
Severity: S3 (Non-critical)
Version: 8.0, 8.4
OS: Any
Assigned to:
CPU Architecture: Any
Tags: Contribution, recovery crash, redo apply

[11 Dec 2024 2:04] mengchu shi
Description:
Bug #116933 (Function recv_single_rec doesn't handle corrupt_log correctly) is not the only bug about corrupt_log.

In function recv_scan_log_recs, if recv_parse_log_recs finds a corrupt log and the current scan round does not hit the memory limit (max_memory), the redo scan does not stop; it simply continues with the next scan round. However, because recv_sys->found_corrupt_log is now true, none of the subsequent scan rounds can call recv_parse_log_recs or recv_apply_hashed_log_recs, and this continues until the end of the redo file. Only then does crash recovery abort, with an error log like:
```
2024-12-10T02:53:48.638461Z 1 [ERROR] [MY-012930] [InnoDB] Plugin initialization aborted at srv0start.cc[2041] with error Generic error.
```

I think all scan rounds after the corrupt log is found are useless.
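
To make the wasted work concrete, here is a minimal toy model of the behaviour described above. It is only a sketch, not InnoDB code: ToyRecvSys, ScanStats and scan_all_rounds are hypothetical names, and each "round" stands in for one call to recv_scan_log_recs on one chunk of redo.
```c++
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for recv_sys; only the flag from this report is modeled.
struct ToyRecvSys {
  bool found_corrupt_log = false;
};

struct ScanStats {
  std::size_t rounds_scanned = 0;  // rounds that still read a chunk of redo
  std::size_t rounds_parsed = 0;   // rounds that actually parsed records
};

// Models the current behaviour: the loop never looks at found_corrupt_log,
// so every remaining round is scanned even though parsing is skipped.
// corrupt_round is the 0-based round in which the parser hits corruption.
ScanStats scan_all_rounds(std::size_t total_rounds, std::size_t corrupt_round) {
  assert(total_rounds > 0);
  ToyRecvSys recv_sys;
  ScanStats stats;
  for (std::size_t round = 0; round < total_rounds; ++round) {
    ++stats.rounds_scanned;
    if (!recv_sys.found_corrupt_log) {
      ++stats.rounds_parsed;
      if (round == corrupt_round) {
        recv_sys.found_corrupt_log = true;  // parser found a corrupt record
      }
    }
  }
  return stats;
}
```
In this model, with 10 rounds and corruption in round 2, all 10 rounds are scanned but only the first 3 do any parsing; the other 7 only delay the abort.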

How to repeat:
Just read the code.
```c++
static bool recv_scan_log_recs(log_t &log,
                               size_t max_memory, const byte *buf, size_t len,
                               lsn_t start_lsn, lsn_t *read_upto_lsn) {
  ...
  if (more_data && !recv_sys->found_corrupt_log) {
    /* Try to parse more log records */

    recv_parse_log_recs();

#ifndef UNIV_HOTBACKUP
    if (recv_heap_used() > max_memory) {
      recv_apply_hashed_log_recs(log, false);
    }
#endif /* !UNIV_HOTBACKUP */
  }
  ...
}
```
```c++
dberr_t srv_start(bool create_new_db) {
  ...
  if (create_new_db) {
    ...
  } else {
    ...
    err = recv_recovery_from_checkpoint_start(*log_sys, flushed_lsn);
    ...
    if (srv_force_recovery < SRV_FORCE_NO_LOG_REDO) {
      ...
      recv_apply_hashed_log_recs(*log_sys,
                                 !recv_sys->is_cloned_db && !log_upgrade);

      if (recv_sys->found_corrupt_log) {
        err = DB_ERROR;
        return (srv_init_abort(err));
      }
      ...
    }
    ...
  }
  ...
}
```

Suggested fix:
Add !recv_sys->found_corrupt_log to the condition of the redo scan loop, so that scanning stops as soon as a corrupt log is found.
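
The idea can be sketched with a small toy loop. This is not the attached patch and not the real InnoDB loop: ToyRecoveryState, scan_until_corrupt and the finished flag are hypothetical names used only for illustration, and each iteration stands in for one scan round.
```c++
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for recv_sys; only the flag from this report is modeled.
struct ToyRecoveryState {
  bool found_corrupt_log = false;
};

// Returns how many rounds are scanned when the corrupt-log flag is part of
// the loop condition. corrupt_round is the 0-based round that hits corruption.
std::size_t scan_until_corrupt(std::size_t total_rounds,
                               std::size_t corrupt_round) {
  assert(total_rounds > 0);
  ToyRecoveryState recv_sys;
  std::size_t rounds_scanned = 0;
  bool finished = false;  // hypothetical "end of redo reached" flag

  // The extra !found_corrupt_log term ends the scan right after corruption.
  while (!finished && !recv_sys.found_corrupt_log) {
    const std::size_t round = rounds_scanned++;
    if (round == corrupt_round) {
      recv_sys.found_corrupt_log = true;  // parser found a corrupt record
    }
    finished = (rounds_scanned == total_rounds);
  }
  return rounds_scanned;
}
```
In this model, with 10 rounds and corruption in round 2, the scan stops after 3 rounds instead of running to the end of the redo file; a clean log is still scanned in full.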
[11 Dec 2024 2:05] mengchu shi
recovery_should_abort_if_meet_corrupt_log.patch

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: recovery_should_abort_if_meet_corrupt_log.patch (application/octet-stream, text), 511 bytes.

[11 Dec 2024 5:30] MySQL Verification Team
Hello mengchu shi,

Thank you for the report and contribution.

regards,
Umesh