Bug #100491 Redo log processing fails when it shouldn't
Submitted: 11 Aug 2020 9:09 Modified: 14 Aug 2020 8:07
Reporter: Robert Golebiowski Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Security: Encryption Severity:S3 (Non-critical)
Version:8.0.20, 8.0.21 OS:Any
Assigned to: CPU Architecture:Any

[11 Aug 2020 9:09] Robert Golebiowski
Description:
When encryption information from redo log is processed in function fil_tablespace_redo_encryption - there is a check if the correct key is available in the keyring. The problem is that fil_tablespace_redo_encryption is called during redo log processing, even when variable recv_needed_recovery is false. Which means that redo log does not need to be applied, as changes that this redo log contains were already flushed to disk. Although, the changes were already applied, missing key will cause redo log processing to fail the server startup with error:
2020-08-11T08:39:15.731482Z 1 [Note] [MY-012552] [InnoDB] Starting crash recovery.
2020-08-11T08:39:15.790730Z 1 [ERROR] [MY-012657] [InnoDB] Encryption can't find master key, please check the keyring plugin is loaded.
2020-08-11T08:39:15.790888Z 1 [Warning] [MY-012189] [InnoDB] Encryption information in the redo log of space 4 is invalid
2020-08-11T08:39:15.791036Z 1 [ERROR] [MY-012519] [InnoDB] ############### CORRUPT LOG RECORD FOUND ###############

How to repeat:
Setup the test so the encryption information was in redo and was processed by the server startup - even though the changes were already flushed to disk. Restart the server without the keyring - observe the redo log processing error and failure to start up the server.

Suggested fix:
Do not error out when recv_needed_recovery is false.
[11 Aug 2020 9:39] Robert Golebiowski
--replace_result $MYSQLTEST_VARDIR MYSQLTEST_VARDIR $KEYRING_PLUGIN KEYRING_PLUGIN
--let $restart_parameters=restart:--keyring-file-data=$MYSQLTEST_VARDIR/my_keyring --early-plugin-load="keyring_file=$KEYRING_PLUGIN" --default-table-encryption=OFF
--source include/restart_mysqld.inc

--source include/no_checkpoint_start.inc

create table t1(a int not null primary key, c char(200), b blob, index(b(10))) ENCRYPTION='Y';

begin;
--disable_query_log
--let $i = 20
begin;
while ($i)
{
  eval insert into t1(a,c,b) values ($i, repeat('secret1',20), repeat('secret2',6000));
  dec $i;
}
--enable_query_log

commit;

# If checkpoint happens we will skip this test.
# If no checkpoint  happens, InnoDB refuses to
# start as used  encryption key is not found.

SET GLOBAL innodb_flush_log_at_trx_commit=1;
begin;
update t1 set c = repeat('secret3', 20);
insert into t1 (a,c,b) values (21, repeat('secret5',20), repeat('secret6',6000));
COMMIT;

let $cleanup= drop table t1;
--let CLEANUP_IF_CHECKPOINT= $cleanup;
--source include/no_checkpoint_end.inc

--echo # restart

--let $error_log=$MYSQLTEST_VARDIR/tmp/my_restart.err

--error 1
--exec $MYSQLD_CMD --log-error=$error_log

check my_restart.err for error
[11 Aug 2020 11:04] MySQL Verification Team
Hello Robert,

Thank you for the report and test case.

Thanks,
Umesh
[14 Aug 2020 6:23] Erlend Dahl
[12 Aug 2020 3:38] Mayank Prasad

This doesn't look like a bug to me. Reporter has created an ENCRYPTED table
and trying to start the server without KEYRING plugin. This is not correct/recommended course of action.
 
We have mentioned clearly in our documentation :
 
https://dev.mysql.com/doc/refman/8.0/en/innodb-data-encryption.html#innodb-data-encryption...
 
"The keyring plugin must be loaded prior to storage engine initialization so that the information necessary to decrypt tablespace data pages can be retrieved from tablespace headers before InnoDB initialization and recovery activities access tablespace data. (See Encryption Prerequisites.)"
[14 Aug 2020 8:07] Robert Golebiowski
Hey Mayank,

I understand your point of view. However, in this situation, although the redo log does not need applying - the check for keyring is done anyways and is causing redo log scan to abort the server startup. In this case (the redo log does not need applying), it is enough that server would mark the tablespace as missing. The redo scan would allow for missing keys and later the validation step of encrypted tables would mark the tablespace as missing (as it would discover that the encryption key is missing).