Bug #99938 alter table fails and reports Incorrect key file for table t1
Submitted: 19 Jun 2020 14:31 Modified: 12 Feb 2021 14:04
Reporter: Jie Zhou Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DDL Severity:S3 (Non-critical)
Version:5.7 OS:Any
Assigned to: CPU Architecture:Any

[19 Jun 2020 14:31] Jie Zhou
Description:
SQL: alter table t1 add column ...
ERROR HY000: Incorrect key file for table 't1'; try to repair it

The error log shows the following messages:
[ERROR] InnoDB: Encryption algorithm support missing: N
[ERROR] InnoDB: Unable to read temporary file for table test/t1

BTW, I found several similar bug reports. None of them was verified because it is hard to repeat.
https://bugs.mysql.com/bug.php?id=94971
https://bugs.mysql.com/bug.php?id=90816
https://bugs.mysql.com/bug.php?id=99821

How to repeat:
Here is a repeatable test case, just run it by mtr on the latest MySQL version 5.7.30.

# online_ddl_incorrect_key_bug.test
SET @old_innodb_online_alter_log_max_size := @@innodb_online_alter_log_max_size;
SET @old_max_allowed_packet := @@max_allowed_packet;
set global innodb_online_alter_log_max_size=1342177280000;
set global max_allowed_packet=1024*1024*1024;

create table t1(d1 longblob);

--connect (con1,localhost,root,,test,,)
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));

--send alter table t1 ADD COLUMN `d2` char(128) not null default '0' after d1;

--connect (con2,localhost,root,,test,,)
--disable_query_log
let $ddl_not_finish= 1;
while ($ddl_not_finish)
{
  insert into t1(d1) values (repeat(UNHEX('000f'), 1022*3));
  let $ddl_not_finish= query_get_value("select count(1) as ddl from information_schema.innodb_trx where trx_query like '%alter table%'", ddl, 1);
}
--enable_query_log

--connection con1
--error ER_NOT_KEYFILE
--reap

--connection default
drop table t1;
SET GLOBAL innodb_online_alter_log_max_size = @old_innodb_online_alter_log_max_size;
set global max_allowed_packet = @old_max_allowed_packet;

Suggested fix:
Look at the stack:
row_log_table_apply_ops 
 -> os_file_read_no_error_handling 
  -> os_file_read_page 
   -> os_file_pread -> ... 
    -> os_file_io_complete 
     -> Encryption::decrypt

In function Encryption::decrypt, Encryption::is_encrypted_page is used to check whether page is encrypted or not.
We can see the code:
ulint	page_type = mach_read_from_2(page + FIL_PAGE_TYPE);

Actually, I find row_log during online DDL records no FIL_PAGE_TYPE information. So I suppose page_type can be any random value.

Imagine that the data here happens to be FIL_PAGE_ENCRYPTED, which is x'000f'.
Then Encryption::decrypt returns error, and the online DDL will fail.

In my testcase, I let a blob field filled by '000f' and confirm the above statement.
[19 Jun 2020 15:15] MySQL Verification Team
Hi Mr. Zhou,

Thank you for your bug report.

We tried to repeat your test case, but without success.......... As it can be seen from the output:

-----------------------------------------------------------------------------------------------

MySQL Version 5.7.30
Checking supported features...
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
Removing old var directory...
Creating var directory ''..."
Installing system database...
Using parallel: 1

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
[100%] main.bug_99938                           [ fail ]
        Test ended at 2020-06-19 18:09:51

CURRENT_TEST: main.bug_99938
mysqltest: At line 29: query 'reap' succeeded - should have failed with errno 1034...

The result from queries just before the failure was:
SET @old_innodb_online_alter_log_max_size := @@innodb_online_alter_log_max_size;
SET @old_max_allowed_packet := @@max_allowed_packet;
set global innodb_online_alter_log_max_size = 1342177280000;
set global max_allowed_packet=1024*1024*1024;
create table t1(d1 longblob);
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
alter table t1 ADD COLUMN `d2` char(128) not null default '0' after d1;;
safe_process[7274]: Child process: 7275, exit: 1

 - the logfile can be found in '............./mysql-5.7.30/mysql-test/var/log/main.bug_99938/bug_99938.log'

--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 53 seconds executing testcases

Completed: Failed 1/1 tests, 0.00% were successful.

Failing test(s): main.bug_99938

......................
[19 Jun 2020 15:43] Jie Zhou
It also depends on the performance of your machine.
We should make sure the DDL last long enough.

Please add more insert SQL as below before execute DDL in the testcase, and try again.
insert into t1(d1) values (repeat(UNHEX('000f'), 1022*1022*88));
[19 Jun 2020 15:55] Jie Zhou
If it still doesn't work. A better approach is to use sysbench to issue the insert SQL concurrently during DDL.
The goal is to make online log large enough.

Also, please take some time to read my analyze in Suggested fix section.
[22 Jun 2020 12:51] MySQL Verification Team
Hi Mr. Zhou,

I have tried adding hundreds of those statements. I have also used sysbench to run in parallel with thousand of those statements. 

Still, the test succeeds without a single error.

I am trying now with tens of thousands of statements. If you do not here from me, that means that I still can not repeat the problem that you observe.
[22 Jun 2020 15:51] MySQL Verification Team
I could repeat it first time I tried it manually.
1.  create table and insert those 5 large rows.
2.  mysqlslap in 1 thread "insert into t1(d1) values (repeat(UNHEX('000f'), 1022*3));"
3.  execute ddl:

mysql> alter table t1 ADD COLUMN `d2` char(128) not null default '0' after d1;
ERROR 1034 (HY000): Incorrect key file for table 't1'; try to repair it
mysql> 

Error log:

Version: '5.7.32'  socket: '/tmp/mysql.sock'  port: 3306  (Built on 16 June 2020, commit 132a1a7e7c7e8352c4c29083b446d23074791e87 with g++ (GCC) 10.0.1 20200423 (experimental))
2020-06-22T15:47:32.878959Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5138ms. The settings might not be optimal. (flushed=1766 and evicted=0, during the time.)
2020-06-22T15:47:44.012841Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5039ms. The settings might not be optimal. (flushed=1867 and evicted=0, during the time.)
2020-06-22T15:47:53.070109Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6139ms. The settings might not be optimal. (flushed=2005 and evicted=0, during the time.)
2020-06-22T15:48:06.172257Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6198ms. The settings might not be optimal. (flushed=2007 and evicted=0, during the time.)
2020-06-22T15:48:22.251631Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5473ms. The settings might not be optimal. (flushed=1400 and evicted=0, during the time.)
2020-06-22T15:48:38.133900Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5339ms. The settings might not be optimal. (flushed=1380 and evicted=0, during the time.)
2020-06-22T15:48:46.629315Z 3 [ERROR] InnoDB: Encryption algorithm support missing: N
2020-06-22T15:48:46.629363Z 3 [ERROR] InnoDB: Unable to read temporary file for table test/t1

=========

mysql> check table t1;
+---------+-------+----------+----------+
| Table   | Op    | Msg_type | Msg_text |
+---------+-------+----------+----------+
| test.t1 | check | status   | OK       |
+---------+-------+----------+----------+
1 row in set (0.58 sec)

mysql> show table status like 't1' \G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 59
 Avg_row_length: 26765346
    Data_length: 1579155456
Max_data_length: 0
   Index_length: 0
      Data_free: 7340032
 Auto_increment: NULL
    Create_time: 2020-06-22 16:40:42
    Update_time: 2020-06-22 17:48:51
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.00 sec)

mysql> select count(*) from t1;
+----------+
| count(*) |
+----------+
|    79874 |
+----------+
1 row in set (0.05 sec)
[23 Jun 2020 12:06] MySQL Verification Team
Hi Mr. Zhou,

We were finally being able to repeat your test case.

Verified as reported.
[23 Jun 2020 12:43] MySQL Verification Team
I would like to add a note that I have tested this bug on the latest 8.0.

So, this is a bug affecting both latest 5.7 and latest 8.0.
[23 Jun 2020 15:02] MySQL Verification Team
In 8.0 there's no such problem for me. column got instantly added.
[3 Aug 2020 12:50] Anan R
Even i am also facing this bug in our of the environment. What the fix of this issue please as its affecting our environment badly.
[12 Feb 2021 14:04] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 8.0.24 release, and here's the proposed changelog entry from the documentation team:

An online ALTER TABLE operation failed with an Incorrect key file for
table error due to an unnecessary encryption status check that was
performed when reading online DDL row logs.
[12 Feb 2021 14:42] MySQL Verification Team
Thank you, Daniel.
[20 Apr 2021 12:33] Haixing Weng
Hi, Daniel! Can't find release notes on 8.0.24 release notes on mysql documentation.
[15 Apr 2023 10:41] Przemyslaw Malkowski
Interestingly, there is a note the bug was fixed as of 5.7.34:
https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-34.html
but the corresponding 8.0 release does not have it:
https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-24.html
[19 Apr 2023 12:19] MySQL Verification Team
Hi,

That only means that a bug does not exist any more in 8.0.
[20 Apr 2023 13:12] Jon Stephens
This bug is also fixed in MySQL 8.0.24. Changelog entry as above.

No other changes.
[20 Apr 2023 13:20] MySQL Verification Team
Thank you, Jon .....