Bug #99174 Prepare transaction won't rollback if server crash after online ddl prepare stag
Submitted: 3 Apr 2020 13:51 Modified: 25 Feb 2:50
Reporter: zhang xiaojian Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DDL Severity:S1 (Critical)
Version:8.0.* OS:Any
Assigned to: CPU Architecture:Any
Tags: DDL, prepare trx

[3 Apr 2020 13:51] zhang xiaojian
Description:
An inplace online DDL may create a temp file and update the dd:Tablespace  at prepare stage.  The DML for that table is allowed after prepare stage by downgrade MLD lock.

If the server crash when 
(1). online DDL is copying data from original table to the temp table
(2). some DML prepared their transaction but not write binlog events to disk.  The transaction would not rollback after restart. So the data is inconsistent with binlog.

Only the type of DDL which update the dd::tablespace id can lead to this situation. For example drop column. 

How to repeat:
Step 1:  Add the code to sleep after online ddl prepare stage.

```cpp
--- a/sql/sql_table.cc
+++ b/sql/sql_table.cc
@@ -12791,6 +12791,8 @@ static bool mysql_inplace_alter_table(
     DEBUG_SYNC(thd, "alter_table_inplace_after_lock_downgrade");
     THD_STAGE_INFO(thd, stage_alter_inplace);

+    DBUG_EXECUTE_IF("alter_table_inplace_after_lock_sleep", sleep(100));
+
     if (table->file->ha_inplace_alter_table(altered_table, ha_alter_info,
                                             table_def, altered_table_def)) {
       goto rollback;
```

Step 2: 

```sql
set global innodb_print_ddl_logs = ON;

```

Step 3: 

```sql
mysql> CREATE DATABASE test;
Query OK, 1 row affected (0.02 sec)

mysql> USE test;
Database changed
mysql> CREATE TABLE tt (a INT PRIMARY KEY, b INT);
Query OK, 0 rows affected (0.06 sec)

mysql> insert into tt values(1,1), (2,2), (3,3), (4,4);
Query OK, 4 rows affected (0.00 sec)
```

Step 4: 

Session 1:
```sql
mysql> set @@global.debug = "+d, alter_table_inplace_after_lock_sleep";
Query OK, 0 rows affected (0.00 sec)
```

Sesssion 2:
```sql
mysql> set @@global.debug = "+d, crash_after_flush_engine_log";
Query OK, 0 rows affected (0.00 sec)
```

Session 1:
```sql
alter table tt drop column b;
```

Wait for DDL log has [DDL record: DELETE SPACE], [DDL record: REMOVE CACHE], [DDL record: FREE]. Or just wait for several seconds, make sure DDL is wait at ‘alter_table_inplace_after_lock_sleep’.

```

2020-04-03T13:24:36.204081Z 8 [Note] [MY-012473] [InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=947, thread_id=8, space_id=13, old_file_path=./test/#sql-ib1195-3326190084.ibd]
2020-04-03T13:24:36.204288Z 8 [Note] [MY-012478] [InnoDB] DDL log delete : 947
2020-04-03T13:24:36.295338Z 8 [Note] [MY-012477] [InnoDB] DDL log insert : [DDL record: REMOVE CACHE, id=948, thread_id=8, table_id=1196, new_file_path=test/#sql-ib1195-3326190084]
2020-04-03T13:24:36.295518Z 8 [Note] [MY-012478] [InnoDB] DDL log delete : 948
2020-04-03T13:24:36.296273Z 8 [Note] [MY-012472] [InnoDB] DDL log insert : [DDL record: FREE, id=949, thread_id=8, space_id=13, index_id=351, page_no=4]
2020-04-03T13:24:36.296416Z 8 [Note] [MY-012478] [InnoDB] DDL log delete : 949

```

Session 2:
mysql> insert into tt values(10,10);
ERROR 2013 (HY000): Lost connection to MySQL server during query.

Step 5:  Restart the server

```sql
mysql> select * from test.tt;
+----+------+
| a  | b    |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
| 10 |   10 |
+----+------+
5 rows in set (0.00 sec)
```

The value (10,10) is still there, but not  include in binlog files !

Suggested fix:
The reason is alter table update the dd:Tablesapce but the trx is not commit(but redo log about the change may flushed). After server restart, all prepared transaction's table is collect at 'trx_resurrent_table_ids' and open the table at 'trx_resurrect_locks'. But in trx_resurrect_locks found the table's dd::Tablespace is not exist. Because we use the old dd::Tablespace id to read, and the the trx isolation is read uncommitted.

```cpp
dd_tablespace_is_implicit {

 const bool fail = (err1 = client->acquire_uncached_uncommitted<dd::Tablespace>(
                        dd_space_id, dd_space)) ||

}

```

But the crashed online DDL has an uncommitted transaction changed the dd::Tablespace id. So the table won't be opened and won't rollback during xa recovery.

Suggested fix: 
1. If we can move the DDL recover before xa recover.
2. Chang the transaction isolation level to RC.
[3 Apr 2020 14:04] MySQL Verification Team
Hi Mr. xiaojian,

Thank you for your bug report.

First of all, changing our server  code in order to cause a bug, can not be considered a repeatable test case.

Hence, can you supply us with a test case that will present the incorrect behaviour, without any code changes ???

Next, are there any settings that are needed in order to repeat the behaviour, without changing code. I can see that it should be REPEATABLE_READ isolation level. I also guess that binary logging is on, but did you use RBR or SBR ??? Are there any other settings that are affecting this ???

Last, but not least, I do not see the release that you are using. Have you tested this with our own 8.0.19 binary, which you can download from dev.mysql.com download pages ???

Thanks in advance for your responses.
[5 Apr 2020 15:58] zhang xiaojian
Hi, Sinisa,

The description above may be a little hard to understand, let's make life easier. 

First, download the leatest version 8.0.19 which recently commit id is ea7d2e2d16ac03afdd9cb72a972a95981107bf51.

Second, copy this mtr file to mysql-test/suite/innodb/t/innodb_bug99174.test

```mtr
-- source include/have_debug.inc
--source include/have_debug_sync.inc
--source include/have_log_bin.inc

connect (con1,localhost,root,,);

connection default;

CREATE TABLE t (a INT PRIMARY KEY, b INT NOT NULL) ENGINE=InnoDB;
INSERT INTO t VALUES(1,1), (2,2), (3,3), (4,4);

FLUSH BINARY LOGS;

SHOW BINARY LOGS;

SET DEBUG_SYNC='alter_table_inplace_after_lock_downgrade WAIT_FOR m';

--send
ALTER TABLE t DROP COLUMN b;

connection con1;

--exec echo "wait" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
--echo # Crash right after flushing InnoDB redo log
SET SESSION DEBUG="+d,crash_after_flush_engine_log";
BEGIN;
INSERT INTO t VALUES(10, 10);
# 2013 - CR_SERVER_LOST
--error 2013
COMMIT;
--source include/wait_until_disconnected.inc

connection default;
--error 2013
--reap

--enable_reconnect
--echo # Restart the master server
--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
--source include/wait_until_connected_again.inc
--disable_reconnect

SELECT * FROM t;

SHOW BINARY LOGS;

# Cleanup
DROP TABLE t;

```

Third, run mtr test :

```bash
./mtr --nocheck-testcases innodb_bug99174
```

and we can generate the result file : 

```mtr result
CREATE TABLE t (a INT PRIMARY KEY, b INT NOT NULL) ENGINE=InnoDB;
INSERT INTO t VALUES(1,1), (2,2), (3,3), (4,4);
FLUSH BINARY LOGS;
SHOW BINARY LOGS;
Log_name        File_size       Encrypted
binlog.000001   732     No
binlog.000002   155     No
SET DEBUG_SYNC='alter_table_inplace_after_lock_downgrade WAIT_FOR m';
ALTER TABLE t DROP COLUMN b;
# Crash right after flushing InnoDB redo log
SET SESSION DEBUG="+d,crash_after_flush_engine_log";
BEGIN;
INSERT INTO t VALUES(10, 10);
COMMIT;
ERROR HY000: Lost connection to MySQL server during query
ERROR HY000: Lost connection to MySQL server during query
# Restart the master server
SELECT * FROM t;
a       b
1       1
2       2
3       3
4       4
10      10
SHOW BINARY LOGS;
Log_name        File_size       Encrypted
binlog.000001   732     No
binlog.000002   155     No
binlog.000003   155     No
DROP TABLE t;
[100%] innodb.innodb_bug99174                    [ pass ]   3468
```

The problem is table t have the value of (10, 10) after restart. But the binary log 'binlog.000002' does not have the events of (10, 10) insertion. 

The reason of the problem I have explained above, and I think it's a serious bug for data inconsistent of mater-slave. 

By the way, I don't agree with you that
>>  "changing our server  code in order to cause a bug,can not be considered a repeatable test case." 
 
What I changed is one more DBUG_EXECUTE_IF statement,  and many complex bug only can be repeated with DBUG_EXECUTE_IF/DBUG_SYNC...

Thanks for you fast responses.
[6 Apr 2020 12:41] MySQL Verification Team
Hello Mr. xiaojian,

I have tested your test case.

What I have got is exactly the same as you reported.

Verified as reported.
[24 Feb 13:52] Shaohua Wang
Any progress?
[24 Feb 13:54] MySQL Verification Team
Hi Mr. Wang,

Thank you for your enquiry.

However, our work finishes with bug verifications and we do not get any info on the scheduling of bug fixing.
[25 Feb 2:50] zhang xiaojian
I think it's a critical bug which cause data inconsistency, we reproduce it even in Database normal shutdown.
[25 Feb 13:03] MySQL Verification Team
Hi Mr. xiaojian,

We agree with you, which is why this bug is verified as a critical one.
[10 Mar 2:26] Shaohua Wang
Contributed patch

Attachment: bug99174.patch (application/octet-stream, text), 5.13 KiB.

[10 Mar 2:29] Shaohua Wang
The patch attached is a fix, please check.

The basic idea is resurrect locks for non-dictionary tables(non ddl operation transactions) after rollbacking or cleaning ddl transactions.
[10 Mar 13:09] MySQL Verification Team
Hi Mr. xiaojian,

Thank you for your contribution.