Description:
As we use external XA transaction on production, and we found a case that the crash recovery loops in function xa::recovery::recover_one_ht
the error log looks like this:
2023-01-29T19:56:06.450836+08:00 0 [Note] [MY-013032] [InnoDB] Starting recovery for XA transactions...
2023-01-29T19:56:07.306047+08:00 0 [Note] [MY-013035] [InnoDB] 65536 transactions in prepared state after recovery
2023-01-29T19:56:07.348717+08:00 0 [Note] [MY-010224] [Server] Found 65536 prepared transaction(s) in InnoDB
2023-01-29T19:56:07.403739+08:00 0 [Note] [MY-013032] [InnoDB] Starting recovery for XA transactions...
2023-01-29T19:56:08.261776+08:00 0 [Note] [MY-013035] [InnoDB] 65536 transactions in prepared state after recovery
2023-01-29T19:56:08.304281+08:00 0 [Note] [MY-010224] [Server] Found 65536 prepared transaction(s) in InnoDB
2023-01-29T19:56:08.358430+08:00 0 [Note] [MY-013032] [InnoDB] Starting recovery for XA transactions...
2023-01-29T19:56:09.214465+08:00 0 [Note] [MY-013035] [InnoDB] 65536 transactions in prepared state after recovery
2023-01-29T19:56:09.261115+08:00 0 [Note] [MY-010224] [Server] Found 65536 prepared transaction(s) in InnoDB
2023-01-29T19:56:09.320649+08:00 0 [Note] [MY-013032] [InnoDB] Starting recovery for XA transactions...
2023-01-29T19:56:10.181868+08:00 0 [Note] [MY-013035] [InnoDB] 65536 transactions in prepared state after recovery
2023-01-29T19:56:10.227623+08:00 0 [Note] [MY-010224] [Server] Found 65536 prepared transaction(s) in InnoDB
2023-01-29T19:56:10.295218+08:00 0 [Note] [MY-013032] [InnoDB] Starting recovery for XA transactions...
The recovered XA transaction is 71184, which is larger than info->len in xa::recovery::recover_one_ht(65536)
Let's check how info->len is determined:
in function ha_recover:
```
for (info.len = MAX_XID_LIST_SIZE;
info.list == nullptr && info.len > MIN_XID_LIST_SIZE; info.len /= 2) {
info.list = new (std::nothrow) XA_recover_txn[info.len];
}
```
It means the info.len could be 128 which in lack of memory.
And let's check the code in function xa::recovery::recover_one_ht:
It'll get prepared transactions from engine layer, suppose we have 200 XA transactions in prepared state. since the info.len is 128, it only gets 128 trxs from engine and add to xa cache.
It'll loop again, but actually the prepared XA transaction is not removed from trx list in innodb, it'll gain same 128 transactions, then loop and loop forever...
How to repeat:
In case to easily repeat the case, let's slightly change the code:
```
diff --git a/mysql-8.0.32/sql/xa.cc b/mysql-8.0.32/sql/xa.cc
index 63491b4..5db15bf 100644
--- a/mysql-8.0.32/sql/xa.cc
+++ b/mysql-8.0.32/sql/xa.cc
@@ -308,11 +308,16 @@ int ha_recover(Xid_commit_list *commit_list, Xa_state_list *xa_list) {
tc_heuristic_recover = TC_HEURISTIC_RECOVER_ROLLBACK; // forcing ROLLBACK
info.dry_run = false;
}
-
+#if 0
for (info.len = MAX_XID_LIST_SIZE;
info.list == nullptr && info.len > MIN_XID_LIST_SIZE; info.len /= 2) {
info.list = new (std::nothrow) XA_recover_txn[info.len];
}
+#else
+ info.len = 4;
+ info.list = new (std::nothrow) XA_recover_txn[info.len];
+#endif
+
if (!info.list) {
LogErr(ERROR_LEVEL, ER_SERVER_OUTOFMEMORY,
static_cast<int>(info.len * sizeof(XID)));
```
Let's hardcode info.len to 4, and then prepare 5 xa transactions:
root@(none) 07:49:02>xa recover;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
| 1 | 3 | 0 | 567 |
| 1 | 3 | 0 | 456 |
| 1 | 3 | 0 | 345 |
| 1 | 3 | 0 | 234 |
| 1 | 3 | 0 | 123 |
+----------+--------------+--------------+------+
5 rows in set (0.00 sec)
then kill the server and restart, check error log:
2023-01-29T11:50:13.437808Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437813Z 0 [Note] [MY-013035] [InnoDB] 4 transactions in prepared state after recovery
2023-01-29T11:50:13.437818Z 0 [Note] [MY-010224] [Server] Found 4 prepared transaction(s) in InnoDB
2023-01-29T11:50:13.437825Z 0 [Note] [MY-013032] [InnoDB] Starting recovery for XA transactions...
2023-01-29T11:50:13.437830Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328007 in prepared state after recovery
2023-01-29T11:50:13.437835Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437841Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328005 in prepared state after recovery
2023-01-29T11:50:13.437846Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437852Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328003 in prepared state after recovery
2023-01-29T11:50:13.437858Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437864Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328001 in prepared state after recovery
2023-01-29T11:50:13.437869Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437874Z 0 [Note] [MY-013035] [InnoDB] 4 transactions in prepared state after recovery
2023-01-29T11:50:13.437879Z 0 [Note] [MY-010224] [Server] Found 4 prepared transaction(s) in InnoDB
2023-01-29T11:50:13.437886Z 0 [Note] [MY-013032] [InnoDB] Starting recovery for XA transactions...
2023-01-29T11:50:13.437891Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328007 in prepared state after recovery
2023-01-29T11:50:13.437896Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437902Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328005 in prepared state after recovery
2023-01-29T11:50:13.437907Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437914Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328003 in prepared state after recovery
2023-01-29T11:50:13.437919Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
2023-01-29T11:50:13.437925Z 0 [Note] [MY-013033] [InnoDB] Transaction 497328001 in prepared state after recovery
2023-01-29T11:50:13.437930Z 0 [Note] [MY-013034] [InnoDB] Transaction contains changes to 1 rows
It loops forever and can't startup the instance
Suggested fix:
tag the transaction that has been scanned and ignore it in next loop