Description:
In the load stage of ddl prcocess, it use multiple threads to merge-sort task, but it failed to create threads because of insufficient memory or other reason, the load stage will be stuck.
I dive into the source code of ddl0loader.cc. When thread creation fails, it will break the for loop and call thread_create_failed function, which reduces the m_n_threads counter, but the remaining threads are skipped and the counter only counts the first failed threads. So m_n_idle will never equal or exceed m_n_threads, it will wait forever.
How to repeat:
It is not easy to reproduce the thread creation failure, so I add a debug sync point in ddl0loader.cc to simulate the failure scenario. The following code is the diff patch:
```code
diff --git a/storage/innobase/ddl/ddl0loader.cc b/storage/innobase/ddl/ddl0loader.cc
index c912f815ce1..35800f48344 100644
--- a/storage/innobase/ddl/ddl0loader.cc
+++ b/storage/innobase/ddl/ddl0loader.cc
@@ -329,6 +329,11 @@ dberr_t Loader::load() noexcept {
for (size_t i = 1; i < m_ctx.m_max_threads; ++i) {
try {
+ IF_ENABLED("ddl_create_thread_fail", {
+ if (i == 1) {
+ throw "DDL simulate create thread failure";
+ }
+ })
threads.push_back(std::thread{fn, i});
} catch (...) {
ib::warn(ER_DDL_MSG_1);
@@ -414,6 +419,7 @@ dberr_t Loader::scan_and_build_indexes() noexcept {
SYNC_POINT_ADD(thd, "ddl_btree_build_interrupt");
SYNC_POINT_ADD(thd, "ddl_btree_build_sleep");
SYNC_POINT_ADD(thd, "ddl_btree_build_insert_return_interrupt");
+ SYNC_POINT_ADD(thd, "ddl_create_thread_fail");
auto cleanup = [&]() {
for (auto &s : sync_points) {
```code
and then run the following sql statements:
```sql
CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`k` int NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
`lob_data` longblob,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO t1(k,c,pad,lob_data) VALUES(1,"a","b", repeat('a', 819));
INSERT INTO t1(k,c,pad,lob_data) VALUES(2,"a","b", repeat('b', 400));
INSERT INTO t1(k,c,pad,lob_data) VALUES(5,"c","z", repeat('c', 1024));
INSERT INTO t1(k,c,pad,lob_data) VALUES(2,"c","d", repeat('d', 32));
INSERT INTO t1(k,c,pad,lob_data) VALUES(4,"b","x", repeat('e', 768));
INSERT INTO t1(k,c,pad,lob_data) SELECT k,c,pad,lob_data from t1;
INSERT INTO t1(k,c,pad,lob_data) SELECT k,c,pad,lob_data from t1;
INSERT INTO t1(k,c,pad,lob_data) SELECT k,c,pad,lob_data from t1;
INSERT INTO t1(k,c,pad,lob_data) SELECT k,c,pad,lob_data from t1;
INSERT INTO t1(k,c,pad,lob_data) SELECT k,c,pad,lob_data from t1;
INSERT INTO t1(k,c,pad,lob_data) SELECT k,c,pad,lob_data from t1;
SET SESSION DEBUG = '+d,ddl_create_thread_fail';
ALTER TABLE t1 ADD INDEX idx_k(k);
```
the ddl statement will be blocked and you will get the following warning message in error log
```
27 2023-05-28T04:42:10.795465Z 13 [Warning] [MY-013778] [InnoDB] DDL failed to create a thread to load an index, fall back to single thread
```
```
Suggested fix:
Don`t break the creating threads loop, just continue to create the remaining threads, So the m_n_threads and m_n_idle will be in a consistent state.