Description:
It is easy to crash InnoDB:
1) run a DDL statement
2) make that statement fail because there are too many (1023) undo slots in use
Given that InnoDB does not use the error injection facility in mtr, my guess is that there are no deterministic error injection tests for it. Code inspection also shows a few too many assert (ut_a) statements on error paths.
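To illustrate what a deterministic injection point could look like, here is a minimal standalone sketch. Everything in it (the inj_ names, the "undo_slots_exhausted" point, the error enum) is hypothetical and not InnoDB code. It only shows the idea of forcing the "too many undo slots" failure on demand instead of needing 1023 slots to be in use. The server's dbug facility has DBUG_EXECUTE_IF for this purpose in debug builds; the sketch deliberately avoids depending on it so that it stays self-contained.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical error codes; stand-ins for InnoDB's DB_* values. */
enum inj_err { INJ_DB_SUCCESS = 0, INJ_DB_TOO_MANY_TRXS = 1 };

/* Hypothetical: name of the injection point a test has enabled, or NULL. */
static const char *inj_active_point = NULL;

/* Returns 1 if the named injection point is currently enabled. */
static int inj_enabled(const char *point)
{
    return inj_active_point != NULL
        && strcmp(inj_active_point, point) == 0;
}

/* Models undo slot assignment. With the injection point enabled it
   fails deterministically, which is what a test would rely on. */
static enum inj_err inj_assign_undo_slot(void)
{
    if (inj_enabled("undo_slots_exhausted")) {
        return INJ_DB_TOO_MANY_TRXS;
    }
    return INJ_DB_SUCCESS;
}
```

A test would enable the point, run the DDL statement, and check that the server returns an error instead of crashing.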
The stack for my crash is below. There is no query text because of another bug in 5.1 that prevented the query text from being dumped in most crashes.
0x7be9d2 que_eval_sql + 290
0x7c7aa1 row_merge_drop_index + 113
0x7c7b74 row_merge_drop_indexes + 68
0x786111 _ZN11ha_innobase9add_indexEP8st_tableP6st_keyj + 2689
0x6f13be _Z17mysql_alter_tableP3THDPcS1_P24st_ha_create_informationP10TABLE_LISTP10Alter_infojP8st_orderb + 6078
0x5f5a00 _Z21mysql_execute_commandP3THDPy + 11664
0x5faa23 _Z11mysql_parseP3THDPcjPPKcPy + 803
0x5fbf5c _Z16dispatch_command19enum_server_commandP3THDPcj + 5276
0x5fc5b3 _Z10do_commandP3THD + 275
0x5ebeaa handle_one_connection + 1994
0x375ae062f7 _end + 1465968927
0x375a6d1e3d _end + 1458414693
How to repeat:
I think I hit the ut_a assert at the start of que_eval_sql, shown below. Note that when que_eval_sql returns something other than DB_SUCCESS, trx->error_state has been assigned that value.
que_eval_sql(
/*=========*/
    pars_info_t*    info,   /*!< in: info struct, or NULL */
    const char*     sql,    /*!< in: SQL string */
    ibool           reserve_dict_mutex,
                            /*!< in: if TRUE, acquire/release
                            dict_sys->mutex around call to pars_sql. */
    trx_t*          trx)    /*!< in: trx */
{
    que_thr_t*      thr;
    que_t*          graph;

    ut_a(trx->error_state == DB_SUCCESS);
    ...
    return(trx->error_state);
}
row_merge_drop_index has this assert that worries me, but I don't think it caused a crash in this case:
err = que_eval_sql(info, str1, FALSE, trx);
ut_a(err == DB_SUCCESS);
What I think happened is that trx->error_state != DB_SUCCESS on entry to the row_merge_drop_indexes call. Then row_merge_drop_indexes calls row_merge_drop_index, which calls que_eval_sql. Since trx->error_state != DB_SUCCESS on entry, the assert at the start of que_eval_sql is raised. This matches the stack above.
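The suspected sequence can be modeled in a few lines. This is only a sketch under simplified assumptions: the model_ types and functions are hypothetical stand-ins for trx_t, the DB_* error codes, and que_eval_sql, and the entry check returns a flag instead of aborting so that the behavior can be observed.

```c
#include <assert.h>

/* Hypothetical stand-ins for InnoDB's error codes and trx_t. */
enum model_err { MODEL_DB_SUCCESS = 0, MODEL_DB_ERROR = 1 };

struct model_trx {
    enum model_err error_state;
};

/* Models the entry check in que_eval_sql. Returns 1 when
   ut_a(trx->error_state == DB_SUCCESS) would fire. */
static int model_entry_assert_fires(const struct model_trx *trx)
{
    return trx->error_state != MODEL_DB_SUCCESS;
}

/* Models an earlier statement that failed: the error is stored in
   trx->error_state and also returned, as described above. */
static enum model_err model_failing_statement(struct model_trx *trx,
                                              enum model_err err)
{
    trx->error_state = err;
    return trx->error_state;
}
```

After model_failing_statement has run, model_entry_assert_fires returns 1 for the same transaction until something resets error_state, which is the state the cleanup path would be entered in.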
This code in add_index (ha_innobase::add_index in the stack above) calls row_merge_drop_indexes on an error. I assume that trx->error_state can be != DB_SUCCESS at that point, which would explain the assert.
if (!new_primary) {
    error = row_merge_rename_indexes(trx, indexed_table);
    if (error != DB_SUCCESS) {
        row_merge_drop_indexes(trx, indexed_table,
                               index, num_created);
    }
Later in add_index, there is code that does reset trx->error_state before calling row_merge_drop_indexes:
default:
    trx->error_state = DB_SUCCESS;
    if (new_primary) {
        if (indexed_table != innodb_table) {
            row_merge_drop_table(trx, indexed_table);
        }
    } else {
        if (!dict_locked) {
            row_mysql_lock_data_dictionary(trx);
            dict_locked = TRUE;
        }
        row_merge_drop_indexes(trx, indexed_table,
                               index, num_created);
    }
Suggested fix:
1) Add error injection tests for InnoDB to mtr
2) Clear trx->error_state in error handlers before calling functions that call que_eval_sql
3) Confirm that it is OK to assert that the return value from que_eval_sql == DB_SUCCESS
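Suggestion 2 could take roughly the following shape. This is a sketch, not a patch: the sketch_ names are hypothetical, and the cleanup callback stands in for a function such as row_merge_drop_indexes that ends up in que_eval_sql. The helper remembers the original error so it can still be reported to the caller, clears error_state, and only then runs the cleanup.

```c
#include <assert.h>

/* Hypothetical stand-ins for InnoDB's error codes and trx_t. */
enum sketch_err { SKETCH_DB_SUCCESS = 0, SKETCH_DB_ERROR = 1 };

struct sketch_trx {
    enum sketch_err error_state;
};

typedef void (*sketch_cleanup_fn)(struct sketch_trx *trx);

/* Clears trx->error_state before running the cleanup, and returns the
   error that was pending so the caller can still report it. */
static enum sketch_err sketch_run_cleanup(struct sketch_trx *trx,
                                          sketch_cleanup_fn cleanup)
{
    enum sketch_err original = trx->error_state;

    trx->error_state = SKETCH_DB_SUCCESS;
    cleanup(trx);
    return original;
}

/* Example cleanup with the same precondition as que_eval_sql. */
static void sketch_drop_indexes(struct sketch_trx *trx)
{
    assert(trx->error_state == SKETCH_DB_SUCCESS);
}
```

Routing every error handler through one helper like this would make the reset hard to forget, which appears to be what happened on the row_merge_rename_indexes failure path.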
Crashes during DDL are very painful. DDL isn't atomic in MySQL, so the MySQL and InnoDB dictionaries can easily get out of sync, and fixing that requires manual intervention. DDL operations are also frequently very long running, and nobody wants to repeat a long-running DDL statement.