Bug #87225 assert failure: !sdi_insert_failed || trx_is_interrupted(trx)
Submitted: 27 Jul 2017 15:38 Modified: 23 Oct 2017 18:07
Reporter: Naga Satyanarayana Bodapati Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0.3 OS:Any
Assigned to: CPU Architecture:Any

[27 Jul 2017 15:38] Naga Satyanarayana Bodapati
Description:
The assert failure is seen in WL#9536 branch with RQG tests tablespace grammar test.

Pre-conditions:

1. All tables belong to same schema
2. All tables in same tabespace
3. Concurrently executed DDLs

Examples:
conn1:
CREATE TABLE test.t1(a INT) ENGINE=InnoDB TABLESPACE='ts1';
conn2:
CREATE TABLE test.t2(a INT) ENGINE=InnoDB TABLESPACE='ts1';
conn3:
CREATE TABLE test.t3(a INT) ENGINE=InnoDB TABLESPACE='ts1';

In this example,  every create table serializes two SDIs, TABLE SDI (dd::Table) & Schema SDI (dd::Schema)

Since all these tables are in the tablespace ts1, all CREATE TABLES will try to insert the same 'test' schema SDI
into ts1 tablespace. This leads to lock wait timeout or even deadlock. (If interested, I can give mtr testcase
that simulates lock_wait_timeout/deadlock with a simple table)

Problems:
-------------
1. One long running DDL on table in tablespace, can block another unrelated table DDL as long as they belong to same schema.
2. DDLs on two unrelated tables (but with same schema), one will fail and gets rolledback
3. DDLs on two unrelated tables (but with same schema), are now serialized on inserting schema SDI

Lets take a step back and see why are inserting schema SDI along with table?

1.  When we create schema, there is no tablespace or table, we create schema SDI and store it in a file.
2.  Storing schema along with table, helps us in creating the schema automatically on  IMPORT

As such, there is no relation between schema and tablespace. The only relation is that a table in tablespace
belongs to some schema.

Ideally, we should serialize only the objects created as part of DDL trx. So

1) a table creation should ideally insert only table SDI
2) a tablespace creation should only serialize tablespace SDI. (for file per table tablespace,
we create both table & tablespace, so both should be serialized).
3) schema creation should serialize schema SDI

Proposed solutions:
--------------------------

1. Don't store schema SDI along with Table SDI.

    If the schema SDI's are not stored, on IMPORT, we either create the referred schema's with default collation
    or we report error to user and ask user to create all required schema's first

    One minor problem with auto-creation of schema during IMPORT is that, the schema will be created with default
    utf8mb4 collation. So, all newly created tables in that schema, will use 'utf8mb4'.

2. Don't store schema SDI along with Table SDI, but serialize to tablespace only on export.

    This is variation to (1), table creation will not serialize the schema SDI but on export, server has to
    get all the schemas of all the tables in tablespace, store all schema SDIs in tabelspace.

    In this approach, the original schema attributes(like collation) are maintained. The extra work of serialzing all
    schema of tables in tablespace is done only when we need it, i.e during EXPORT.

How to repeat:
run RQG tablespace grammar

Suggested fix:
See proposed solution in description
[27 Jul 2017 17:16] Naga Satyanarayana Bodapati
Posted by developer:
 
while waiting for reply from runtime team:

This is the workaround:

diff --git a/storage/innobase/api/api0api.cc b/storage/innobase/api/api0api.cc
index bfdad8c0484..2629fc23441 100644
--- a/storage/innobase/api/api0api.cc
+++ b/storage/innobase/api/api0api.cc
@@ -3272,6 +3272,64 @@ ib_sdi_open_table(
 	return(err);
 }
 
+
+/* TODO: remove this hack after discussing with runtime team */
+static
+dberr_t
+ib_sdi_insert_schema(
+	uint32_t		space_id,
+	const ib_sdi_key_t*	ib_sdi_key,
+	uint32_t		uncomp_len,
+	uint32_t		comp_len,
+	const void*		sdi)
+{
+	/* use separate trx for schema SDI */
+	/* do non-locking MVCC read to see schema SDI already exists */
+	/* If exists, just return, we never update schema SDI */
+	/* If doesn't exist and we get DB_LOCK_WAIT/DB_DEADLOCK due to
+	concurrent inserts of same schema SDI, retry until success */
+	trx_t* trx = trx_allocate_for_mysql();
+	trx_start_internal(trx);
+
+	ib_crsr_t	ib_crsr = NULL;
+	ib_err_t	err = ib_sdi_open_table(
+		space_id, trx, &ib_crsr);
+
+	if (err != DB_SUCCESS) {
+		trx_commit_for_mysql(trx);
+		trx_free_for_mysql(trx);
+		return(err);
+	}
+
+	ib_tpl_t	schema_tuple = ib_sdi_create_insert_tuple(
+		ib_crsr, ib_sdi_key->sdi_key, uncomp_len,
+		comp_len, sdi);
+
+	ib_cursor_set_lock_mode(ib_crsr, IB_LOCK_X);
+	ib_tpl_t	key_tpl = ib_sdi_create_search_tuple(
+		ib_crsr, ib_sdi_key->sdi_key);
+
+	ib_cursor_set_match_mode(ib_crsr, IB_EXACT_MATCH);
+	ib_cursor_set_lock_mode(ib_crsr, IB_LOCK_NONE);
+	err = ib_cursor_moveto(ib_crsr, key_tpl, IB_CUR_LE, 0);
+
+	if (err == DB_RECORD_NOT_FOUND) {
+		do {
+			/* Do insert. handle lock wait timeout and deadlock */
+			err = ib_cursor_insert_row(ib_crsr, schema_tuple);
+		} while (err == DB_LOCK_WAIT_TIMEOUT || err == DB_DEADLOCK);
+	}
+
+	ut_ad(err == DB_SUCCESS || err == DB_DUPLICATE_KEY);
+
+	ib_tuple_delete(key_tpl);
+	ib_tuple_delete(schema_tuple);
+	ib_cursor_close(ib_crsr);
+	trx_commit_for_mysql(trx);
+	trx_free_for_mysql(trx);
+	return(DB_SUCCESS);
+}
+
 /** Insert/Update SDI in tablespace
 @param[in]	tablespace_id	tablespace id
 @param[in]	ib_sdi_key	SDI key to uniquely identify the tablespace
@@ -3300,6 +3358,11 @@ ib_sdi_set(
 			<< " sdi_len: " << comp_len;
 	);
 
+	/* TODO: remove this hack for handling Schema SDI */
+	if (ib_sdi_key->sdi_key->type == 0) {
+		return(ib_sdi_insert_schema(tablespace_id, ib_sdi_key, uncomp_len, comp_len, sdi));
+	}
+
 	ib_crsr_t	ib_crsr = NULL;
 	ib_err_t	err = ib_sdi_open_table(
 		tablespace_id, trx, &ib_crsr);
[27 Sep 2017 14:07] Naga Satyanarayana Bodapati
Posted by developer:
 
Requesting re-triage:
1. It is no longer WL bug(so SRFEATURE doesn't make sense)
2. It is 8.0 & trunk bug now

Last I heard from Dyre:

"
Hi Satya,

we have decided that we want to stop writing schema SDIs altogether.
As soon as we get the approval from Rune, will file a bug for this and work to get that into RC2.
"

As part of this bug fix, workaround has to removed:

See storage/innobase/api/api0api.cc
ib_sdi_set():

   /* TODO: remove this workaround for handling Schema SDI
        (Bug#26539665) */
        if (ib_sdi_key->sdi_key->type == 0) {
                return(ib_sdi_insert_schema(tablespace_id, ib_sdi_key,
                                            uncomp_len, comp_len, sdi));
        }

Remove ib_sdi_insert_schema() completely.
[23 Oct 2017 18:07] Paul Dubois
Posted by developer:
 
Fixed in 8.0.4, 9.0.0.

Parallel inserts of schema SDI into the SDI B-tree could raise an
assertion when creating tables in the same schema in parallal.