Description:
One of the queries in the list of pre-upgrade checks for version upgrade, related to getDanglingFulltextIndex is performing very bad when there are a lot of schema objects in the database server.
"id": "getDanglingFulltextIndex",
"title": "Tables with dangling FULLTEXT index reference",
"description": "Error: The following tables contain dangling FULLTEXT index which is not supported. It is recommended to rebuild the table before upgrade."
SELECT TABLE_SCHEMA, TABLE_NAME, CONCAT('Table `', TABLE_SCHEMA, '.', TABLE_NAME, '` contains dangling FULLTEXT index. Kindly recreate the table before upgrade.') AS RESULT FROM ( SELECT replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(ist.NAME,'/',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_SCHEMA,replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(ist.NAME,'/',-1),'#P',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_NAME FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES ist JOIN INFORMATION_SCHEMA.INNODB_SYS_COLUMNS isc ON ist.TABLE_ID = isc.TABLE_ID WHERE isc.NAME = 'FTS_DOC_ID' AND ist.TABLE_ID NOT IN ( SELECT DISTINCT TABLE_ID FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES WHERE TYPE = 32 ) ) AS R
I suspect the use of subquery is causing it to slow down, especially when queried across a large number of objects like :
1024510 objects in INFORMATION_SCHEMA.INNODB_SYS_INDEXES
5926206 in INFORMATION_SCHEMA.INNODB_SYS_COLUMNS
386748 in INFORMATION_SCHEMA.INNODB_SYS_TABLES
This is inturn causing a lot of time for the upgrade checks to complete, thereby extending the whole version upgrade process.
How to repeat:
Provision a database with a large number of database objects. For example,
SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES -- 1024510
SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_SYS_COLUMNS -- 5926206
SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES -- 386748
Execute the following query (which is executed as part of upgrade checker)
SELECT TABLE_SCHEMA, TABLE_NAME, CONCAT('Table `', TABLE_SCHEMA, '.', TABLE_NAME, '` contains dangling FULLTEXT index. Kindly recreate the table before upgrade.') AS RESULT FROM ( SELECT replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(ist.NAME,'/',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_SCHEMA,replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(ist.NAME,'/',-1),'#P',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_NAME FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES ist JOIN INFORMATION_SCHEMA.INNODB_SYS_COLUMNS isc ON ist.TABLE_ID = isc.TABLE_ID WHERE isc.NAME = 'FTS_DOC_ID' AND ist.TABLE_ID NOT IN ( SELECT DISTINCT TABLE_ID FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES WHERE TYPE = 32 ) ) AS R
This takes close to 2 hours on average on an a 32 vCPU 256 GB instance.
One such instance is as follows. It took 6422 seconds (1 Hr 47 minutes)
# Query_time: 6422.181766 Lock_time: 0.000254 Rows_sent: 69 Rows_examined: 2334414243
SET timestamp=1683584576;
SELECT TABLE_SCHEMA, TABLE_NAME, CONCAT('Table `', TABLE_SCHEMA, '.', TABLE_NAME, '` contains dangling FULLTEXT index. Kindly recreate the table before upgrade.') AS RESULT FROM ( SELECT replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(ist.NAME,'/',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_SCHEMA,replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(ist.NAME,'/',-1),'#P',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_NAME FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES ist JOIN INFORMATION_SCHEMA.INNODB_SYS_COLUMNS isc ON ist.TABLE_ID = isc.TABLE_ID WHERE isc.NAME = 'FTS_DOC_ID' AND ist.TABLE_ID NOT IN ( SELECT DISTINCT TABLE_ID FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES WHERE TYPE = 32 ) ) AS R;
Suggested fix:
Avoid Subquery and split the query as two separate queries. One to get the concatenated list of TABLE_IDs and the second to get the details of the table schema and the table name of the corresponding TABLE_IDs.
SELECT GROUP_CONCAT(C.TABLE_ID ORDER BY 1) AS TABLE_ID
FROM INFORMATION_SCHEMA.INNODB_SYS_COLUMNS C
LEFT JOIN
(
SELECT DISTINCT TABLE_ID
FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES
WHERE TYPE=32
) I ON C.TABLE_ID=I.TABLE_ID
WHERE C.NAME = 'FTS_DOC_ID'
AND I.TABLE_ID IS NULL;
SELECT
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(SUBSTRING_INDEX(NAME,'/',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_SCHEMA
,REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(SUBSTRING_INDEX(SUBSTRING_INDEX(NAME,'/',-1),'#P',1),'@0040','@'),'@0025','%'),'@0023','#'),'@0024','$'),'@0021','!'),'@005e','^'),'@002a','*'),'@003f','?'),'@002e','.'),'@003e','>'),'@003c','<'),'@007c','|'),'@002d','-'),'@002b','+'),'@002c',','),'@0020',' ') AS TABLE_NAME
FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES
WHERE TABLE_ID IN (); /*Substitute value from Previous Query */
This improves the query performance tremendously. The first Query returns results within 10-15 seconds on average and the second Query returns results within 1-2 seconds on average.
One such instance is as follows.
# Query_time: 12.488177 Lock_time: 0.000140 Rows_sent: 1 Rows_examined: 6956918
SET timestamp=1683599236;
SELECT GROUP_CONCAT(C.TABLE_ID ORDER BY 1) AS TABLE_ID
FROM INFORMATION_SCHEMA.INNODB_SYS_COLUMNS C
LEFT JOIN
(
SELECT DISTINCT TABLE_ID
FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES
WHERE TYPE=32
) I ON C.TABLE_ID=I.TABLE_ID
WHERE C.NAME = 'FTS_DOC_ID'
AND I.TABLE_ID IS NULL;
With this approach, the pre-upgrade check process can finish pretty fast and it can avoid the need to wait for hours only to know that the upgrade has failed.