MySQL Bugs: #68498: can online ddl for innodb be more online?

Bug #68498	can online ddl for innodb be more online?
Submitted:	26 Feb 2013 18:31	Modified:	4 Mar 2013 7:57
Reporter:	Mark Callaghan
Status:	Verified
Category:	MySQL Server: InnoDB storage engine	Severity:	S4 (Feature request)
Version:	5.6.10	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
See http://mysqlha.blogspot.com/2013/02/mysql-56-online-ddl-for-busy-tables.html

Can we enhance InnoDB to apply most logged changes before getting an X lock and blocking concurrent changes? From the code I browsed, it blocks all concurrent changes while applying all logged changes. Facebook OSC applied most logged changes before doing that.

How to repeat:
read the code, especially row_log_apply

Suggested fix:
apply most of the logged changes before getting an X lock

I think I was wrong and this is a bogus request. After reading more code, it looks like the X lock obtained by row_log_apply is occasionally released by row_log_apply_ops. Now that worklogs are published, it would help the community if major features like this were described in detail in blog posts.

This is not a completely bogus request.

We could investigate if we can release and reacquire the index->lock more often. I initially wanted to hold it continuously for the last block application, because I thought that the log apply (ALTER TABLE thread) could require random access I/O (slow), while the DML threads would just do fast sequential writes to the log. That is, I feared that the log apply could fail to keep up with the DML on the last block. This assumption was never tested; maybe we can lift this limitation.
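To make the "apply most of the log before blocking DML" idea concrete, here is a toy model in Python. It is not InnoDB code: `apply_online_log`, `final_block_size` and the use of a plain `threading.Lock` in place of index->lock are all assumptions made for illustration. Writers append to the log while holding the lock; the applier detaches batches under the lock but applies them after releasing it, and only the small final block is applied while writers are blocked.

```python
import threading
from collections import deque


def apply_online_log(log, lock, apply_one, final_block_size=4):
    """Toy model: drain a change log, blocking writers only for the last block.

    `log` is a deque that writers (DML) append to while holding `lock`.
    Returns how many entries were applied while writers were blocked.
    All names here are hypothetical, not InnoDB identifiers.
    """
    while True:
        with lock:
            if len(log) <= final_block_size:
                # Small backlog left: apply it while still holding the lock,
                # so no new changes can be logged behind our back.
                applied_locked = len(log)
                while log:
                    apply_one(log.popleft())
                return applied_locked
            # Large backlog: detach it under the lock, apply it unlocked.
            batch = [log.popleft() for _ in range(len(log))]
        for entry in batch:
            # Writers may append new entries to `log` while this runs.
            apply_one(entry)
```

The loop only terminates if the applier drains entries faster than writers add them, which is exactly the concern raised above about the log apply keeping up with DML on the last block.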

Another area of improvement inside InnoDB is that we should calculate index cardinality statistics while loading the sorted index entries. This would save some I/O, which is currently being performed under an exclusive meta-data lock in ha_innobase::commit_inplace_alter_table().
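A rough sketch of what "calculate statistics while loading" could mean: because the entries arrive sorted, the number of distinct values for every key prefix can be counted in the same pass that builds the index. The function name and the tuple representation below are hypothetical, and the sketch produces exact counts, whereas InnoDB's statistics are normally sampling-based estimates.

```python
def load_sorted_with_stats(sorted_entries, n_key_cols):
    """Single pass over sorted index entries, counting distinct key prefixes.

    Returns (n_rows, n_diff), where n_diff[i] is the number of distinct
    values of the first i + 1 key columns. Hypothetical helper, not InnoDB API.
    """
    n_rows = 0
    n_diff = [0] * n_key_cols
    prev = None
    for entry in sorted_entries:
        key = tuple(entry[:n_key_cols])
        if prev is None:
            first_change = 0
        else:
            # Find the first column where this key differs from the previous.
            first_change = n_key_cols
            for i in range(n_key_cols):
                if key[i] != prev[i]:
                    first_change = i
                    break
        # Every prefix that includes the changed column is a new distinct value.
        for i in range(first_change, n_key_cols):
            n_diff[i] += 1
        prev = key
        n_rows += 1
        # A real loader would also write `entry` to an index page here.
    return n_rows, n_diff
```

For example, the sorted entries (1,'a'), (1,'a'), (1,'b'), (2,'a') have 2 distinct values for the first column and 3 for the two-column prefix.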

A third improvement could be in the MDL upgrade, but I am not sure if that would lead to starvation (more frequent MDL upgrade failure). We currently do not allow newcomers to take the lock while attempting to upgrade the MDL for the commit phase of ALTER. If we allowed that, we would be 'more online', but the ALTER might have to be rolled back more often due to lock upgrade timeout. 
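The trade-off can be illustrated with a small deterministic model. This is only a sketch under simplifying assumptions (every transaction holds a shared lock for a known interval, and the upgrade is granted the moment no holder remains); `mdl_upgrade_wait` and its parameters are hypothetical names, not part of the MySQL MDL subsystem.

```python
def mdl_upgrade_wait(holders_end, newcomers, timeout, block_newcomers):
    """Model the wait for an exclusive upgrade that starts at time 0.

    holders_end: end times of transactions already holding the shared lock.
    newcomers:   (start, end) pairs for transactions arriving after time 0.
    Returns the time at which the upgrade is granted, or None on timeout.
    """
    free_at = max(holders_end, default=0.0)
    if not block_newcomers:
        for start, end in sorted(newcomers):
            # A newcomer arriving before the table is free gets the shared
            # lock and may push back the moment the upgrade can happen.
            if start < free_at:
                free_at = max(free_at, end)
    return free_at if free_at <= timeout else None
```

With one existing holder ending at 2.0, newcomers (1.0, 5.0) and (4.0, 12.0), and a timeout of 10.0, blocking newcomers grants the upgrade at 2.0, while admitting them makes the upgrade time out in this model.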

It could also be worth investigating if we could merge the MDL and the InnoDB internal table locks and somehow get the MDL upgrade done more easily, but this is definitely out of scope for the 5.6 release.

Hello!

I am not sure that an MDL upgrade that allows starvation makes much sense.

What makes more sense, in my opinion, is an option to do an "aggressive" upgrade which, instead of waiting for the transactions that used this table to finish, aborts them, thus dramatically reducing upgrade time in situations where there are a lot of long-running transactions.

This bug is for tracking the InnoDB changes. The meta-data locking change (say, LOCK=NONE_WITH_AGGRESSIVE_UPGRADE) would have to be filed as a separate bug, if it is deemed useful.

"Another area of improvement inside InnoDB is that we should calculate index cardinality statistics while loading the sorted index entries."

Before jumping into doing that I think it would make sense to check what would be the theoretical maximum we could get by this improvement. IMO it could turn out to not have any practical benefit.

To do this we could completely disable the stats gathering that the current code does after the index load. The improvement we would get by gathering the stats while loading the index will be smaller than that.
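The bound described here is simple arithmetic: time the ALTER with and without the post-load stats gathering, and the difference is the most that gathering stats during the load could ever save. A minimal sketch, with a hypothetical function name and made-up timings used purely as an example:

```python
def max_gain_from_inline_stats(t_with_stats, t_without_stats):
    """Upper bound on the saving from gathering stats during the index load.

    t_with_stats:    measured ALTER time with stats gathered afterwards.
    t_without_stats: measured ALTER time with stats gathering disabled.
    Returns (seconds saved at most, fraction of total time saved at most).
    """
    saved = t_with_stats - t_without_stats
    return saved, saved / t_with_stats


# Made-up numbers: a 100 s ALTER that takes 90 s with stats disabled can
# improve by at most 10 s (10%), and in practice by less than that.
bound_seconds, bound_fraction = max_gain_from_inline_stats(100.0, 90.0)
```

If the measured bound is small relative to the total ALTER time, the optimization is unlikely to have practical benefit.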