Description:
Hi Guys,
Got myself into a strange scenario -not sure if it's a bug or correct behavior - we have 4 nodes innodb cluster
db1 R/W Primary
db2 R Secondary
db3 R Secondary
db4 R Secondary
I needed to run an optimize query on a huge 1.2tb table (would save 300GB when completed) so i wanted to run it on each node without writing to binary - but i did a mistake in the query and forgot to add the LOCAL switch to skip logging.
so i ran a command on db4 by switching the read_only=0 while it's still a SECONDARY member.
only after db4 completed the optimize (about 8 hours) i noticed my mistake - at this point all other servers db1/db2/db3 started to run the query. the problem is that the Primary R/W instance failed to complete the query with this error
"2024-05-12T20:06:46.526751Z 14 [ERROR] [MY-013140] [Server] Error in diagnostics area: MY-001799 - Creating index 'my_index' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again."
while all other servers completed susscefully.
cant figure out how to retry the query and not if i cause some kind of split brain scenario.
so currently the status
db1 R/W Primary OPTIMIZE FAILED
db2 R Secondary OPTIMIZE COMPLETED
db3 R Secondary OPTIMIZE COMPLETED
db4 R Secondary OPTIMIZE COMPLETED
Replication currently works and shows no errors but i see the optimize query still being shown in the process list (although it failed)
My best guess would be that once i restart the db1 instance - master will failover to db2 - when db1 comes up again it will retry running the optimize query - can someone confirm this?
any help would be appreciated.
How to repeat:
explained above.
Suggested fix:
restart master will retry the query?