MySQL Bugs: #107654: Finalizer stuck finalizing

Bug #107654	Finalizer stuck finalizing
Submitted:	25 Jun 2022 9:40	Modified:	4 Jul 2022 9:04
Reporter:	Rob Landers	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Operator	Severity:	S4 (Feature request)
Version:	8.0.29-2.0.4	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
A deployment got stuck finalizing and never recovered. This happened during a drain-and-reboot of the entire cluster to install underlying OS security updates. The k8s nodes could not be drained and rebooted until the MySQL cluster was force deleted by removing its finalizer.

This resulted in downtime that a routine node drain should not cause.

It looks like draining a node causes the storage to get detached before the finalizer completes. This causes the finalizer to crash and never recover.
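
For anyone hitting the same state, the workaround described above (removing the finalizer so the stuck object can be deleted) can be sketched as a JSON merge patch applied with `kubectl patch`. This is a minimal sketch, not something taken from the operator: the helper name `build_clear_finalizers_cmd` and the example object `pod` / `mycluster-0` / `default` are assumptions for illustration, so substitute the kind, name, and namespace of whatever is actually stuck. Note that clearing finalizers skips whatever cleanup they were meant to perform.

```python
import json
import shlex


def build_clear_finalizers_cmd(kind, name, namespace):
    """Return the kubectl argv that clears metadata.finalizers on one object.

    Hypothetical helper: it only builds the command, it does not run it.
    In a JSON merge patch, setting a field to null removes that field.
    """
    patch = json.dumps({"metadata": {"finalizers": None}})
    return [
        "kubectl", "patch", kind, name,
        "-n", namespace,
        "--type=merge",
        "-p", patch,
    ]


if __name__ == "__main__":
    # Example object names are placeholders, not taken from the report.
    cmd = build_clear_finalizers_cmd("pod", "mycluster-0", "default")
    # Print a shell-safe command line for review before running it by hand.
    print(shlex.join(cmd))
```

Printing the command rather than executing it leaves the decision to force the deletion with the person running it.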

How to repeat:
1. Install the MySQL Operator and use Longhorn to manage storage
2. Deploy a simple MySQL cluster with the following values:

credentials:
  root:
    user: root
    password: password
    host: "%"
tls:
  useSelfSigned: true
serverInstances: 3
routerInstances: 2

3. Drain a node for reboot:

kubectl drain node1 --pod-selector='app!=csi-attacher,app!=csi-provisioner' --ignore-daemonsets --delete-emptydir-data

4. The node will never drain as the MySQL pod will never terminate.

Suggested fix:
Make the finalizer more resilient and handle the case where storage has vanished.
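
To illustrate the kind of resilience being requested, here is a minimal, self-contained sketch of a finalizer routine that tolerates vanished storage. It is not the operator's actual code: `StorageGone`, `run_finalizer`, and the callback parameters are all hypothetical names, and a real handler would have to decide which cleanup steps are safe to skip.

```python
import logging

logger = logging.getLogger("finalizer-sketch")


class StorageGone(Exception):
    """Hypothetical error raised when the pod's volume is already detached."""


def run_finalizer(cleanup_steps, remove_finalizer):
    """Run cleanup steps, skipping those that fail because storage vanished.

    The finalizer is released once every step has either succeeded or been
    skipped due to StorageGone. Any other exception propagates, so the
    finalizer stays in place and the caller can retry later.
    Returns the names of the skipped steps.
    """
    skipped = []
    for step in cleanup_steps:
        try:
            step()
        except StorageGone as exc:
            # Storage is gone, so there is nothing left to clean for this step.
            logger.warning("skipping %s: %s", step.__name__, exc)
            skipped.append(step.__name__)
    remove_finalizer()
    return skipped
```

A quick usage example with stand-in steps:

```python
def leave_group():
    pass  # stand-in for a step that does not need the volume


def flush_data_dir():
    raise StorageGone("volume detached during node drain")


released = []
print(run_finalizer([leave_group, flush_data_dir], lambda: released.append(True)))
```

The design choice here is to treat missing storage as "already cleaned up" rather than as a fatal error, which is what lets the pod terminate and the drain proceed.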

Hi,

Thank you for the report.

Any update on this bug?