Bug #107654 Finalizer stuck finalizing
Submitted: 25 Jun 2022 9:40
Modified: 4 Jul 2022 9:04
Reporter: Rob Landers
Status: Verified
Category: MySQL Operator
Severity: S4 (Feature request)
Version: 8.0.29-2.0.4
OS: Any
Assigned to:
CPU Architecture: Any

[25 Jun 2022 9:40] Rob Landers
Description:
I had a deployment get stuck finalizing, and it never recovered. This happened during a drain-and-reboot of the entire cluster to install underlying OS security updates. The k8s cluster could not be rebooted until the MySQL cluster was force-deleted by manually removing its finalizer.

This caused downtime that a routine node drain should not cause.

It looks like draining a node causes the storage to get detached before the finalizer completes. This causes the finalizer to crash and never recover.

How to repeat:
1. Install the MySQL Operator and use Longhorn to manage storage.
2. Deploy a simple MySQL cluster with the following values:

credentials:
  root:
    user: root
    password: password
    host: "%"
tls:
  useSelfSigned: true
serverInstances: 3
routerInstances: 2

3. Drain a node for reboot:

kubectl drain node1 --pod-selector='app!=csi-attacher,app!=csi-provisioner' --ignore-daemonsets --delete-emptydir-data

4. The node never finishes draining because the MySQL pod never terminates.
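
To confirm that a pod is blocked on a finalizer rather than just slow to stop, it can help to check its metadata: Kubernetes keeps an object around while it has a deletionTimestamp and a non-empty finalizers list. The following is a minimal sketch of that check, not part of the operator. The function name is_stuck_terminating and the 5-minute grace period are assumptions made for illustration; the input is a pod as a plain dict, such as one item from `kubectl get pods -o json`.

```python
from datetime import datetime, timedelta, timezone


def is_stuck_terminating(pod, now, grace=timedelta(minutes=5)):
    """Return True if the pod was marked for deletion more than `grace` ago
    and still carries finalizers.

    `pod` is a plain dict in the shape of the Kubernetes Pod JSON.
    The grace period is an arbitrary choice for this sketch.
    """
    meta = pod.get("metadata", {})
    deleted_at = meta.get("deletionTimestamp")
    finalizers = meta.get("finalizers") or []
    if not deleted_at or not finalizers:
        # Not being deleted, or nothing is holding the deletion back.
        return False
    # Kubernetes serializes timestamps like "2022-06-25T09:40:00Z".
    deleted = datetime.strptime(deleted_at, "%Y-%m-%dT%H:%M:%SZ")
    deleted = deleted.replace(tzinfo=timezone.utc)
    return now - deleted > grace
```

Running this over every pod on the node being drained shows which pods are held back by a finalizer and for how long.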

Suggested fix:
Make the finalizer more resilient and handle the case where storage has vanished.
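
One way to picture the suggested behavior is a cleanup step that always ends in a state where the finalizer can be released, even when the storage is gone. The sketch below is an assumption about how this could look, not the operator's actual code: StorageGone, finalize_member, and the remove_from_cluster callback are hypothetical names, and the retry count is arbitrary.

```python
class StorageGone(Exception):
    """Hypothetical error: the pod's volume is already detached."""


def finalize_member(remove_from_cluster, max_attempts=3):
    """Attempt graceful cleanup with a bounded number of tries.

    Returns an outcome string; in every case the caller may release
    the finalizer afterwards, so deletion cannot hang forever.
      "clean"   - the member was removed gracefully
      "skipped" - storage had vanished, cleanup is impossible
      "gave-up" - cleanup kept failing for other reasons
    """
    for _ in range(max_attempts):
        try:
            remove_from_cluster()
            return "clean"
        except StorageGone:
            # Retrying cannot bring the volume back; stop immediately.
            return "skipped"
        except Exception:
            # Possibly transient failure: try again up to the limit.
            continue
    return "gave-up"
```

The point of the design is that no outcome leaves the pod blocked: a caller would log anything other than "clean" and still remove the finalizer so the drain can continue.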
[4 Jul 2022 9:04] MySQL Verification Team
Hi,

Thank you for the report.
[10 Apr 10:37] Ram ji
Any update on this bug?