Bug #112348 Operator does not pick up changes to the pods
Submitted: 14 Sep 2023 15:18
Modified: 18 Sep 2023 17:22
Reporter: Scott Anderson
Status: Verified
Category: MySQL Operator
Severity: S2 (Serious)
Version: 8.1.0-2.1.0
OS: Any (AKS kubernetes v1.26.6)
CPU Architecture: Any (AKS kubernetes v1.26.6)
Assigned to:

[14 Sep 2023 15:18] Scott Anderson
Description:
After approximately one hour, the operator no longer seems to monitor the cluster for changes such as pod deletions or even replication issues.

We had an incident where we lost a node, and the operator made no attempt to recover and restore the consistency of the InnoDBCluster.

How to repeat:
Create a 3-instance InnoDBCluster, wait about an hour, and delete one of the pods. The pod is stopped but is not deleted, because it is waiting on the mysql-operator finalizer. The logs of the operator pods show no activity and do not appear to have registered the loss of the pod.

Restarting any of the operator pods allows the pod to be deleted, and a new pod starts up as expected with replication. The problem also occurs when deleting an InnoDBCluster: nothing happens until one of the operator pods is restarted.
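
To confirm that a deleted pod is held back by a finalizer rather than by something else, the pod's metadata can be inspected. The following is a minimal sketch, not part of the original report: it assumes `kubectl` is installed and configured for the cluster, and the function names `blocking_finalizers` and `fetch_pod` are hypothetical helpers introduced only for illustration. It relies on the standard Kubernetes behaviour that a pod with `metadata.deletionTimestamp` set is kept until `metadata.finalizers` is empty.

```python
import json
import subprocess
from typing import Any, Dict, List


def blocking_finalizers(pod: Dict[str, Any]) -> List[str]:
    """Return the finalizers holding up deletion, or [] if no deletion is pending."""
    meta = pod.get("metadata", {})
    # deletionTimestamp is set by the API server once deletion has been requested.
    if not meta.get("deletionTimestamp"):
        return []
    # The pod object is only removed after this list has been emptied.
    return list(meta.get("finalizers") or [])


def fetch_pod(name: str, namespace: str) -> Dict[str, Any]:
    """Read a pod as JSON via kubectl (assumes kubectl and cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `blocking_finalizers(fetch_pod(name, namespace))` on the stuck pod, with your own pod name and namespace, should list the finalizer that has not yet been removed. A non-empty result that persists while the operator logs stay silent would match the behaviour described above.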

I must admit it seems that the operator is half-finished and not production-ready.

Suggested fix:
Make sure the operator is active and polls the condition of the InnoDBCluster at all times.

It seems to me that the operator is going to sleep and not responding to any changes to the InnoDBCluster.
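
The polling idea suggested above could look roughly like the sketch below. This is only an illustration of a periodic resync that runs independently of event notifications; it is not the operator's actual code, and `resync_loop`, `reconcile`, `interval_s`, and `max_rounds` are all hypothetical names. The `reconcile` callback stands in for whatever check compares the desired and observed state of the InnoDBCluster.

```python
import time
from typing import Callable, Optional


def resync_loop(
    reconcile: Callable[[], None],
    interval_s: float,
    max_rounds: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call reconcile() every interval_s seconds and return the number of rounds run.

    With max_rounds=None the loop runs until the process is stopped.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        try:
            reconcile()
        except Exception as exc:
            # A failed round is reported but must not stop later rounds.
            print(f"reconcile failed: {exc}")
        rounds += 1
        sleep(interval_s)
    return rounds
```

The point of such a loop is that a missed or dropped event only delays a reaction by at most one interval, instead of leaving the cluster unattended until the operator is restarted.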
[18 Sep 2023 17:22] MySQL Verification Team
Hi,

Thank you for the report