Bug #119634 MySQL Operator backup can fail on Ready pods that haven't fully joined cluster yet
Submitted: 6 Jan 21:04
Reporter: Thomas Rock
Status: Open
Category: MySQL Operator
Severity: S3 (Non-critical)
Version: 9.4.0-2.2.5
OS: Any
Assigned to:
CPU Architecture: Any

[6 Jan 21:04] Thomas Rock
Description:
I encountered this error when running a backup while one pod in the InnoDB Cluster was recovering from a restart:
```
Traceback (most recent call last):
  File "/usr/lib/mysqlsh/python-packages/mysqloperator/backup_main.py", line 411, in command_do_create_backup
    info = do_backup(backup, job_name, start, backup_dir, logger)
  File "/usr/lib/mysqlsh/python-packages/mysqloperator/backup_main.py", line 321, in do_backup
    backup_source = pick_source_instance(cluster, backup, profile, logger)
  File "/usr/lib/mysqlsh/python-packages/mysqloperator/backup_main.py", line 265, in pick_source_instance
    tmp = dba.get_cluster().status({"extended": 1})["defaultReplicaSet"]
          ~~~~~~~~~~~~~~~^^
RuntimeError: This function is not available through a session to an instance belonging to an unmanaged replication group

[ERROR] [backup] Backup failed with an exception: This function is not available through a session to an instance belonging to an unmanaged replication group

[INFO] [backup] Command execute-backup finished with code False
```

I believe this happens because the backup script selected the pod that had not yet fully joined the cluster. Once that pod had joined the cluster, backups ran successfully.

How to repeat:
Set up a 3-pod InnoDB Cluster. Restart one of the pods so that it comes back as ready but is not yet passing all readiness gates, then kick off a backup job.
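
To confirm that a pod is in the state described above before starting the backup job, its manifest can be inspected. The following is a minimal sketch, assuming the pod has been dumped to a dict (for example from `kubectl get pod <name> -o json`). It relies only on the standard Kubernetes fields `spec.readinessGates[].conditionType` and `status.conditions[]`. The helper name and the gate name `example.com/joined-cluster` are hypothetical placeholders, not the operator's actual identifiers.

```python
def failing_readiness_gates(pod: dict) -> list[str]:
    """Return the readiness-gate condition types that are not currently "True".

    `pod` is a Pod manifest as a plain dict. A gate with no matching
    entry in status.conditions is treated as failing.
    """
    gates = [
        gate["conditionType"]
        for gate in pod.get("spec", {}).get("readinessGates", [])
    ]
    conditions = {
        cond["type"]: cond["status"]
        for cond in pod.get("status", {}).get("conditions", [])
    }
    return [gate for gate in gates if conditions.get(gate) != "True"]


# Hypothetical pod: containers are up, but the (placeholder) gate is still False.
recovering_pod = {
    "spec": {"readinessGates": [{"conditionType": "example.com/joined-cluster"}]},
    "status": {
        "conditions": [
            {"type": "ContainersReady", "status": "True"},
            {"type": "example.com/joined-cluster", "status": "False"},
        ]
    },
}

pending = failing_readiness_gates(recovering_pod)
```

A non-empty result means the pod still has at least one readiness gate that is not passing, which is the window in which the failure was observed.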

Suggested fix:
https://github.com/mysql/mysql-operator/blob/3b14f90575d2df410548176e1fc6acf7c7f2ba17/mysq...

The code that fetches the cluster status from the instance should either handle the error "RuntimeError: This function is not available through a session to an instance belonging to an unmanaged replication group", or check that the pod is Ready and passing all readiness checks (https://github.com/mysql/mysql-operator/blob/3b14f90575d2df410548176e1fc6acf7c7f2ba17/mysq...) before attempting to connect to it.
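
The first option could look roughly like the sketch below. It is not the operator's code: `first_usable_status`, the `get_status` callable, and the pod names are hypothetical stand-ins so that the example runs outside of `mysqlsh`. In the operator, `get_status` would presumably wrap the `dba.get_cluster().status({"extended": 1})` call from the traceback. The sketch assumes the failure keeps surfacing as a `RuntimeError` whose message contains "unmanaged replication group", as in the log above.

```python
from typing import Any, Callable, Iterable, Optional, Tuple

# Substring taken from the error message in the traceback.
UNMANAGED_GROUP_MARKER = "unmanaged replication group"


def first_usable_status(
    candidates: Iterable[str],
    get_status: Callable[[str], Any],
    log: Callable[[str], None] = print,
) -> Optional[Tuple[str, Any]]:
    """Return (candidate, status) for the first candidate that can report
    cluster status, skipping instances that have not joined the cluster yet.

    Any other RuntimeError is re-raised unchanged. Returns None if every
    candidate was skipped.
    """
    for candidate in candidates:
        try:
            return candidate, get_status(candidate)
        except RuntimeError as exc:
            if UNMANAGED_GROUP_MARKER not in str(exc):
                raise
            log(f"Skipping {candidate}: {exc}")
    return None


# Stand-in for the real status call: pod "mycluster-1" is still rejoining.
def fake_get_status(pod: str) -> dict:
    if pod == "mycluster-1":
        raise RuntimeError(
            "This function is not available through a session to an "
            "instance belonging to an unmanaged replication group"
        )
    return {"defaultReplicaSet": {"status": "OK"}}


picked = first_usable_status(["mycluster-1", "mycluster-0"], fake_get_status)
```

With this shape, a backup started while one pod is still rejoining falls through to the next candidate instead of aborting. Matching on the message text is brittle, so checking pod readiness before connecting may be the more robust of the two options.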