| Bug #92547 | Contribution: Make MYSQL_INNODB_NUM_MEMBERS work with offline members | | |
| --- | --- | --- | --- |
| Submitted: | 24 Sep 2018 17:05 | Modified: | 2 Dec 2019 9:19 |
| Reporter: | OCA Admin (OCA) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Package Repos | Severity: | S3 (Non-critical) |
| Version: | | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
| Tags: | Contribution | | |
[24 Sep 2018 17:05]
OCA Admin
Contribution submitted via GitHub - Make MYSQL_INNODB_NUM_MEMBERS work with offline members

(*) Contribution by Gianluca Borello (GitHub gianlucaborello, mysql-docker/pull/8#issuecomment-423215012):

Thank you for your help. Yes, my name is already listed in the OCA list, and I have successfully contributed to other Oracle projects on GitHub using this username. As for the other request, here is my explicit agreement to the OCA: I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it. Let me know if I should also send it via email. Thanks.
Contribution: git_patch_216382185.txt (text/plain), 1.74 KiB.
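For context, MYSQL_INNODB_NUM_MEMBERS gates Router bootstrap in the image's startup script. The following is only a sketch of what such a gate typically looks like; the loop, the MYSQL_HOST and MYSQL_ROOT_PASSWORD variable names, and the exact invocation are illustrative and are not the literal contents of the entrypoint or of the attached patch. The membership query mirrors the one shown in the next comment:

```
# Sketch only: general shape of a startup gate driven by MYSQL_INNODB_NUM_MEMBERS.
# MYSQL_HOST and MYSQL_ROOT_PASSWORD are illustrative names, not the image's own.
QUERY="SELECT COUNT(*) = ${MYSQL_INNODB_NUM_MEMBERS} FROM instances
       WHERE replicaset_id = (SELECT replicaset_id FROM instances
                              WHERE mysql_server_uuid = @@server_uuid)"
until mysql -h "$MYSQL_HOST" -u root -p"$MYSQL_ROOT_PASSWORD" \
        mysql_innodb_cluster_metadata -N -e "$QUERY" | grep -q '^1$'
do
    echo "Waiting for ${MYSQL_INNODB_NUM_MEMBERS} cluster members to register..."
    sleep 5
done
```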
[25 Sep 2018 3:14]
Patrick Galbraith
Hi there! I've had the same problem, within the context of Kubernetes, using the MySQL Operator and the wordpress-router demo in the operator source.

1. Create the database cluster, default with 3 nodes: `kubectl create -f wordpress-database.yaml`
2. Create the wordpress deployment (includes mysql-router + wordpress app containers): `kubectl create -f wordpress-deployment.yaml`
3. Scale the cluster from 3 to 7: `kubectl edit cluster.mysql.oracle.com mysql-wordpress` and change `members: 3` to `members: 7`
4. Kill the wordpress-router pod: `kubectl delete po wordpress-router-xxxxxxxx-nnnn`
5. Observe that it won't restart:

```
wordpress-router-695dbcd6d-5bmsx   0/2   ImagePullBackOff   26   2h
[opc@bastion-ad1 wordpress-router]$ kubectl describe po wordpress-router-695dbcd6d-5bmsx
<snip>
Events:
  Type     Reason   Age                 From                    Message
  ----     ------   ----                ----                    -------
  Warning  Failed   15m (x508 over 2h)  kubelet, 129.213.45.45  Error: ImagePullBackOff
  Warning  BackOff  5m (x482 over 2h)   kubelet, 129.213.45.45  Back-off restarting failed container
  Normal   BackOff  52s (x578 over 2h)  kubelet, 129.213.45.45  Back-off pulling image "capttofu/mysql-router"
```

This needs to evaluate to true:

```
[opc@bastion-ad1 wordpress-router]$ kubectl exec -it mysql-wordpress-0 -c mysql -- mysql -u root -pmy-super-secret-pass mysql_innodb_cluster_metadata -e 'select count(*) = 3 FROM instances WHERE replicaset_id = (SELECT replicaset_id FROM instances WHERE mysql_server_uuid = @@server_uuid);'
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------+
| count(*) = 3 |
+--------------+
|            0 |
+--------------+
```

This patch looks promising, though there will be something else to look at for determining whether the cluster is in the state it should be. I'm giving this a lot of thought.
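As a reference point for that "something else", one option (a sketch only, not the submitted patch and not what the operator does today) is to ask Group Replication itself how many members are currently ONLINE, rather than counting every instance ever registered in the metadata:

```
-- Sketch only: count members Group Replication currently reports as ONLINE,
-- instead of all rows in mysql_innodb_cluster_metadata.instances.
SELECT COUNT(*) AS online_members
FROM performance_schema.replication_group_members
WHERE MEMBER_STATE = 'ONLINE';
```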
[25 Sep 2018 3:15]
Patrick Galbraith
Sorry for the accidental double-submit!
[25 Sep 2018 3:41]
Gianluca Borello
Thanks for your comment, but keep in mind that the problem you're trying to solve seems different to me, in a subtle but fundamental way: you want to make sure that Router is resilient to changing the number of members in the cluster at runtime, whereas I just need Router to be resilient to members of the cluster temporarily going offline. In my case the number of cluster members never changes.

Your condition introduces further complications, because Router needs to have all the members listed in the configuration file in order to properly adapt to traffic if one of the bootstrap nodes goes down; otherwise you get a complete failure even without Router restarting.

I've done a deeper analysis of the issue here: https://github.com/mysql/mysql-docker/pull/8#issue-216382185

In particular, the relevant section is the one containing:

"""
However, it seems that all the cluster state is always discovered only via the bootstrap nodes. This means that if the original node discovered during the bootstrap goes down, the entire thing goes down, Router stops serving all the requests, even if we have a valid cluster formed by mysql2 and mysql3:
"""

It might make sense to separate the two things: the limitation you are facing seems more of an intrinsic one inside Router, while mine seems more specific to the Docker image scripts. Let me know what you think and whether I missed something. Thanks
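To make that dependency concrete, here is an illustrative `mysqlrouter.conf` fragment (a sketch only; option names follow the `[metadata_cache]` section of Router configuration of that era, and the host names, cluster name, and user are placeholders). The metadata cache only learns cluster state through the servers listed in `bootstrap_server_addresses`, so listing a single node leaves state discovery with a single point of failure:

```
# Illustrative fragment, not taken from the image or the patch.
[metadata_cache:mycluster]
router_id=1
# Router discovers cluster state only through these servers; if only mysql1
# were listed and it went down, Router would lose its view of the cluster.
bootstrap_server_addresses=mysql://mysql1:3306,mysql://mysql2:3306,mysql://mysql3:3306
user=mysql_router1_example
metadata_cluster=mycluster
ttl=5
```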
[25 Sep 2018 4:34]
MySQL Verification Team
Hello Gianluca,

Thank you for the report and contribution.

Regards,
Umesh
[2 Dec 2019 9:19]
Terje Røsten
The contribution was pushed some time ago as part of: https://github.com/mysql/mysql-docker/commit/47e733be8dd1944e6cb70ba5f2c8960604b9d838