| Bug #110821 | MySQL Operator: Expose Router Service as LoadBalancer doesn't work | | |
|---|---|---|---|
| Submitted: | 26 Apr 2023 15:39 | Modified: | 18 Jun 2023 1:32 |
| Reporter: | Christopher Feldhues | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Operator | Severity: | S3 (Non-critical) |
| Version: | 8.0.32 | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
[17 May 2023 23:12]
MySQL Verification Team
Hi, thank you for your interest in MySQL. I am not 100% sure this is a bug, but I am having the same issue, so I'll see whether we can either fix the problem or improve the documentation to better explain how to solve this.
[17 May 2023 23:41]
Johannes Schlüter
Posted by developer:
The error indicates that something which doesn't speak the MySQL protocol is connecting to the port. As a consequence, MySQL blocks the connection.
Could it be that your load balancer has some form of keep-alive check enabled?
For reference, this works for me:
apiVersion: v1
kind: Service
metadata:
  name: mynodeport
  namespace: demo
spec:
  ports:
  - name: mysql
    nodePort: 30134
    port: 3306
    protocol: TCP
    targetPort: 6446
  - name: mysql-ro
    nodePort: 32493
    port: 6447
    protocol: TCP
    targetPort: 6447
  selector:
    component: mysqlrouter
    mysql.oracle.com/cluster: mywithmetric
    tier: mysql
  type: LoadBalancer
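If it helps to confirm the keep-alive hypothesis: any client that opens a TCP connection to the router port and closes it without sending a MySQL handshake (which is what a typical load-balancer TCP health check does) should produce exactly the "closed connection before finishing handshake" log line and increment the per-host error counter. A minimal sketch to try this, with host and port as placeholders for your router service:

import socket
import time

ROUTER_HOST = "127.0.0.1"   # placeholder: address of the router service/pod
ROUTER_PORT = 6446          # RW port used in the examples in this report

# Open a plain TCP connection and close it without speaking the MySQL
# protocol -- roughly what an LB keep-alive / TCP health check does.
for _ in range(5):
    conn = socket.create_connection((ROUTER_HOST, ROUTER_PORT), timeout=2)
    conn.close()
    time.sleep(1)

If the router log shows the same messages while this runs, the load balancer's health checks are the most likely source of the error counter increments.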
[19 Jun 2023 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[1 Jul 2023 12:18]
Rui Mao
This looks like the old known issue Bug #90809, and I am hitting it too. It seems the resolution there was only to document the behaviour, with no code changes, but I cannot agree with that. There are at least two situations that can cause critical problems:
1. The application may not connect through the LB for a long time, so the max error count is reached and the application gets into trouble.
2. Some types of LB use two private IPs as the source of connections, and both actively run health checks. In general one IP is primary and the other is backup, and all traffic passes through the primary IP. The backup IP is never used unless the LB fails over to it, so the backup IP will eventually be blocked by MySQL Router no matter what max_connect_errors is set to. As you can imagine, if the LB has a problem and fails over to the backup IP, the application can no longer reach the database and a disaster happens.
My suggestion would be a whitelist of LB IPs: actions such as health checks from these IPs would be ignored and would not count towards the connection error counter.
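To make the whitelist suggestion concrete, the intended behaviour would be roughly the following (illustrative Python only, not router code; the addresses and the threshold are placeholders): connections from configured health-check source IPs never count towards the per-host error counter, while every other host keeps the existing max_connect_errors behaviour.

import ipaddress
from collections import defaultdict

# Placeholders: the LB's primary and backup health-check source IPs.
HEALTH_CHECK_WHITELIST = {ipaddress.ip_address("100.96.4.1"),
                          ipaddress.ip_address("100.96.5.1")}
MAX_CONNECT_ERRORS = 100           # placeholder threshold

error_count = defaultdict(int)
blocked_hosts = set()

def on_pre_handshake_disconnect(source_ip: str) -> None:
    # A connection was closed before finishing the handshake.
    ip = ipaddress.ip_address(source_ip)
    if ip in HEALTH_CHECK_WHITELIST:
        return                     # health-check traffic is ignored entirely
    error_count[ip] += 1
    if error_count[ip] >= MAX_CONNECT_ERRORS:
        blocked_hosts.add(ip)      # existing blocking behaviour is unchanged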

Description:
Hey! I want to deploy a normal InnoDBCluster to K8s. Most of our applications don't run on Kubernetes, so I want to make the Service reachable from outside the cluster. My idea was to create a Service of type "LoadBalancer", because the team in our company that provides the k8s cluster supports external LoadBalancers. When I do this, I get these errors in my MySQL Router:

2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.4.1:60381 (now 40)
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] 100.96.3.1:44751 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.3.1:44751 (now 42)
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] 100.96.2.1:9061 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.2.1:9061 (now 41)
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] 100.96.5.1:26197 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.5.1:26197 (now 40)
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] 100.96.1.1:36121 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] 100.96.4.1:24957 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.4.1:24957 (now 41)
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.1.1:36121 (now 41)
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] 100.96.0.1:34387 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.0.1:34387 (now 42)
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] 100.96.3.1:3065 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.3.1:3065 (now 43)
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] 100.96.2.1:43190 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.2.1:43190 (now 42)
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] 100.96.5.1:45158 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d115bc700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.5.1:45158 (now 41)
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] 100.96.1.1:4315 closed connection before finishing handshake
2023-04-26 15:33:28 routing INFO [7f4d11dbd700] [routing:bootstrap_rw] incrementing error counter for host of 100.96.1.1:4315 (now 42)

Simultaneously the primary server prints:

2023-04-26T15:33:28.901869Z 14043 [Note] [MY-010914] [Server] Bad handshake
2023-04-26T15:33:28.902103Z 14045 [Note] [MY-010914] [Server] Bad handshake
2023-04-26T15:33:30.107302Z 14049 [Note] [MY-010914] [Server] Bad handshake
2023-04-26T15:33:30.122941Z 14050 [Note] [MY-010914] [Server] Bad handshake
2023-04-26T15:33:31.956024Z 14056 [Note] [MY-010914] [Server] Bad handshake
2023-04-26T15:33:32.443513Z 14057 [Note] [MY-010914] [Server] Bad handshake
2023-04-26T15:33:34.971282Z 14063 [Note] [MY-010914] [Server] Bad handshake
2023-04-26T15:33:35.203371Z 14065 [Note] [MY-010914] [Server] Bad handshake

When I change the service back to ClusterIP, the errors disappear.

How to repeat:

apiVersion: mysql.oracle.com/v2
kind: InnoDBCluster
metadata:
  name: mysql-test-db-cluster
  namespace: mysql-test-db
spec:
  secretName: mysql-test-db-secret
  tlsUseSelfSigned: true
  instances: 2
  version: 8.0.32
  router:
    instances: 1
    version: 8.0.32
  datadirVolumeClaimTemplate:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
  backupProfiles:
  - name: myfancyprofile                # Embedded backup profile
    dumpInstance:                       # MySQL Shell Dump
      storage:
        persistentVolumeClaim:
          claimName: myexample-pvc      # store to this pre-existing PVC
  backupSchedules:
  - name: mygreatschedule
    schedule: "0 0 * * *"               # Daily, at midnight
    backupProfileName: myfancyprofile   # reference the desired backupProfile's name
    enabled: true                       # backup schedules can be temporarily disabled
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myexample-pvc
spec:
  storageClassName: default
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-test-db-cluster-outgoing
  namespace: mysql-test-db
  annotations:
    external-dns.alpha.kubernetes.io/hostname: mysql-test-db.dbs-systemtest-t.k8s.lvm.de
spec:
  type: LoadBalancer
  ports:
  - name: mysql
    port: 3306
    protocol: TCP
    targetPort: 6446
  selector:
    component: mysqlrouter
    mysql.oracle.com/cluster: mysql-test-db-cluster
    tier: mysql
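One more data point that may help when reproducing this: a client that completes a real MySQL handshake through the LoadBalancer works normally until its source host is blocked, so comparing a real login with the health-check noise narrows the problem down to the checks. A minimal sketch, assuming mysql-connector-python is installed; the hostname and credentials below are placeholders:

import mysql.connector

# Placeholders: external-dns hostname of the LoadBalancer and valid credentials.
conn = mysql.connector.connect(
    host="mysql-test-db.dbs-systemtest-t.k8s.lvm.de",
    port=3306,
    user="root",
    password="<password>",
)
cur = conn.cursor()
cur.execute("SELECT @@hostname, @@port")   # shows which member the router picked
print(cur.fetchone())
cur.close()
conn.close()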