Bug #108196 mysql-operator is unable to start pods by PodSecurityPolicy
Submitted: 19 Aug 2022 9:43
Modified: 14 Dec 2022 22:05
Reporter: Günter Prossliner
Status: Closed
Category: MySQL Operator
Severity: S3 (Non-critical)
Version: mysql-operator:2.0.5
OS: Any
Assigned to:
CPU Architecture: Any

[19 Aug 2022 9:43] Günter Prossliner
Description:
I'm evaluating the mysql-operator for setting up `InnoDBCluster` objects on a shared Kubernetes cluster.

We have some PodSecurityPolicies in place, so not all capabilities are available for user workloads.

If I create an `InnoDBCluster` object as documented at https://dev.mysql.com/doc/mysql-operator/en/mysql-operator-innodbcluster-simple-kubectl.ht..., the StatefulSet is created, but it fails to create any pods and reports the following event:

'create Pod mycluster-0 in StatefulSet mycluster failed error: pods "mycluster-0" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.initContainers[2].securityContext.capabilities.add: Invalid value: "DAC_OVERRIDE": capability may not be added spec.initContainers[2].securityContext.capabilities.add: Invalid value: "SETGID": capability may not be added'

The generated manifest adds the following capabilities:

        securityContext:
          capabilities:
            add:
            - DAC_OVERRIDE
            - SETGID
            - SETUID
            - SYS_NICE
            - SYS_RESOURCE

If the `InnoDBCluster` object is created in a system namespace, where no PodSecurityPolicy is configured, it seems to work fine.

Granting SETGID/SETUID/DAC_OVERRIDE is like granting root, which our security policy does not permit.

We already run multiple MySQL and MariaDB pods (without using the mysql-operator), and none of them needs such powerful capabilities.

Is there a way to instruct the operator to be compliant with our security policy, which doesn't allow any of these capabilities?

How to repeat:
Using a cluster-admin account, install the operator as documented here: https://dev.mysql.com/doc/mysql-operator/en/mysql-operator-installation-helm.html

Using a cluster-admin account, apply the following PodSecurityPolicy to the default (=user) namespace:

spec:
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  fsGroup:
    rule: RunAsAny
  runAsGroup:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - emptyDir
  - secret
  - persistentVolumeClaim
  - downwardAPI
  - configMap
  - projected
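
The policy above sets no allowedCapabilities, which is why the pods are refused. As a rough illustration only, the check can be modelled as below. This is a simplified sketch, not the real admission controller code: the function name `rejected_capabilities` is made up for this example, and the real check also takes fields such as defaultAddCapabilities into account.

```python
# Simplified model of the PodSecurityPolicy capability check.
# NOT the real admission controller code; it only mirrors the rule that
# every capability a container adds must be allowed by the policy.

def rejected_capabilities(requested_add, allowed_capabilities=()):
    """Return the requested capabilities that the policy would not admit."""
    allowed = set(allowed_capabilities)
    if "*" in allowed:  # "*" in allowedCapabilities admits any capability
        return []
    return [cap for cap in requested_add if cap not in allowed]


# Capabilities added by the generated StatefulSet (taken from the manifest
# quoted in the description).
operator_adds = ["DAC_OVERRIDE", "SETGID", "SETUID", "SYS_NICE", "SYS_RESOURCE"]

# The policy in this report sets no allowedCapabilities, so nothing may be added.
print(rejected_capabilities(operator_adds))

# A hypothetical policy that allowed only SYS_NICE would still reject the rest.
print(rejected_capabilities(operator_adds, allowed_capabilities=["SYS_NICE"]))
```

In this model, the only ways to get the pod admitted are to widen the policy or to stop the operator from adding the capabilities.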

Create an InnoDBCluster object in the default namespace as documented here: https://dev.mysql.com/doc/mysql-operator/en/mysql-operator-innodbcluster-simple-kubectl.ht...
[22 Aug 2022 16:17] MySQL Verification Team
Hi,

I do not personally think this is a bug but I will double check with our k8s expert. In the meantime, contacting MySQL Support would be the best way to solve the problem you are having.

Thanks for the report
[31 Aug 2022 12:35] Günter Prossliner
Thank you for your comment! I consider this a (potential) bug for two reasons:

1. When using the official Docker MySQL image (https://hub.docker.com/_/mysql), you don't have to grant those capabilities; it runs without any additional capabilities set.

2. Even when CAP_SETUID / CAP_SETGID are not granted at the host level, the container process can still use setuid(); the effect is isolated to the container.
  
  # test setuid in container
  docker run python python -c "import os; print(os.getuid());os.setuid(1);print(os.getuid())"
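
  To see which of the capabilities in question a container process actually holds, the effective mask in /proc/self/status can be decoded. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names are made up for this example, and only the five capabilities from this report are decoded (bit numbers as defined in linux/capability.h).

```python
# Bit positions from <linux/capability.h> for the capabilities in this report.
CAP_BITS = {
    "DAC_OVERRIDE": 1,
    "SETGID": 6,
    "SETUID": 7,
    "SYS_NICE": 23,
    "SYS_RESOURCE": 24,
}


def parse_cap_eff(status_text):
    """Extract the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)  # value is a hex string
    raise ValueError("no CapEff line found")


def held_capabilities(mask):
    """Names of the capabilities of interest whose bit is set in the mask."""
    return [name for name, bit in CAP_BITS.items() if mask & (1 << bit)]


def report(path="/proc/self/status"):
    """Read the current process status (Linux only) and decode it."""
    with open(path) as f:
        return held_capabilities(parse_cap_eff(f.read()))
```

  Calling `report()` inside the container before and after changing the granted capabilities shows whether setuid() is succeeding because the capability is present or for some other reason.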

There should be an option to customize / disable the capabilities requested by the operator.
[31 Aug 2022 13:00] Günter Prossliner
This is the commit in which those changes were introduced:

22-03-09 by ahorcom 

https://github.com/mysql/mysql-operator/commit/88c40d4032ecedbf197810203039bc4e68f3c6e4
[14 Dec 2022 22:05] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Operator 8.0.32-2.0.8 release, and here's the proposed changelog entry from the documentation team:

Altered security context capabilities by changing the following
privileges from 'add' to 'drop': DAC_OVERRIDE, SETGID, SETUID, SYS_NICE,
and SYS_RESOURCE.
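
Read literally, the changelog entry corresponds to a transformation like the one sketched below. This is an illustration of the described change only, not the operator's code: the function name is hypothetical, and the manifest the operator actually generates may differ.

```python
import copy

# The five capabilities named in the changelog entry.
AFFECTED = ["DAC_OVERRIDE", "SETGID", "SETUID", "SYS_NICE", "SYS_RESOURCE"]


def move_add_to_drop(security_context):
    """Return a copy with the affected capabilities moved from 'add' to 'drop'."""
    ctx = copy.deepcopy(security_context)  # leave the caller's dict untouched
    caps = ctx.setdefault("capabilities", {})
    existing_drop = caps.get("drop", [])
    caps["add"] = [c for c in caps.get("add", []) if c not in AFFECTED]
    caps["drop"] = existing_drop + [c for c in AFFECTED if c not in existing_drop]
    if not caps["add"]:
        del caps["add"]  # omit an empty 'add' list
    return ctx


# securityContext as quoted in the original report.
before = {"capabilities": {"add": list(AFFECTED)}}
after = move_add_to_drop(before)
print(after)
```

A securityContext that only drops capabilities no longer conflicts with a policy that forbids adding them.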

Thank you for the bug report.