The apiserver image (v1.21.0-rc.0) needs CAP_NET_BIND_SERVICE #100914

mcorbin · 2021-04-08T08:37:07Z

What happened:

I tried to run the apiserver image for v1.21.0-rc.0 (k8s.gcr.io/kube-apiserver:v1.21.0-rc.0) on top of another Kubernetes cluster. I took a working manifest which was using v.1.20.5 and was working without issues, and changed the image version.

The apiserver failed to start with standard_init_linux.go:219: exec user process caused: operation not permitted

What you expected to happen:

I was expecting to have a working apiserver.

How to reproduce it (as minimally and precisely as possible):

Trying to run the apiserver on top of kubernetes with a restricted pod security policy and without adding capabilities to my pod.

Anything else we need to know?:

I managed to make it work by adding a CAP_NET_BIND_SERVICE capability to my apiserver pod.

I found nothing related to this change in the changelog. I found this commit (#99145) which add RUN setcap cap_net_bind_service=+ep /${BINARY} to the apiserver image. and I think it can be the source of the issue.
I'm surprised by this change, which seems to be done for a specific use case but will, if I'm not wrong, impact everyone using the image.

Environment:

Kubernetes version (use kubectl version): v1.21.0-rc.0
Cloud provider or hardware configuration: Exoscale
OS (e.g: cat /etc/os-release): 20.04.1 LTS (Focal Fossa)
Kernel (e.g. uname -a): 5.4.0-62-generic
Install tools: A deployment on top of another Kubernetes cluster.

The text was updated successfully, but these errors were encountered:

k8s-ci-robot · 2021-04-08T08:37:14Z

@mcorbin: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mcorbin · 2021-04-08T08:50:29Z

/sig release

invidian · 2021-04-09T12:11:18Z

I can confirm this. Adding the following to kube-apiserver configuration makes it work again:

diff --git templates/deployment.yaml templates/deployment.yaml
index b1d4018..1663087 100644
--- templates/deployment.yaml
+++ templates/deployment.yaml
@@ -129,6 +129,10 @@ spec:
         - name: secrets
           mountPath: /etc/kubernetes/pki
           readOnly: true
+        securityContext:
+          capabilities:
+            add:
+               - NET_BIND_SERVICE
         resources:
           requests:
             cpu: 300m
diff --git templates/podsecuritypolicy.yaml templates/podsecuritypolicy.yaml
index 2913957..9135f3a 100644
--- templates/podsecuritypolicy.yaml
+++ templates/podsecuritypolicy.yaml
@@ -8,6 +8,8 @@ metadata:
 spec:
   privileged: false
   allowPrivilegeEscalation: false
+  allowedCapabilities:
+    - NET_BIND_SERVICE
   requiredDropCapabilities:
     - ALL
   volumes:

But it seems this is an intentional change to run kube-apiserver as non-root user. I guess this should definitely be mentioned in the changelog at least...

dims · 2021-04-09T12:23:29Z

@invidian @mcorbin Looks like the changes in #99145 (and adjustments we made in our harness #96134) did not make it into release notes

cc @vinayakankugoyal @liggitt

mcorbin · 2021-04-09T12:25:37Z

But it seems this is an intentional change to run kube-apiserver as non-root user. I guess this should definitely be mentioned in the changelog at least...

It was already possible to run the apiserver as non root on a non-privileged port. I'm not sure to understand why it was added to the default image.
I understand that some users may want to listen on privileged ports, but I don't see why the capability should be added by default.

vinayakankugoyal · 2021-04-09T14:53:02Z

That's odd! Unless you are binding to a privileged port you wouldn't need to add NET_BIND_SERVICE. From the deployment yaml above it is not clear what port you are binding to.

invidian · 2021-04-09T14:55:09Z

That's odd! Unless you are binding to a privileged port you wouldn't need to add NET_BIND_SERVICE. From the deployment yaml above it is not clear what port you are binding to.

IIUC my PSP drops all capabilities for the container, maybe this is the issue. I use standard port 6443 and my container actually still runs as root.

mcorbin · 2021-04-09T15:06:06Z

IIUC my PSP drops all capabilities for the container, maybe this is the issue. I use standard port 6443 and my container actually still runs as root.

Same for me, I drop all capabilities by default.

Here is how I run the apiserver for example for 1.21, with this configuration it works (I'm just redoing my tests):

--secure-port=30395

        securityContext:
          capabilities:
            add:
            - CAP_NET_BIND_SERVICE
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1291

Now, I reapply my manifest without the capabilities section in securityContext. The pod is in error with standard_init_linux.go:219: exec user process caused: operation not permitted

As I said, I don't have any issue with 1.20.5. I've just tested the same manifest (without the capabilities) with 1.20.5, and the apiserver starts as expected.

dims · 2021-04-18T20:19:31Z

@mcorbin So missing documentation then? (for the new behavior)

mcorbin · 2021-04-19T07:27:54Z

@mcorbin So missing documentation then? (for the new behavior)

Yes, but I would really like to know if it is the expected behavior. Maybe the old behavior could be restored ?

justaugustus · 2021-05-11T19:46:57Z

/help
/area release-eng
/priority critical-urgent
cc: @kubernetes/release-engineering

k8s-ci-robot · 2021-05-11T19:46:59Z

@justaugustus:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help
/area release-eng
/priority critical-urgent
cc: @kubernetes/release-engineering

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

liggitt · 2021-05-11T19:51:25Z

I don't think a single image can 1) run as non root, 2) allow binding to 443, 3) not require CAP_NET_BIND_SERVICE

liggitt · 2021-05-11T19:53:22Z

Running as non-root was added in 1.21 to begin to reduce API server permissions

I'm not sure how to simultaneously satisfy people running the image on port 443 and people dropping all capabilities

mcorbin · 2021-05-11T20:28:53Z

Running as non-root was added in 1.21 to begin to reduce API server permissions

I understand that, but it was already possible by not running the API server on port 443. It seems weird to me to have the capability as default.
For me, if someone wants to add capabilities, the image should be rebuild/extended on their side. Or else, we would have to add CAP_NET_BIND_SERVICE to all images for all components/tools, because you can always find someone in the world who will want to run things on a privileged port.

pacoxu · 2021-06-08T14:47:59Z

I don't think a single image can 1) run as non root, 2) allow binding to 443, 3) not require CAP_NET_BIND_SERVICE

Can it be solved by #102612 with sysctl net/ipv4/ip_unprivileged_port_start supported?

BenTheElder · 2021-06-23T09:11:28Z

For me, if someone wants to add capabilities, the image should be rebuild/extended on their side

The Kubernetes project is working on running this image as non-root in all our tools. There is a KEP related to this.

For the core images we do want to enable running as non-root.

EDIT: see: #7961, kubernetes/enhancements#2568

mcorbin · 2021-06-23T09:18:58Z

For the core images we do want to enable running as non-root.

But it was already possible on non-privileged ports.
Why should the image have by default the capability ? Why not:

Run things on non privileged ports
Rebuild/extend the image if you really need to run on privileged ports

For me base images should have the as few permissions as possible, then it's up to people to add what they need for their specific use cases.

BenTheElder · 2021-06-23T10:13:07Z

In this case the use case is kube-up (used for kubernetes CI) and kubeadm (officially supported first-party installation tool), not "people". Details of going rootless for the control plane are covered in the KEP link.

I don't think this is an unreasonable thing to have on the api-server binary, but cc @kubernetes/sig-security

You are equally able to build an image without this capability for your own deployment system if you wish.

invidian · 2021-06-23T10:25:18Z

You are equally able to build an image without this capability for your own deployment system if you wish.

But this does not follow least-privilege principle then, it's not equal situation IMO.

mcorbin · 2021-06-23T10:25:34Z

You are equally able to build an image without this capability for your own deployment system if you wish

it should be noted that this change was, as I said in my first message, not announced in the changelog althought it was a breaking change for people running the apiserver with all capabilities dropped.

A lot of people are running kubernetes control planes as non root for years (it's my case), the default apiserver port is even 6443 which is not privileged.

The container community spent years advocating for "base images with the fewest things enabled by default, if you need more extend it", but it seems the rule does not apply anymore in the kubernetes world (the exact same thing was done in metrics-server, but at least it was in the changelog this time).

vinayakankugoyal · 2021-06-23T13:52:31Z

Although any image can be used as a base image, that is not the intention with the kube-apiserver image. This change allows running kube-apiserver as non-root regardless of what port you bind to.
We are working on removing the need to apply file capabilities through kubernetes/enhancements#2757.

palnabarun · 2021-06-30T17:29:48Z

Is this still a release blocker for 1.22?

palnabarun · 2021-06-30T18:14:02Z

I feel from the issue and the subsequent comments that there are two things:

Documenting in the CHANGELOG that this change (Apply cap_net_bind_service to kube-apiserver binary. #99145) was made.
Discussing whether adding the capability is justified or not. It feels like this concern should be discussed in the context of the KEP (Run control-plane as non-root in kubeadm. enhancements#2568) and Run services as non-privileged users #7961.

We need to certainly work on 1 in v1.22.

puerco · 2021-06-30T18:23:53Z

I agree with @palnabarun, the change should have been reflected in the release notes and we missed it. But this is part of the expected changes from the KEP and not a bug. In fact, the KEP explicitly states that the kube-apiserver image would lose all caps except for cap_net_bind_service.

The keep was open for review since march and that was where the discussion should have taken place. At this point, I would say this is not a bug, much less release-blocking.

pacoxu · 2021-07-01T00:55:12Z

After #103326 is merged, we can add the safe Sysctl to the pod.

See below ⬇️ , this comment may be not the correct solution.

vinayakankugoyal · 2021-07-01T00:57:09Z

No we can't, since it does not run in its own network namespace.

savitharaghunathan · 2021-07-07T15:25:24Z

@palnabarun @puerco : checking in to see if this is still a release blocking issue. If not, can we remove the release-blocker label?

savitharaghunathan · 2021-07-07T17:33:32Z

Tagging @justaugustus for his input as well

saschagrunert · 2021-07-08T07:21:26Z

Yes, let's remove the release blocker label from this one.

voigt · 2021-07-20T08:14:39Z

Hi there,
Bug-Triage here: given the current state of v1.22 release, does it make sense to move this issue to milestone v1.23 (or even clear the milestone)?

dims · 2021-07-22T18:34:03Z

I don't see this making any progress in 1.22, let's move it out please

/milestone v1.23

jyotimahapatra · 2021-09-22T03:09:26Z

Hi 👋 I'm the bug triage shadow. I wanted to nudge the conversation again in the context of 1.23 release. Do we need to follow this up in 1.23?

pacoxu · 2021-09-22T03:42:07Z

Apply cap_net_bind_service to kube-apiserver binary. #99145 (apiserver image dockerfile) was merged in 1.21.
Run kube-apiserver as non-root. #96134 cluster/gce support for non-root apiserver was merged in 1.21.
kubeadm init --feature-gates=RootlessControlPlane=true was supported after 1.22

I think we can close this issue.

kubernetes/enhancements#2568 is for tracking kubeadm support rootless control plane. (It is alpha now and will be beta in 1.24)

voigt · 2021-10-28T20:05:25Z

Closing the issue as suggested by @pacoxu.

/close

k8s-ci-robot · 2021-10-28T20:05:47Z

@voigt: Closing this issue.

In response to this:

Closing the issue as suggested by @pacoxu.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mcorbin added the kind/bug Categorizes issue or PR as related to a bug. label Apr 8, 2021

k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Apr 8, 2021

k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 8, 2021

k8s-ci-robot added sig/release Categorizes an issue or PR as relevant to SIG Release. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 8, 2021

justaugustus added the release-blocker label May 11, 2021

justaugustus added this to the v1.22 milestone May 11, 2021

saschagrunert removed the release-blocker label Jul 8, 2021

saschagrunert added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Jul 9, 2021

k8s-ci-robot modified the milestones: v1.22, v1.23 Jul 22, 2021

k8s-ci-robot closed this as completed Oct 28, 2021

The apiserver image (v1.21.0-rc.0) needs CAP_NET_BIND_SERVICE #100914

The apiserver image (v1.21.0-rc.0) needs CAP_NET_BIND_SERVICE #100914

Comments

mcorbin commented Apr 8, 2021 • edited Loading

What happened:

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

k8s-ci-robot commented Apr 8, 2021

mcorbin commented Apr 8, 2021

invidian commented Apr 9, 2021 • edited Loading

dims commented Apr 9, 2021

mcorbin commented Apr 9, 2021

vinayakankugoyal commented Apr 9, 2021

invidian commented Apr 9, 2021

mcorbin commented Apr 9, 2021 • edited Loading

dims commented Apr 18, 2021 • edited Loading

mcorbin commented Apr 19, 2021

justaugustus commented May 11, 2021

k8s-ci-robot commented May 11, 2021

liggitt commented May 11, 2021

liggitt commented May 11, 2021

mcorbin commented May 11, 2021

pacoxu commented Jun 8, 2021

BenTheElder commented Jun 23, 2021 • edited Loading

mcorbin commented Jun 23, 2021

BenTheElder commented Jun 23, 2021

invidian commented Jun 23, 2021

mcorbin commented Jun 23, 2021 • edited Loading

vinayakankugoyal commented Jun 23, 2021

palnabarun commented Jun 30, 2021

palnabarun commented Jun 30, 2021 • edited Loading

puerco commented Jun 30, 2021

pacoxu commented Jul 1, 2021 • edited Loading

vinayakankugoyal commented Jul 1, 2021

savitharaghunathan commented Jul 7, 2021 • edited Loading

savitharaghunathan commented Jul 7, 2021

saschagrunert commented Jul 8, 2021

voigt commented Jul 20, 2021

dims commented Jul 22, 2021

jyotimahapatra commented Sep 22, 2021

pacoxu commented Sep 22, 2021

voigt commented Oct 28, 2021

k8s-ci-robot commented Oct 28, 2021

mcorbin commented Apr 8, 2021 •

edited

Loading

invidian commented Apr 9, 2021 •

edited

Loading

mcorbin commented Apr 9, 2021 •

edited

Loading

dims commented Apr 18, 2021 •

edited

Loading

BenTheElder commented Jun 23, 2021 •

edited

Loading

mcorbin commented Jun 23, 2021 •

edited

Loading

palnabarun commented Jun 30, 2021 •

edited

Loading

pacoxu commented Jul 1, 2021 •

edited

Loading

savitharaghunathan commented Jul 7, 2021 •

edited

Loading