Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The apiserver image (v1.21.0-rc.0) needs CAP_NET_BIND_SERVICE #100914

Closed
mcorbin opened this issue Apr 8, 2021 · 36 comments
Closed

The apiserver image (v1.21.0-rc.0) needs CAP_NET_BIND_SERVICE #100914

mcorbin opened this issue Apr 8, 2021 · 36 comments
Labels
area/release-eng Issues or PRs related to the Release Engineering subproject help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/release Categorizes an issue or PR as relevant to SIG Release.
Milestone

Comments

@mcorbin
Copy link

mcorbin commented Apr 8, 2021

What happened:

I tried to run the apiserver image for v1.21.0-rc.0 (k8s.gcr.io/kube-apiserver:v1.21.0-rc.0) on top of another Kubernetes cluster. I took a working manifest which was using v.1.20.5 and was working without issues, and changed the image version.

The apiserver failed to start with standard_init_linux.go:219: exec user process caused: operation not permitted

What you expected to happen:

I was expecting to have a working apiserver.

How to reproduce it (as minimally and precisely as possible):

Trying to run the apiserver on top of kubernetes with a restricted pod security policy and without adding capabilities to my pod.

Anything else we need to know?:

I managed to make it work by adding a CAP_NET_BIND_SERVICE capability to my apiserver pod.

I found nothing related to this change in the changelog. I found this commit (#99145) which add RUN setcap cap_net_bind_service=+ep /${BINARY} to the apiserver image. and I think it can be the source of the issue.
I'm surprised by this change, which seems to be done for a specific use case but will, if I'm not wrong, impact everyone using the image.

Environment:

  • Kubernetes version (use kubectl version): v1.21.0-rc.0
  • Cloud provider or hardware configuration: Exoscale
  • OS (e.g: cat /etc/os-release): 20.04.1 LTS (Focal Fossa)
  • Kernel (e.g. uname -a): 5.4.0-62-generic
  • Install tools: A deployment on top of another Kubernetes cluster.
@mcorbin mcorbin added the kind/bug Categorizes issue or PR as related to a bug. label Apr 8, 2021
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Apr 8, 2021
@k8s-ci-robot
Copy link
Contributor

@mcorbin: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 8, 2021
@mcorbin
Copy link
Author

mcorbin commented Apr 8, 2021

/sig release

@k8s-ci-robot k8s-ci-robot added sig/release Categorizes an issue or PR as relevant to SIG Release. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 8, 2021
@invidian
Copy link
Member

invidian commented Apr 9, 2021

I can confirm this. Adding the following to kube-apiserver configuration makes it work again:

diff --git templates/deployment.yaml templates/deployment.yaml
index b1d4018..1663087 100644
--- templates/deployment.yaml
+++ templates/deployment.yaml
@@ -129,6 +129,10 @@ spec:
         - name: secrets
           mountPath: /etc/kubernetes/pki
           readOnly: true
+        securityContext:
+          capabilities:
+            add:
+               - NET_BIND_SERVICE
         resources:
           requests:
             cpu: 300m
diff --git templates/podsecuritypolicy.yaml templates/podsecuritypolicy.yaml
index 2913957..9135f3a 100644
--- templates/podsecuritypolicy.yaml
+++ templates/podsecuritypolicy.yaml
@@ -8,6 +8,8 @@ metadata:
 spec:
   privileged: false
   allowPrivilegeEscalation: false
+  allowedCapabilities:
+    - NET_BIND_SERVICE
   requiredDropCapabilities:
     - ALL
   volumes:

But it seems this is an intentional change to run kube-apiserver as non-root user. I guess this should definitely be mentioned in the changelog at least...

@dims
Copy link
Member

dims commented Apr 9, 2021

@invidian @mcorbin Looks like the changes in #99145 (and adjustments we made in our harness #96134) did not make it into release notes

cc @vinayakankugoyal @liggitt

@mcorbin
Copy link
Author

mcorbin commented Apr 9, 2021

But it seems this is an intentional change to run kube-apiserver as non-root user. I guess this should definitely be mentioned in the changelog at least...

It was already possible to run the apiserver as non root on a non-privileged port. I'm not sure to understand why it was added to the default image.
I understand that some users may want to listen on privileged ports, but I don't see why the capability should be added by default.

@vinayakankugoyal
Copy link
Contributor

That's odd! Unless you are binding to a privileged port you wouldn't need to add NET_BIND_SERVICE. From the deployment yaml above it is not clear what port you are binding to.

@invidian
Copy link
Member

invidian commented Apr 9, 2021

That's odd! Unless you are binding to a privileged port you wouldn't need to add NET_BIND_SERVICE. From the deployment yaml above it is not clear what port you are binding to.

IIUC my PSP drops all capabilities for the container, maybe this is the issue. I use standard port 6443 and my container actually still runs as root.

@mcorbin
Copy link
Author

mcorbin commented Apr 9, 2021

IIUC my PSP drops all capabilities for the container, maybe this is the issue. I use standard port 6443 and my container actually still runs as root.

Same for me, I drop all capabilities by default.

Here is how I run the apiserver for example for 1.21, with this configuration it works (I'm just redoing my tests):

--secure-port=30395
        securityContext:
          capabilities:
            add:
            - CAP_NET_BIND_SERVICE
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1291

Now, I reapply my manifest without the capabilities section in securityContext. The pod is in error with standard_init_linux.go:219: exec user process caused: operation not permitted

As I said, I don't have any issue with 1.20.5. I've just tested the same manifest (without the capabilities) with 1.20.5, and the apiserver starts as expected.

@dims
Copy link
Member

dims commented Apr 18, 2021

@mcorbin So missing documentation then? (for the new behavior)

@mcorbin
Copy link
Author

mcorbin commented Apr 19, 2021

@mcorbin So missing documentation then? (for the new behavior)

Yes, but I would really like to know if it is the expected behavior. Maybe the old behavior could be restored ?

@justaugustus
Copy link
Member

/help
/area release-eng
/priority critical-urgent
cc: @kubernetes/release-engineering

@k8s-ci-robot
Copy link
Contributor

@justaugustus:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help
/area release-eng
/priority critical-urgent
cc: @kubernetes/release-engineering

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added area/release-eng Issues or PRs related to the Release Engineering subproject priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels May 11, 2021
@justaugustus justaugustus added this to the v1.22 milestone May 11, 2021
@liggitt
Copy link
Member

liggitt commented May 11, 2021

I don't think a single image can 1) run as non root, 2) allow binding to 443, 3) not require CAP_NET_BIND_SERVICE

@liggitt
Copy link
Member

liggitt commented May 11, 2021

Running as non-root was added in 1.21 to begin to reduce API server permissions

I'm not sure how to simultaneously satisfy people running the image on port 443 and people dropping all capabilities

@mcorbin
Copy link
Author

mcorbin commented May 11, 2021

Running as non-root was added in 1.21 to begin to reduce API server permissions

I understand that, but it was already possible by not running the API server on port 443. It seems weird to me to have the capability as default.
For me, if someone wants to add capabilities, the image should be rebuild/extended on their side. Or else, we would have to add CAP_NET_BIND_SERVICE to all images for all components/tools, because you can always find someone in the world who will want to run things on a privileged port.

@pacoxu
Copy link
Member

pacoxu commented Jun 8, 2021

I don't think a single image can 1) run as non root, 2) allow binding to 443, 3) not require CAP_NET_BIND_SERVICE

Can it be solved by #102612 with sysctl net/ipv4/ip_unprivileged_port_start supported?

@BenTheElder
Copy link
Member

BenTheElder commented Jun 23, 2021

For me, if someone wants to add capabilities, the image should be rebuild/extended on their side

The Kubernetes project is working on running this image as non-root in all our tools. There is a KEP related to this.

For the core images we do want to enable running as non-root.

EDIT: see: #7961, kubernetes/enhancements#2568

@mcorbin
Copy link
Author

mcorbin commented Jun 23, 2021

For the core images we do want to enable running as non-root.

But it was already possible on non-privileged ports.
Why should the image have by default the capability ? Why not:

  • Run things on non privileged ports
  • Rebuild/extend the image if you really need to run on privileged ports

For me base images should have the as few permissions as possible, then it's up to people to add what they need for their specific use cases.

@BenTheElder
Copy link
Member

In this case the use case is kube-up (used for kubernetes CI) and kubeadm (officially supported first-party installation tool), not "people". Details of going rootless for the control plane are covered in the KEP link.

I don't think this is an unreasonable thing to have on the api-server binary, but cc @kubernetes/sig-security

You are equally able to build an image without this capability for your own deployment system if you wish.

@invidian
Copy link
Member

You are equally able to build an image without this capability for your own deployment system if you wish.

But this does not follow least-privilege principle then, it's not equal situation IMO.

@mcorbin
Copy link
Author

mcorbin commented Jun 23, 2021

You are equally able to build an image without this capability for your own deployment system if you wish

it should be noted that this change was, as I said in my first message, not announced in the changelog althought it was a breaking change for people running the apiserver with all capabilities dropped.

A lot of people are running kubernetes control planes as non root for years (it's my case), the default apiserver port is even 6443 which is not privileged.

The container community spent years advocating for "base images with the fewest things enabled by default, if you need more extend it", but it seems the rule does not apply anymore in the kubernetes world (the exact same thing was done in metrics-server, but at least it was in the changelog this time).

@vinayakankugoyal
Copy link
Contributor

Although any image can be used as a base image, that is not the intention with the kube-apiserver image. This change allows running kube-apiserver as non-root regardless of what port you bind to.
We are working on removing the need to apply file capabilities through kubernetes/enhancements#2757.

@palnabarun
Copy link
Member

Is this still a release blocker for 1.22?

@palnabarun
Copy link
Member

palnabarun commented Jun 30, 2021

I feel from the issue and the subsequent comments that there are two things:

  1. Documenting in the CHANGELOG that this change (Apply cap_net_bind_service to kube-apiserver binary. #99145) was made.
  2. Discussing whether adding the capability is justified or not. It feels like this concern should be discussed in the context of the KEP (Run control-plane as non-root in kubeadm. enhancements#2568) and Run services as non-privileged users #7961.

We need to certainly work on 1 in v1.22.

@puerco
Copy link
Member

puerco commented Jun 30, 2021

I agree with @palnabarun, the change should have been reflected in the release notes and we missed it. But this is part of the expected changes from the KEP and not a bug. In fact, the KEP explicitly states that the kube-apiserver image would lose all caps except for cap_net_bind_service.

The keep was open for review since march and that was where the discussion should have taken place. At this point, I would say this is not a bug, much less release-blocking.

@pacoxu
Copy link
Member

pacoxu commented Jul 1, 2021

After #103326 is merged, we can add the safe Sysctl to the pod.

See below ⬇️ , this comment may be not the correct solution.

@vinayakankugoyal
Copy link
Contributor

No we can't, since it does not run in its own network namespace.

@savitharaghunathan
Copy link
Member

savitharaghunathan commented Jul 7, 2021

@palnabarun @puerco : checking in to see if this is still a release blocking issue. If not, can we remove the release-blocker label?

@savitharaghunathan
Copy link
Member

Tagging @justaugustus for his input as well

@saschagrunert
Copy link
Member

Yes, let's remove the release blocker label from this one.

@saschagrunert saschagrunert added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Jul 9, 2021
@voigt
Copy link

voigt commented Jul 20, 2021

Hi there,
Bug-Triage here: given the current state of v1.22 release, does it make sense to move this issue to milestone v1.23 (or even clear the milestone)?

@dims
Copy link
Member

dims commented Jul 22, 2021

I don't see this making any progress in 1.22, let's move it out please

/milestone v1.23

@k8s-ci-robot k8s-ci-robot modified the milestones: v1.22, v1.23 Jul 22, 2021
@jyotimahapatra
Copy link
Contributor

Hi 👋 I'm the bug triage shadow. I wanted to nudge the conversation again in the context of 1.23 release. Do we need to follow this up in 1.23?

@pacoxu
Copy link
Member

pacoxu commented Sep 22, 2021

I think we can close this issue.

kubernetes/enhancements#2568 is for tracking kubeadm support rootless control plane. (It is alpha now and will be beta in 1.24)

@voigt
Copy link

voigt commented Oct 28, 2021

Closing the issue as suggested by @pacoxu.

/close

@k8s-ci-robot
Copy link
Contributor

@voigt: Closing this issue.

In response to this:

Closing the issue as suggested by @pacoxu.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/release-eng Issues or PRs related to the Release Engineering subproject help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/release Categorizes an issue or PR as relevant to SIG Release.
Projects
None yet
Development

No branches or pull requests