Commit debe838

fix readme.md typos

Signed-off-by: Your Name <hpotpose62@gmail.com>
akagami-harsh committed Feb 7, 2025
1 parent 39eae63 commit debe838

Showing 3 changed files with 23 additions and 196 deletions.
4 changes: 2 additions & 2 deletions common/networkpolicies/README.md
@@ -1,8 +1,8 @@
### 1. Why would a user apply the extra policies?
It is a second line of defence after Istio autorization policies and it protects pods and services that are not protected by Istio
It is a second line of defence after Istio authorization policies and it protects pods and services that are not protected by Istio.

### 2. Effects they will have in the cluster
Please consult the name of, and the comments in, each NetworkPolicy for further information.

### 3. We should achieve the same with AuthorizationPolicies
But there are components, e.g. Katib that are not secured by istio
But there are components, e.g. Katib that are not secured by istio.
207 changes: 17 additions & 190 deletions common/oauth2-proxy/README.md
@@ -1,5 +1,17 @@
# Kubeflow Authentication using Oauth2 Proxy

## Istio Envoy Filter

> EnvoyFilter provides a mechanism to customize the Envoy configuration generated by Istio Pilot. Use EnvoyFilter to modify values for certain fields, add specific filters, or even add entirely new listeners, clusters, etc.[^1]

Kubeflow will use an Envoy Filter for every incoming request when it is used
with `oidc-authservice`.

Usage of EnvoyFilter is currently not recommended. The preferred method for configuring External
Authentication in Istio is the `envoyExtAuthzHttp` extension provider[^2].

The Envoy Filter is set up with [oidc-authservice](https://github.com/arrikto/oidc-authservice).

## Istio envoyExtAuthzHttp

This is Istio's recommended approach for External Authorization[^2]. It is not limited to the use
of OIDC and can fulfill current and foreseeable authentication needs.
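
As a rough illustration, such an extension provider can be declared in the Istio `meshConfig`. The following is only a sketch: the provider name, service address, and port are assumptions, not the values shipped in these manifests.

```
# Hypothetical meshConfig excerpt; the service name and port are assumptions.
meshConfig:
  extensionProviders:
  - name: oauth2-proxy
    envoyExtAuthzHttp:
      # the oauth2-proxy service that "checks" each incoming request
      service: oauth2-proxy.oauth2-proxy.svc.cluster.local
      port: 80
      # forward the credentials oauth2-proxy needs for its check
      includeRequestHeadersInCheck: ["authorization", "cookie"]
      # pass the verified identity headers on to the backend on success
      headersToUpstreamOnAllow: ["authorization"]
```

An `AuthorizationPolicy` with `action: CUSTOM` then selects which requests are routed through this provider.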

## Kubeflow Pipelines User and M2M Authentication and Authorization

The Kubeflow Pipelines component relies on the built-in kubernetes functionalities to authenticate and authorize
Kubeflow Pipelines component relies on the built-in kubernetes functionalities to authenticate and authorize
user requests, specifically the TokenReviews[^4] and SubjectAccessReview[^5].
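
For orientation, a TokenReview is the request the API server can use to validate an incoming Bearer token. A hedged sketch (the token value is a placeholder):

```
# Sketch of a TokenReview for an incoming token; the response's
# status.user.username / status.user.groups identify the caller.
apiVersion: authentication.k8s.io/v1
kind: TokenReview
spec:
  token: <Bearer token taken from the Authorization header>
```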

The best way to describe how it works is to explain with an example. Let's analyze the flow
when a client calls the API to list the KF Pipeline runs:
when client calls API to list the KF Pipeline Runs:

1. api-server starts endpoints in:

@@ -64,194 +76,9 @@
{} # empty response which is fine because no pipeline runs exist
```
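
The authorization half of this flow is a SubjectAccessReview. A sketch of what the check behind the "list runs" call above might look like; the API group and resource name are assumptions about KFP's RBAC model, not taken from its source:

```
# Sketch of a SubjectAccessReview; status.allowed in the response
# decides whether the call proceeds.
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  # identity extracted from the verified token / kubeflow-userid header
  user: user@example.com
  resourceAttributes:
    group: pipelines.kubeflow.org   # assumed KFP RBAC API group
    namespace: kubeflow-user-example-com
    verb: list
    resource: runs
```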

### Authentication and Authorization analysis diagram for Kubeflow Pipelines
![Kubeflow Auth Diagram](./components/kubeflow_auth_diagram.svg)
### Auth analysis diagram for Kubeflow Pipelines

### Change the default authentication from "Dex + Oauth2-proxy" to "Oauth2-proxy" only

Authentication in Kubeflow evolved over time: in Kubeflow 1.9 we dropped EnvoyFilters and oidc-authservice in favor of RequestAuthentication and Oauth2-proxy.
![auth-flow](components/oauth2-flow.svg)

You can adjust OAuth2 Proxy to connect directly to your own IdP (Identity Provider) such as GCP, [AWS](https://docs.aws.amazon.com/cognito/latest/developerguide/federation-endpoints-oauth-grants.html), Azure, etc.:

1. Create an application on your IdP (purple line)
2. Change your [OAuth2 Proxy issuer](https://github.com/kubeflow/manifests/blob/35539f162ea7fafc8c5035d8df0d8d8cf5a9d327/common/oauth2-proxy/base/oauth2-proxy-config.yaml#L10) to your IdP. Never edit the base manifests directly; use Kustomize overlays and components (see the sketch after the RequestAuthentication example below).

Here is an example of patching oauth2-proxy to connect directly to the Azure IdP and skip Dex.
This is enterprise integration, so feel free to hire consultants or pay for a commercial distribution if you need more help.
For example, Azure returns rather large headers compared to other IdPs, so you may need to annotate your nginx-ingress to support them (see the sketch in the Known Issues section below).

```
# based on https://github.com/kubeflow/manifests/blob/master/common/oauth2-proxy/base/oauth2_proxy.cfg
# and https://oauth2-proxy.github.io/oauth2-proxy/configuration/providers/azure/
apiVersion: v1
kind: ConfigMap
metadata:
  name: oauth2-proxy
  namespace: oauth2-proxy
data:
  oauth2_proxy.cfg: |
    provider = "oidc"
    oidc_issuer_url = "https://login.microsoftonline.com/$MY_TENANT/v2.0"
    scope = "openid email profile" # remove groups since they are not used yet
    email_domains = [ "*" ]
    # serve a static HTTP 200 upstream for authentication success
    # we are using oauth2-proxy as an ExtAuthz to "check" each request, not pass it on
    upstreams = [ "static://200" ]
    # skip authentication for these paths
    skip_auth_routes = [
      "^/dex/",
    ]
    # requests to paths matching these regex patterns will receive a 401 Unauthorized response
    # when not authenticated, instead of being redirected to the login page with a 302;
    # this prevents background requests being redirected to the login page,
    # and the accumulation of CSRF cookies
    api_routes = [
      # Generic
      # NOTE: included because most background requests contain these paths
      "/api/",
      "/apis/",
      # Kubeflow Pipelines
      # NOTE: included because the KFP UI makes MANY background requests to these paths, but because
      # they are not `application/json` requests, oauth2-proxy would redirect them to the login page
      "^/ml_metadata",
    ]
    skip_provider_button = true
    set_authorization_header = true
    set_xauthrequest = true
    cookie_name = "oauth2_proxy_kubeflow"
    cookie_expire = "24h"
    cookie_refresh = "1h" # this improves the user experience a lot
    redirect_url = "https://$MY_PUBLIC_KUBEFLOW_DOMAIN/oauth2/callback"
    relative_redirect_url = false
```

3. In the istio-system namespace there is a RequestAuthentication resource. You need to change its issuer to your own IdP or, even better, create an additional one.

```
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: azure-aad-requestauthentication
  namespace: istio-system
spec:
  # we only apply this to the ingress-gateway because:
  # - there is no need to verify the same tokens at each sidecar
  # - having no selector would apply the RequestAuthentication to ALL
  #   Pods in the mesh, even ones which are not part of Kubeflow
  selector:
    matchLabels:
      app: istio-ingressgateway
  jwtRules:
  - issuer: https://login.microsoftonline.com/$MY_TENANT/v2.0
    # `forwardOriginalToken` is not strictly required to be true.
    # there are pros and cons to each value:
    # - true: the original token is forwarded to the destination service,
    #   which raises the risk of the token leaking
    # - false: the original token is stripped from the request,
    #   which will prevent the destination service from
    #   verifying the token (possibly with its own RequestAuthentication)
    forwardOriginalToken: true
    # this will unpack the JWTs issued by Dex or other IdPs into the expected headers.
    # it is applied to BOTH the m2m tokens from outside the cluster (which skip
    # oauth2-proxy because they already have a dex JWT), AND user requests which were
    # authenticated by oauth2-proxy (which injected a dex JWT).
    outputClaimToHeaders:
    - header: kubeflow-userid
      claim: email
    - header: kubeflow-groups
      claim: groups
    # we explicitly set `fromHeaders` to ensure that the JWT is only extracted from the
    # `Authorization` header. this is because we exclude requests that have an
    # `Authorization` header from oauth2-proxy.
    fromHeaders:
    - name: Authorization
      prefix: "Bearer "
```
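
As noted in step 2, such changes belong in a Kustomize overlay rather than in the base manifests. A hedged sketch of a Kustomization wiring in the oauth2-proxy ConfigMap patch; the file layout and patch path are illustrative assumptions:

```
# kustomization.yaml of a hypothetical IdP overlay
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base                                # the stock Kubeflow manifests
patches:
- path: oauth2-proxy-configmap-patch.yaml   # the ConfigMap shown in step 2
  target:
    kind: ConfigMap
    name: oauth2-proxy
    namespace: oauth2-proxy
```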

You can also add more RequestAuthentication resources to support other issuers, for example for M2M access from GitHub Actions as explained in the root-level README.md.
This feature is useful when you need to integrate Kubeflow with your current CI/CD platform (GitHub Actions, Jenkins) via machine-to-machine authentication.
The following is an example of obtaining and using a JWT from your IdP with Python, but you can also take a look at our CI/CD tests, which use simple Kubernetes ServiceAccount tokens to access KFP, JupyterLab, etc. from GitHub Actions.

```
import requests

token_url = "https://your-idp.com/oauth/token"
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"
username = "YOUR_USERNAME"
password = "YOUR_PASSWORD"

# request header
headers = {
    "Content-Type": "application/x-www-form-urlencoded"
}
# resource owner password credentials grant
data = {
    "grant_type": "password",
    "client_id": client_id,
    "client_secret": client_secret,
    "username": username,
    "password": password,
    "scope": "openid profile email"  # change the scope to match your IdP
}

response = requests.post(token_url, headers=headers, data=data)
TOKEN = response.json()["access_token"]
```

```
import kfp

kubeflow_host = "https://your_host"
pipeline_host = kubeflow_host + "/pipeline"
client = kfp.Client(host=pipeline_host, existing_token=TOKEN)
print(client.list_runs(namespace="your-profile-name"))
```

## Known Issues

Some OIDC providers such as Azure issue JWTs/cookies large enough to exceed the header-size limits of most gRPC and gunicorn web application deployments in Kubeflow.
If removing the groups claim in oauth2-proxy is not enough, you can add an environment variable to all web applications:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kserve-models-web-app
  namespace: kubeflow
spec:
  template:
    spec:
      containers:
      - name: kserve-models-web-app # repeat for all other *-web-app deployments
        env:
        - name: GUNICORN_CMD_ARGS
          value: --limit-request-field_size 32000
```

and modify the KFP gRPC server via
```
- path: patches/metadata-grpc-virtualservice-patch.yaml
  target:
    kind: VirtualService
    name: metadata-grpc
    namespace: kubeflow

# patches/metadata-grpc-virtualservice-patch.yaml
# Remove the oauth2-proxy cookie that violates the maximum metadata size for a gRPC request
- op: add
  path: /spec/http/0/route/0/headers
  value:
    request:
      remove:
      - Cookie
```

to fix `received initial metadata size exceeds limit`.
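
If traffic additionally passes through ingress-nginx in front of the Istio gateway, its proxy buffers may also need to be raised for the large IdP headers mentioned above. A hedged sketch; the Ingress name and annotation value are illustrative, not tested defaults:

```
# Hypothetical Ingress fronting the Istio ingressgateway
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kubeflow-gateway
  namespace: istio-system
  annotations:
    # allow larger response headers from the IdP / oauth2-proxy
    nginx.ingress.kubernetes.io/proxy-buffer-size: "16k"
```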
![Kubeflow Auth Diagram](./components/kubeflow_auth_diagram.svg)

## Kubeflow Notebooks User and M2M Authentication and Authorization

@@ -292,7 +119,7 @@
This is based on the following:
The docs above mention that while it's possible to enable authentication,
authorization is more complicated and probably we need to add
`AuthorizationPolicy`...
`AuthorizationPolicy`

> create an [Istio AuthorizationPolicy](https://istio.io/latest/docs/reference/config/security/authorization-policy/) to grant access to the pods or disable it
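
A hedged sketch of what such a policy could look like for a notebook Pod; the namespace, labels, and principal are illustrative assumptions:

```
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-notebook-ingress   # hypothetical name
  namespace: kubeflow-user-example-com
spec:
  # select the notebook Pods this policy protects
  selector:
    matchLabels:
      notebook-name: my-notebook
  action: ALLOW
  rules:
  - from:
    - source:
        # only the ingress-gateway may reach the notebook
        principals:
        - cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account
```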

8 changes: 4 additions & 4 deletions contrib/ray/README.md
@@ -36,7 +36,7 @@
</figure>

## Step 1: Install Kubeflow
* This example installs Kubeflow with the master branch.
* This example installs Kubeflow with the master branch
* Install all Kubeflow official components and all common services using [one command](https://github.com/kubeflow/manifests/tree/master#install-with-a-single-command).
* If you do not want to install all components, you can comment out **KNative**, **Katib**, **Tensorboards Controller**, **Tensorboard Web App**, **Training Operator**, and **KServe** from [example/kustomization.yaml](https://github.com/kubeflow/manifests/blob/master/example/kustomization.yaml).

@@ -59,7 +59,7 @@
kubectl get pod -l app.kubernetes.io/component=kuberay-operator -n kubeflow
# Create a namespace, e.g. "development"
kubectl create ns development

# Enable isito-injection for the namespace
# Enable istio-injection for the namespace
kubectl label namespace development istio-injection=enabled

# After creating the namespace, you have to make the changes below in the raycluster_example.yaml file (the required changes are also noted as comments in the YAML file itself)
@@ -69,7 +69,7 @@
principals:
- "cluster.local/ns/development/sa/default-editor"

# 02. Replace the nampespace of node-ip-address of headGroupSpec and workerGroupSpec
# 02. Replace the namespace of node-ip-address of headGroupSpec and workerGroupSpec

node-ip-address: $(hostname -I | tr -d ' ' | sed 's/\./-/g').raycluster-istio-headless-svc.development.svc.cluster.local
```
@@ -78,7 +78,7 @@
```sh
# Create a RayCluster CR, and the KubeRay operator will reconcile a Ray cluster
# with 1 head Pod and 1 worker Pod.
# $MY_KUBEFLOW_USER_NAMESPACE is the namesapce that has been created in the above step.
# $MY_KUBEFLOW_USER_NAMESPACE is the namespace that has been created in the above step.
export MY_KUBEFLOW_USER_NAMESPACE=development
kubectl apply -f raycluster_example.yaml -n $MY_KUBEFLOW_USER_NAMESPACE
