Merge pull request #1 from PhilipSchmid/add_non_hostpol_example
Added netpol examples without host policies
PhilipSchmid authored Nov 14, 2023
2 parents fa02af5 + 6f6dbc1 commit bafed98
Showing 35 changed files with 798 additions and 45 deletions.
13 changes: 13 additions & 0 deletions .github/dependabot.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
version: 2
updates:
- package-ecosystem: "terraform"
directory: /deploy
schedule:
interval: "daily"
time: "23:00"
timezone: "Europe/Zurich"
open-pull-requests-limit: 3
rebase-strategy: "disabled"
labels:
- ci/dependabot
- kind/enhancement
53 changes: 53 additions & 0 deletions .github/workflows/github-actions.yml
@@ -0,0 +1,53 @@
name: Validation Actions
on:
pull_request_target:
types:
- opened
- synchronize
- reopened
push:
branches:
- main
jobs:
formatting:
runs-on: ubuntu-22.04
steps:
- name: Checkout
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11
with:
ref: ${{ github.sha }}
- name: terraform fmt
uses: dflook/terraform-fmt-check@529e30563b2c558dc0b8c450b5cec1cc93bd7fe4
with:
path: /deploy
docs:
runs-on: ubuntu-22.04
needs: formatting
steps:
- name: Checkout
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11
with:
ref: ${{ github.ref }}
- name: Render terraform docs inside the README.md and push changes back to PR branch
uses: terraform-docs/gh-actions@d1c99433f7a1e5003ef213d70f89aaa47cb0b675
with:
working-dir: /deploy
output-file: README.md
output-method: inject
output-format: markdown table
indention: 3
git-push: "true"
validate-netpol-yamls:
runs-on: ubuntu-22.04
needs: formatting
steps:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11
- name: Validate Network Policy YAML files
run: yamllint netpols/*/*.yaml
validate-cilium-valuesyaml:
runs-on: ubuntu-22.04
needs: formatting
steps:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11
- name: Validate Cilium Helm values YAML file
run: yamllint deploy/03-cilium-values-1.14.yaml
5 changes: 3 additions & 2 deletions README.md
@@ -3,10 +3,11 @@ This is a demo of how to lock down your Kubernetes cluster using advanced [Ciliu

![architecture overview](pictures/architecture-overview.jpg)

If you would like to apply those or similar policies to your existing clusters, it's still possible without too much effort by leveraging Cilium Hubble's visibility capabilities to see if a newly introduced Cilium Network Policy causes unwanted traffic denies or not. You should check out our free [Isovalent Zero Trust hands-on lab](https://isovalent.com/labs/cilium-enterprise-zero-trust-visibility/) in case you are eager to learn more about our recommended way of validating new Cilium Network Policies before applying them to existing clusters.
If you would like to apply those or similar policies to your existing clusters, it's still possible without too much effort by leveraging Cilium Hubble's visibility capabilities to see if a newly introduced Cilium Network Policy causes unwanted traffic denies or not. You should check out Isovalent's free [Zero Trust hands-on lab](https://isovalent.com/labs/cilium-enterprise-zero-trust-visibility/) in case you are eager to learn more about our recommended way of validating new Cilium Network Policies before applying them to existing clusters.

* Go to the `slides` directory to see slide decks of talks I did based on this demo setup.
* Head over to the `deploy` directory to see how the demo Kubeadm Kubernetes cluster and infrastructure components are deployed.
* Check the `netpols` directory to see the actual Cilium (Cluster-wide) Network Policies.
* Check the `netpols/no-host-policies` directory to see the actual Cilium (Cluster-wide) Network Policies.
* Check the `netpols/with-host-policies` directory to see the actual Cilium (Cluster-wide) Network Policies where Cilium Host Policies are used as well (Host Firewall).

More examples and even hands-on labs on how to leverage Cilium Network Policies can be found in the free [Isovalent "Security Professional" learning track](https://isovalent.com/learning-tracks/#securityProfessionals).
7 changes: 7 additions & 0 deletions deploy/.terraform-docs.yaml
@@ -0,0 +1,7 @@
# https://terraform-docs.io/user-guide/configuration/
formatter: "markdown table"
output:
file: README.md
mode: inject
settings:
indent: 3
2 changes: 1 addition & 1 deletion deploy/01-vpc.tf
@@ -5,7 +5,7 @@ resource "random_id" "cluster" {
}

module "vpc" {
source = "git::ssh://git@github.com/isovalent/terraform-aws-vpc.git?ref=v1.5"
source = "git::ssh://git@github.com/isovalent/terraform-aws-vpc.git?ref=v1.7"

cidr = var.vpc_cidr
name = "${var.cluster_name}-${random_id.cluster.dec}"
2 changes: 1 addition & 1 deletion deploy/02-kubeadm.tf
@@ -1,5 +1,5 @@
module "kubeadm" {
source = "git::ssh://git@github.com/isovalent/terraform-aws-kubeadm.git?ref=v2.4"
source = "git::ssh://git@github.com/isovalent/terraform-aws-kubeadm.git?ref=v2.6.1"

vpc_id = module.vpc.id
ami_name_filter = var.ami_name_filter
32 changes: 17 additions & 15 deletions deploy/03-cilium-values-1.14.yaml
@@ -27,13 +27,13 @@ ipam:
clusterPoolIPv4MaskSize: 25

# Routing/encapsulation mode
tunnel: vxlan
tunnelProtocol: "vxlan"
routingMode: "tunnel"

# KubeProxyReplacement
kubeProxyReplacement: "strict"
kubeProxyReplacement: "true"
k8sServiceHost: ${KUBE_APISERVER_HOST}
k8sServicePort: ${KUBE_APISERVER_PORT}
kubeProxyReplacementHealthzBindAddr: "0.0.0.0:10256"

# Enable support for Gateway API
gatewayAPI:
@@ -55,15 +55,17 @@ hubble:
enabled: true
metrics:
enableOpenMetrics: true
# https://docs.cilium.io/en/stable/observability/metrics/#hubble-exported-metrics
enabled:
- dns
- drop
- tcp
- flow
- port-distribution
- icmp
- httpV2:exemplars=true;labelsContext=source_ip,source_namespace,source_workload,destination_ip,destination_namespace,destination_workload,traffic_direction
# https://docs.cilium.io/en/stable/observability/metrics/#hubble-exported-metrics
      # Remove `;query` from the `dns` line for production, as it causes high metrics cardinality
- dns:labelsContext=source_namespace,destination_namespace;query
- drop:labelsContext=source_namespace,destination_namespace
- tcp:labelsContext=source_namespace,destination_namespace
- port-distribution:labelsContext=source_namespace,destination_namespace
- icmp:labelsContext=source_namespace,destination_namespace;sourceContext=workload-name|reserved-identity;destinationContext=workload-name|reserved-identity
- flow:sourceContext=workload-name|reserved-identity;destinationContext=workload-name|reserved-identity;labelsContext=source_namespace,destination_namespace
- "httpV2:exemplars=true;labelsContext=source_ip,source_namespace,source_workload,destination_ip,destination_namespace,destination_workload,traffic_direction;sourceContext=workload-name|reserved-identity;destinationContext=workload-name|reserved-identity"
- "policy:sourceContext=app|workload-name|pod|reserved-identity;destinationContext=app|workload-name|pod|dns|reserved-identity;labelsContext=source_namespace,destination_namespace"
serviceMonitor:
enabled: true
dashboards:
@@ -147,7 +149,7 @@ operator:
enabled: true
serviceMonitor:
enabled: true
# Operator Dashboards
# Operator Dashboards
dashboards:
enabled: true
annotations:
@@ -161,6 +163,6 @@ prometheus:

# Cilium Agent Dashboards
dashboards:
enabled: true
annotations:
grafana_folder: "Cilium Agent Dashboards"
enabled: true
annotations:
grafana_folder: "Cilium Agent Dashboards"
3 changes: 2 additions & 1 deletion deploy/03-cilium.tf
@@ -1,11 +1,12 @@
module "cilium" {
source = "git::ssh://git@github.com/isovalent/terraform-k8s-cilium.git?ref=v1.3"
source = "git::ssh://git@github.com/isovalent/terraform-k8s-cilium.git?ref=v1.6.2"

depends_on = [
module.kubeadm
]

cilium_helm_release_name = "cilium"
wait_for_total_control_plane_nodes = true
cilium_helm_values_file_path = var.cilium_helm_values_file_path
cilium_helm_version = var.cilium_helm_version
cilium_helm_chart = var.cilium_helm_chart
10 changes: 9 additions & 1 deletion deploy/README.md
@@ -1,4 +1,5 @@
# Demo Infrastructure Deployment

For this demo infrastructure, some Isovalent-internal Terraform modules were used to spin up a Kubeadm-based K8s cluster and install Cilium. Nevertheless, you can achieve basically the same manually by following the [kubeadm Cluster Setup](https://gist.github.com/PhilipSchmid/e34a725d5836d21432fd10b0709a5c4a) guide. Ensure you're aware of the following things:

- kube-proxy shouldn't be installed, as we use Cilium's KubeProxyReplacement.
@@ -9,6 +10,8 @@ Also, check the following files to see how things are deployed:
- Cilium Helm values: `deploy/03-cilium-values-1.14.yaml`
- Infrastructure components: `deploy/scripts/deploy-uc.sh`

## Troubleshooting

SSH access via SSH jumphost:
```bash
ssh -i ~/.ssh/id_ed25519.pub \
@@ -20,4 +23,9 @@ ssh -i ~/.ssh/id_ed25519.pub \
-o UserKnownHostsFile=/dev/null \
rocky@<ssh-jumphost-public-ip>" \
rocky@<node-private-ip>
```
```

## Terraform Module Doc
<!-- BEGIN_TF_DOCS -->

<!-- END_TF_DOCS -->
2 changes: 1 addition & 1 deletion deploy/manifests/goldpinger/goldpinger.yaml
@@ -76,7 +76,7 @@ spec:
valueFrom:
fieldRef:
fieldPath: status.podIP
image: "docker.io/bloomberg/goldpinger:v3.7.0"
image: "docker.io/bloomberg/goldpinger:3.9.0"
imagePullPolicy: Always
securityContext:
allowPrivilegeEscalation: false
106 changes: 106 additions & 0 deletions netpols/no-host-policies/README.md
@@ -0,0 +1,106 @@
# Demo Network Policies (without Host Policies)

## Generic Hubble Configuration for Visibility
```bash
# - Since we enabled TLS for Hubble, we need to configure Hubble CLI accordingly
# - Get Hubble CLI from here: https://docs.cilium.io/en/stable/gettingstarted/hubble_setup/#install-the-hubble-client
hubble config set tls true
hubble config set tls-ca-cert-files /path/to/cilium-netpol-demo/deploy/cilium-ca-crt.pem
hubble config set tls-server-name "*.hubble-relay.cilium.io"
# Open the port-forwarding in a separate shell:
kubectl port-forward -n kube-system svc/hubble-relay 4245:443
# Finally check Hubble CLI's connection and the flows:
hubble status
```

## Infrastructure Components
```bash
# Label namespaces containing Ingress resources that should be reachable from the Nginx ingress controller with `exposed=true`:
kubectl label namespace goldpinger exposed=true
kubectl label namespace monitoring exposed=true
kubectl label namespace kube-system exposed=true # Required for Hubble-UI. Move Hubble-UI to a dedicated namespace for production: https://docs.cilium.io/en/stable/gettingstarted/hubble/#enable-the-hubble-ui ("Helm (Standalone install)" tab)

# Label namespaces with metric endpoints that should be scraped by Prometheus with `metrics=true`:
kubectl label namespace goldpinger metrics=true
kubectl label namespace kube-system metrics=true
kubectl label namespace ingress-nginx metrics=true

# Kube-system
kubectl apply -f cnp-infra-kube-system.yaml

# Nginx Ingress Controller
kubectl apply -f cnp-infra-ingress-nginx.yaml

# Cert-Manager
kubectl apply -f cnp-infra-cert-manager.yaml

# Kube Prometheus Stack
kubectl apply -f cnp-infra-monitoring-stack.yaml

# Goldpinger (only in case Goldpinger is deployed on the cluster):
kubectl apply -f cnp-infra-goldpinger.yaml

# Finally, check for wrongly dropped flows:
hubble observe -t policy-verdict -f --verdict DROPPED
```

## Cluster-wide Policies
The goal should be to deploy new user workload namespaces with only a very small set of default policies. Hence, you can leverage CiliumClusterwideNetworkPolicies (CCNPs) to predefine permissions that allow common services, such as ingress or monitoring, to communicate with new namespaces.

In addition, you can leverage CCNPs to enable Cilium's DNS visibility by applying an egress policy that uses `toPorts[*].rules.dns`.

```bash
kubectl apply -f ccnp-global-infra.yaml
```
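A rough sketch of what such a cluster-wide DNS visibility policy can look like (illustrative only — the policy name and the kube-dns label selectors are assumptions, not the exact content of `ccnp-global-infra.yaml`):

```yaml
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: dns-visibility
spec:
  description: Allow DNS lookups via kube-dns and record them for Hubble visibility
  endpointSelector: {}
  egress:
  - toEndpoints:
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s:k8s-app: kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY
      # The L7 DNS rule is what makes query names visible in `hubble observe -t l7`:
      rules:
        dns:
        - matchPattern: "*"
```

Because the policy carries an L7 DNS rule, DNS traffic is proxied through Cilium's DNS proxy, so query names and responses show up as flow metadata in Hubble.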

## User Workload
As there are already CCNPs matching all endpoints (`spec.endpointSelector: {}`) in both directions (`spec.ingress` and `spec.egress`), newly created (user) namespaces start out in a deny-all state. As a result, even namespace-internal traffic is denied until a new `allow-within-namespace` CiliumNetworkPolicy (CNP) is created to allow this traffic:

```yaml
---
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: allow-within-namespace
namespace: my-namespace-xy
spec:
description: Allow NS internal traffic, block everything else
endpointSelector: {}
ingress:
- fromEndpoints:
- {}
egress:
- toEndpoints:
- {}
```
Have a look at the following template that could be used for new user workload namespaces.
- `cnp-user-template.yaml`:
- Add the namespace label `exposed: "true"` in case Nginx ingress should be able to serve Ingresses from this namespace.
- Add the namespace label `metrics: "true"` in case Prometheus should be able to scrape metrics endpoints from this namespace.
- Allow all ingress and egress traffic **within** the namespace
- Optional: Additional application specific CNPs to explicitly allow connections to and from namespace-external sources/destinations.

Check out `demo-app-podinfo.sh` to simulate the deployment of a new user workload.

## Troubleshooting
To troubleshoot connectivity issues or false-positive denies, use Hubble UI and especially the Hubble CLI. Hubble can either be used directly within a Cilium agent pod (which only sees node-local traffic) or, even more powerfully, via the dedicated [Hubble CLI](https://docs.cilium.io/en/stable/gettingstarted/hubble_setup/#install-the-hubble-client). The Hubble CLI then needs to point to the Hubble Relay service, which aggregates the flows from all Cilium agents/nodes.

```bash
# Temporarily expose the `hubble-relay` ClusterIP service via `kubectl port-forward` (blocking call, separate shell):
kubectl port-forward -n kube-system svc/hubble-relay 4245:443

# Check for dropped traffic:
hubble observe -t policy-verdict -f --verdict DROPPED
```

Improve your Hubble CLI outputs even further by using additional filtering constraints (issue `hubble observe --help` to see all available options):
- `--ip` / `--to-ip` / `--from-ip`
- `-n` / `--namespace` / `--to-namespace` / `--from-namespace`
- `--port` / `--to-pod` / `--from-pod`
- `--node-name`
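For example, such constraints can be combined to narrow down a drop investigation (the namespace, pod, and port values below are placeholders):

```bash
# Only show dropped flows originating from a specific namespace:
hubble observe -t policy-verdict -f --verdict DROPPED --from-namespace my-namespace-xy

# Narrow down further to a single pod and destination port:
hubble observe -t policy-verdict -f --verdict DROPPED \
  --from-pod my-namespace-xy/podinfo --port 9898
```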

## Sources
- https://docs.cilium.io/en/stable/gettingstarted/hubble_setup/#install-the-hubble-client
- https://docs.cilium.io/en/stable/gettingstarted/hubble_cli/#hubble-cli
@@ -43,7 +43,6 @@ spec:
- fields:
- type: 8
family: IPv4
# Heads-up: In case you remove the metrics label again from the NS, check https://github.com/cilium/cilium/issues/27626.
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
@@ -60,7 +59,6 @@ spec:
- matchLabels:
k8s:io.kubernetes.pod.namespace: monitoring
k8s:app.kubernetes.io/name: prometheus
# Heads-up: In case you remove the exposed label again from the NS, check https://github.com/cilium/cilium/issues/27626.
---
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
@@ -76,4 +74,4 @@ spec:
- fromEndpoints:
- matchLabels:
k8s:io.kubernetes.pod.namespace: ingress-nginx
k8s:app.kubernetes.io/name: ingress-nginx
k8s:app.kubernetes.io/name: ingress-nginx
@@ -33,6 +33,7 @@ spec:
- ports:
- port: "6443"
protocol: TCP
# Only required in case a Let's Encrypt (Cluster)Issuer is used:
---
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
@@ -51,6 +52,7 @@ spec:
- ports:
- port: "443"
protocol: TCP
# Only required in case a Let's Encrypt (Cluster)Issuer is used:
---
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
@@ -87,4 +89,4 @@ spec:
toPorts:
- ports:
- port: "10250"
protocol: TCP
protocol: TCP
