diff --git a/README.md b/README.md
index d3cccc7..3ab4e47 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,46 @@
 # Telemetry Controller
 
-Telemetry Controller collects, routes and forwards telemetry data (logs, metrics and traces) from Kubernetes clusters
+Telemetry Controller collects, routes, and forwards telemetry data (logs, metrics and traces) from Kubernetes clusters
 supporting multi-tenancy out of the box.
 
+The Telemetry Controller provides isolation and access control for telemetry data, similar to what Kubernetes provides for pods, secrets, and other resources. It provides an opinionated, convenient, and robust multi-tenant API on top of OpenTelemetry, and introduces new resources that give granular control over the shared data, while hiding the complexity of setting up and maintaining OpenTelemetry Collector manually.
+
 ## Description
 
-Telemetry-controller can be configured using Custom Resources to set up an opinionated Opentelemetry Collector configuration to route log messages based on rules defined as a Tenant -> Subscription relation map.
+Telemetry Controller can be configured using Custom Resources to set up an [opinionated Opentelemetry Collector](#under-the-hood) configuration to route log messages based on rules defined as a Tenant -> Subscription relation map. That way:
+
+- Administrators can define a **collector** and **tenants** to provide isolation and access control for telemetry data. These are cluster scoped resources.
+- Users can create **subscriptions** to select telemetry data streams that only their tenant can access.
+- Users can create or refer the available **outputs** in their **subscriptions** to route and transport data. That way users can configure what they want to collect and where they want to send it - within their tenant’s scope.
+
+![Telemetry Controller flow diagram](docs/overview.svg)
+
+Telemetry Controller can collect container logs that come from stdout/stderr and are written to the host filesystem by the container runtime.
+
+### Collector
+
+Collectors specify global settings for the OTEL Collector DaemonSet, and a `tenantSelector` that lists the Tenants that the collector should pick up. The collector also attaches metadata to the telemetry data sources: for Kubernetes logs, it fetches additional metadata like pod labels and adds those as attributes to log records.
+
+### Tenants
+
+Typically, a tenant is a set of Kubernetes namespaces, which is a best practice for managing multi-tenant workloads inside a single cluster. Tenant resources specify:
+
+- `subscriptionNamespaceSelectors` for namespaces that select subscriptions created by the tenant users, and
+- `logSourceNamespaceSelectors` that specify the namespaces where the logs are produced (that are also the concern of the tenant users).
+
+In trivial use cases these two label selectors are the same.
+
+The Tenant is actually a routing rule that helps to make sure that telemetry data is only accessible to a given Subscription if it matches the policies set by the administrator.
+
+### Subscriptions
+
+Tenant users can define their Subscriptions in the namespace(s) of their Tenants. Subscriptions can select from the telemetry data (that is already filtered as part of the Tenant definition) and set Output endpoints where the data is forwarded. Such an endpoint can be:
+
+- an aggregator, for example, [Logging operator](https://kube-logging.dev/),
+- a remote telemetry backend, for example, Loki, Jaeger, or Prometheus, or
+- a managed service provider, for example, Splunk or Sumo Logic.
+
+![Telemetry Controller CR flow](docs/telemetry-controller-flow.png)
 
 ## Getting Started
 
@@ -70,18 +105,18 @@ make deploy IMG=telemetry-controller:latest
 ### Example setup
 
 You can deploy the example configuration provided as part of the docs. This will deploy a demo pipeline with one tenant, two subscriptions, and an OpenObserve instance.
-Deploying Openobserve is an optional, but recommended step, logs can be forwarded to any OTLP endpoint. Openobserve provides a UI to visualize the ingested logstream.
+Deploying OpenObserve is an optional, but recommended step, logs can be forwarded to any OTLP endpoint. OpenObserve provides a UI to visualize the ingested logstream.
 
 ```sh
-# Deploy Openobserve
+# Deploy OpenObserve
 kubectl apply -f docs/examples/simple-demo/openobserve.yaml
-# Set up portforwarding for Openobserve UI
+# Set up portforwarding for OpenObserve UI
 kubectl -n openobserve port-forward svc/openobserve 5080:5080 &
 ```
 
 Open the UI at `localhost:5080`, navigate to the `Ingestion/OTEL Collector` tab, and copy the authorization token as seen on the screenshot.
 
-![Openobserve auth](docs/assets/openobserve-auth.png)
+![OpenObserve auth](docs/assets/openobserve-auth.png)
 
 Paste this token to the example manifests:
 
@@ -100,18 +135,20 @@ Create a workload, which will generate logs for the pipeline:
 helm install --wait --create-namespace --namespace example-tenant-ns --generate-name oci://ghcr.io/kube-logging/helm-charts/log-generator
 ```
 
-Open the Openobserve UI and inspect the generated log messages:
+Open the OpenObserve UI and inspect the generated log messages:
 
-Set up portforwarding for Openobserve UI
+Set up portforwarding for OpenObserve UI
 
 ```sh
 kubectl -n openobserve port-forward svc/openobserve 5080:5080
 ```
 
-![Openobserve logs](docs/assets/openobserve-logs.png)
+![OpenObserve logs](docs/assets/openobserve-logs.png)
 
 ### Sending logs to logging-operator (example)
 
+(For a more detailed description see our [Sending data to the Logging Operator](https://axoflow.com/kubernetes-logging-telemetry-controller-logging-operator/) blog post.)
+
 Install dependencies (cert-manager and opentelemetry-operator):
 
 ```sh
@@ -154,6 +191,32 @@ Apply the provided example resource for telemetry-controller: [telemetry-control
 kubectl apply -f telemetry-controller.yaml
 ```
 
+## Under the hood
+
+Telemetry Controller uses a [custom OpenTelemetry Collector distribution](https://github.com/axoflow/axoflow-otel-collector-releases) as its agent. This distribution is and will be compatible with the upstream OpenTelemetry Collector distribution regarding core features, but:
+
+- We reduce the footprint of the final image by removing unnecessary components. This reduces not just the size, but also the vulnerability surface of the collector.
+- We include additional components with features not available in the upstream OpenTelemetry Collector, for example, to provide a richer set of metrics.
+- We use the OpenTelemetry Operator as the primary controller to implicitly manage the collector.
+
+OpenTelemetry Collector runs as a DaemonSet, mounting and reading the container log files present on the node. During the initial parsing of the log entries, we extract the pod name, pod namespace, and some other metadata. This allows us to associate the log entry to the respective source pod through the Kubernetes API, and to fetch metadata which cannot be extracted from the message alone.
+
+## Support
+
+If you encounter problems while using the Telemetry Controller, [open an issue](https://github.com/kube-logging/telemetry-controller/issues) or talk to us on the [#logging-operator Discord channel](https://discord.gg/eAcqmAVU2u).
+
+## Further info
+
+For further information, use cases, and tutorials, read our [blog posts about Telemetry Controller](https://axoflow.com/tag/telemetry-controller/), for example:
+
+- [Introduction and getting started with Telemetry Controller](https://axoflow.com/reinvent-kubernetes-logging-with-telemetry-controller/)
+- [How to send Kubernetes logs to Loki](https://axoflow.com/send-kubernetes-logs-to-loki-with-telemetry-controller/)
+- [Sending data to the Logging Operator](https://axoflow.com/kubernetes-logging-telemetry-controller-logging-operator/)
+
+We also give talks about Telemetry Controller at various open source conferences, for example, at Open Source Summit Europe 2024:
+
+
+
 ## Contributing
 
 If you find this project useful, help us:
diff --git a/docs/overview.svg b/docs/overview.svg
new file mode 100644
index 0000000..3ac7a61
--- /dev/null
+++ b/docs/overview.svg
@@ -0,0 +1,577 @@
[docs/overview.svg: 577 added lines of SVG markup not preserved. Surviving text labels: Kubernetes (A/app1, A/app2, A/other, B/app1, B/other, C/other); Collector; Tenant T1, Namespace: A, Subscription S1, Output-01, Output-02; Tenant T2, Namespace: B, Subscription S1, Output-01 (B/appX); Unmatched (A/other, B/other, C/other).]
diff --git a/docs/telemetry-controller-flow.png b/docs/telemetry-controller-flow.png
new file mode 100644
index 0000000..43684a0
Binary files /dev/null and b/docs/telemetry-controller-flow.png differ