docs: Improved and consistent diagrams (kyma-project#1254)
Co-authored-by: Stanislav Khalash <stanislav.khalash@gmail.com>
Co-authored-by: Nina Hingerl <76950046+NHingerl@users.noreply.github.com>
3 people authored Jul 25, 2024
1 parent 304301c commit 4db1981
Showing 20 changed files with 6,370 additions and 614 deletions.
12 changes: 10 additions & 2 deletions docs/user/01-manager.md
The Telemetry module includes Telemetry Manager, a Kubernetes [operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that's described by a custom resource of type Telemetry. Telemetry Manager has the following tasks:

1. Watch for the user-created Kubernetes resources LogPipeline, TracePipeline, and MetricPipeline. In these resources, you specify what data of a signal type to collect and where to ship it.
2. Watch the module configuration for changes and keep the module status in sync with it.
3. Manage the lifecycle of the self monitor and the user-configured agents and gateways.
   For example, the Fluent Bit DaemonSet is deployed as the log agent only if you defined a LogPipeline resource.

![Manager](assets/manager-resources.drawio.svg)
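
For orientation, a minimal Telemetry resource might look like the following sketch; the apiVersion, name, and namespace are assumptions based on typical module defaults, not something this page defines:

```yaml
# Hedged sketch: the module usually ships a default Telemetry resource;
# apiVersion and metadata below are assumptions for illustration.
apiVersion: operator.kyma-project.io/v1alpha1
kind: Telemetry
metadata:
  name: default
  namespace: kyma-system
```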

### Self Monitor

The Telemetry module contains a self monitor, based on [Prometheus](https://prometheus.io/), to collect and evaluate metrics from the managed gateways and agents. Telemetry Manager retrieves the current pipeline health from the self monitor and adjusts the status of the pipeline resources and the module status.

![Self-Monitor](assets/manager-arch.drawio.svg)
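
For example, the health that the self monitor reports might surface in a pipeline status as a condition like the following sketch; the condition type `TelemetryFlowHealthy` appears later in this document, while the reason and message values here are assumptions:

```yaml
status:
  conditions:
    - type: TelemetryFlowHealthy
      status: "True"
      reason: FlowHealthy        # assumption: actual reason values vary by situation
      message: No problems detected in the telemetry flow
```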

## Module Configuration and Status

38 changes: 21 additions & 17 deletions docs/user/02-logs.md
The Telemetry module provides the [Fluent Bit](https://fluentbit.io/) log agent for the collection and shipment of application logs of any container running in the Kyma runtime.

You can configure the log agent with external systems using runtime configuration with a dedicated Kubernetes API (CRD) named `LogPipeline`. With the LogPipeline's HTTP output, you can natively integrate with vendors that support this output, or with any vendor using a [Fluentd integration](https://medium.com/hepsiburadatech/fluent-logging-architecture-fluent-bit-fluentd-elasticsearch-ca4a898e28aa).
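
As a hedged sketch (the host, port, and URI are placeholder assumptions; check the LogPipeline parameter reference for the authoritative fields), a LogPipeline with an HTTP output might look like this:

```yaml
apiVersion: telemetry.kyma-project.io/v1alpha1
kind: LogPipeline
metadata:
  name: http-backend
spec:
  output:
    http:
      host:
        value: backend.example.com   # assumption: your vendor's ingest host
      port: "443"
      uri: "/"
```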

The feature is optional. If you don't want to use the Logs feature, simply don't set up a LogPipeline.

<!--- custom output/unsupported mode is not part of Help Portal docs --->
If you want more flexibility than provided by the proprietary protocol, you can run the agent in the [unsupported mode](#unsupported-mode), using the full vendor-specific output options of Fluent Bit. If you need advanced configuration options, you can also bring your own log agent.
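
For illustration, a hedged sketch of the unsupported mode follows, using a custom Fluent Bit output section; the plugin and its parameters are assumptions, not a recommendation:

```yaml
spec:
  output:
    custom: |
      name   loki                 # assumption: any vendor-specific Fluent Bit output plugin
      host   loki.example.com
      port   3100
```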


## Architecture

In the Kyma cluster, the Telemetry module provides a DaemonSet of [Fluent Bit](https://fluentbit.io/) acting as an agent. The agent tails container logs from the Kubernetes container runtime and ships them to a backend.

![Architecture](./assets/logs-arch.drawio.svg)

1. Container logs are stored by the Kubernetes container runtime under the `/var/log` directory and its subdirectories.
2. Fluent Bit runs as a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) (one instance per Node), detects any new log files in the folder, and tails them using a filesystem buffer for reliability.
3. Fluent Bit discovers additional Pod metadata, such as Pod annotations and labels (see the sketch after this list).
4. Telemetry Manager configures Fluent Bit with your output configuration, observes the log flow, and reports problems in the LogPipeline status.
5. The log agent sends the data to the observability system that's specified in your `LogPipeline` resource: either within the Kyma cluster or, if authentication is set up, to an external observability backend. You can use the integration with HTTP to integrate a system directly or with an additional Fluentd installation.
6. To analyze and visualize your logs, access the internal or external observability system.
7. The self monitor observes the log flow to the backend and reports problems in the LogPipeline status.
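
To make the metadata enrichment in step 3 concrete, a hedged sketch of an enriched log record follows; the attribute names mirror Fluent Bit's common Kubernetes metadata shape and are assumptions here:

```yaml
log: "GET /healthz 200"
kubernetes:
  namespace_name: default
  pod_name: my-app-7d4b95c9b-x2vlp   # assumption: illustrative Pod name
  labels:
    app: my-app
  annotations:
    sidecar.istio.io/inject: "true"
```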

### Telemetry Manager

The LogPipeline resource is watched by Telemetry Manager, which is responsible for generating the custom parts of the Fluent Bit configuration.

![Manager resources](./assets/logs-resources.drawio.svg)

1. Telemetry Manager watches all LogPipeline resources and related Secrets.
2. Furthermore, Telemetry Manager takes care of the full lifecycle of the Fluent Bit DaemonSet itself. The agent is deployed only if you defined a LogPipeline.
3. Whenever the configuration changes, Telemetry Manager validates the configuration (with a [validating webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/)) and generates a new configuration for the Fluent Bit DaemonSet, split across several ConfigMaps for the different aspects of the configuration.
4. Referenced Secrets are copied into one Secret that is also mounted to the DaemonSet (see the sketch after this list).
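
To illustrate step 4, a hedged sketch of a LogPipeline output value that references a Secret follows; the Secret name, namespace, and key are assumptions:

```yaml
spec:
  output:
    http:
      host:
        valueFrom:
          secretKeyRef:
            name: backend-credentials   # assumption: your Secret's name
            namespace: default
            key: host
```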

### Log Agent

If a LogPipeline is defined, a DaemonSet is deployed to act as the agent. The agent is based on [Fluent Bit](https://fluentbit.io/) and collects the application logs provided by the Kubernetes container runtime. The agent sends all data to the configured backend.

### Pipelines
<!--- Pipelines is not part of Help Portal docs --->
This approach ensures reliable buffer management and isolation of pipelines, while keeping flexibility on customizations.


## Setting up a LogPipeline

In the following steps, you can see how to construct and deploy a typical LogPipeline. Learn more about the available [parameters and attributes](resources/02-logpipeline.md).
34 changes: 19 additions & 15 deletions docs/user/03-traces.md
The following diagram shows how distributed tracing helps to track the request path:

![Distributed tracing](./assets/traces-intro.drawio.svg)

The Telemetry module provides a trace gateway for the shipment of traces of any container running in the Kyma runtime.

You can configure the trace gateway with external systems using runtime configuration with a dedicated Kubernetes API (CRD) named TracePipeline.
The Trace feature is optional. If you don't want to use it, simply don't set up a TracePipeline.
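
As a hedged sketch (the endpoint is a placeholder assumption; see the TracePipeline parameter reference for the authoritative fields), a minimal TracePipeline might look like this:

```yaml
apiVersion: telemetry.kyma-project.io/v1alpha1
kind: TracePipeline
metadata:
  name: backend
spec:
  output:
    otlp:
      endpoint:
        value: https://backend.example.com:4317   # assumption: your OTLP backend
```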

## Prerequisites

For the recording of a distributed trace, every involved component must propagate at least the trace context. For details, see [Trace Context](https://www.w3.org/TR/trace-context/#problem-statement).
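
With the W3C Trace Context protocol, this context travels in a `traceparent` HTTP header of the form `00-<trace-id>-<parent-span-id>-<trace-flags>`; for example, `traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01` (an illustrative value following the specification's format).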

In the Kyma cluster, the Telemetry module provides a central deployment of an [OTel Collector](https://opentelemetry.io/docs/collector/) acting as a gateway. The gateway exposes endpoints to which all Kyma modules and users’ applications send their trace data.


![Architecture](./assets/traces-arch.drawio.svg)

1. An end-to-end request is triggered and propagates across the distributed application. Every involved component propagates the trace context using the [W3C Trace Context](https://www.w3.org/TR/trace-context/) protocol.
2. After contributing a new span to the trace, the involved components send the related span data to the trace gateway using the `telemetry-otlp-traces` service (see the endpoint sketch after this list). The communication happens based on the [OpenTelemetry Protocol (OTLP)](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md), using either GRPC or HTTP.
3. Istio sends the related span data to the trace gateway as well.
4. The trace gateway discovers metadata that's typical for sources running on Kubernetes, like Pod identifiers, and then enriches the span data with that metadata.
5. Telemetry Manager configures the gateway according to the `TracePipeline` resource, including the target backend for the trace gateway. Also, it observes the trace flow to the backend and reports problems in the `TracePipeline` status.
6. The trace gateway sends the data to the observability system that's specified in your `TracePipeline` resource - either within the Kyma cluster, or, if authentication is set up, to an external observability backend.
7. You can analyze the trace data with your preferred backend system.
8. The self monitor observes the trace flow to the backend and reports problems in the TracePipeline status.
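
To make step 2 concrete, a hedged sketch of pointing an application's OTLP exporter at the gateway follows; the namespace and port are assumptions based on common defaults (4317 for GRPC, 4318 for HTTP):

```yaml
# Excerpt from a Deployment's container spec (sketch):
env:
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://telemetry-otlp-traces.kyma-system:4317   # assumption: service in kyma-system
```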

### Telemetry Manager

The TracePipeline resource is watched by Telemetry Manager, which is responsible for generating the custom parts of the OTel Collector configuration.

![Manager resources](./assets/traces-resources.drawio.svg)

1. Telemetry Manager watches all TracePipeline resources and related Secrets.
2. Furthermore, Telemetry Manager takes care of the full lifecycle of the OTel Collector Deployment itself. The collector is deployed only if you defined a TracePipeline.
3. Whenever the configuration changes, it validates the configuration and generates a new configuration for the OTel Collector, stored in a generated ConfigMap.
4. Referenced Secrets are copied into one Secret that is also mounted to the OTel Collector.

### Trace Gateway

In a Kyma cluster, the trace gateway is the central component to which all components can send their individual spans. The gateway collects, enriches, and dispatches the data to the configured backend. For more information, see [Telemetry Gateways](./gateways.md).

## Setting up a TracePipeline

If you just want to see traces for one particular request, you can manually force sampling.
**Symptom**:

- In the TracePipeline status, the `TelemetryFlowHealthy` condition has status **GatewayThrottling**.
- Also, your application might have error logs indicating a refusal when sending traces to the gateway.

**Cause**: Gateway cannot receive spans at the given rate.
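
A common remedy is to scale out the gateway. The following is a hedged sketch, assuming the Telemetry resource supports static gateway scaling; check the module's resource reference for the exact fields:

```yaml
apiVersion: operator.kyma-project.io/v1alpha1
kind: Telemetry
metadata:
  name: default
  namespace: kyma-system
spec:
  trace:
    gateway:
      scaling:
        type: Static
        static:
          replicas: 4   # assumption: raise the replica count to absorb the span rate
```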
