-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTAGENT-254 Add support for enhanced RBAC permissions for otel-agent #1693
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me, I think you have a typo, but that's about it.
charts/datadog/README.md
Outdated
@@ -803,6 +803,8 @@ helm install <RELEASE_NAME> \ | |||
| datadog.otelCollector.config | string | `nil` | OTel collector configuration | | |||
| datadog.otelCollector.enabled | bool | `false` | Enable the OTel Collector | | |||
| datadog.otelCollector.ports | list | `[{"containerPort":"4317","name":"otel-grpc"},{"containerPort":"4318","name":"otel-http"}]` | Ports that OTel Collector is listening | | |||
| datadog.otelCollector.rbac.create | bool | `true` | If true, create RBAC resources for OTel Collector | | |||
| datadog.otelCollector.rbac.rules | list | `[]` | A set of additional RBAC rules for OTel Collector | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rules
or roles
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's rules
. A single ClusterRole should be enough for otel-agent
. If user have components that requires additional RBAC permissions, they can add these as rules to the ClusterRole via datadog.otelCollector.rbac.rules
param; no need to create and bind additional ClusterRole(s).
Some of the users of opentelemetry-helm-charts might be familiar with this approach. It's very similar to what they use for the components that require custom RBAC permissions (components that aren't covered by presets) -- user can explicitly list additional rules that will be added to ClutserRole:
https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-collector/values.yaml#L226-L237
charts/datadog/ci/agent-otel-collector-with-rbac-custom-rules-values.yaml
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,40 @@ | |||
{{- if and .Values.agents.rbac.create (eq (include "should-enable-otel-agent" .) "true") .Values.datadog.otelCollector.rbac.create -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewers: should-enable-otel-agent
defined here
charts/datadog/README.md
Outdated
@@ -803,6 +803,8 @@ helm install <RELEASE_NAME> \ | |||
| datadog.otelCollector.config | string | `nil` | OTel collector configuration | | |||
| datadog.otelCollector.enabled | bool | `false` | Enable the OTel Collector | | |||
| datadog.otelCollector.ports | list | `[{"containerPort":"4317","name":"otel-grpc"},{"containerPort":"4318","name":"otel-http"}]` | Ports that OTel Collector is listening | | |||
| datadog.otelCollector.rbac.create | bool | `true` | If true, create RBAC resources for OTel Collector | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unclear from name / description that it's for k8sattributes
processor. The field that creates the rbac in the otel helm charts is named kubernetesAttributes (although that one also adds the processor to pipelines). I think either changing the name or description to make this clearer would be good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've change description, hope it's clearer now:
datadog.otelCollector.rbac.create
controls creation of additional ClusterRole forotel-agent
required by Kubernetes Attributes processor.datadog.otelCollector.rbac.rules
can be used to support additional RBAC permissions required by OTel components that are not included by default withotel-agent
.
I prefer not to tie the params specifically to k8sattributes
processor for 2 reasons:
- As you mentioned, we're not replicating the preset behavior from OTel Helm chart (we don't automatically add processors to pipelines, we only provide the necessary RBAC permissions if the customer decides to use
k8sattributes
processor) - Customers can re-use these params for other components we don't currently support. For example, although Datadog's Kubernetes integration already collects K8s events, metrics, and logs out-of-the-box, come customers may chose to continue using
k8sobjects
receiver. In that case, they will be able to usedatadog.otelCollector.rbac.rules
to define any additional permissions required to collect collect Kubernetes events.
By keeping the parameter names more general, we can cover a range of potential OTel Collector use cases instead of tying everything to just one processor.
c9ff915
to
a9632ef
Compare
63669c7
to
8338179
Compare
8338179
to
c27e362
Compare
What this PR does / why we need it:
Add support for additional RBAC permissions required by
k8sattributes
processor.RBAC Permission Issues in otel-agent
The OpenTelemetry Collector’s k8sattributes processor requires specific RBAC permissions to connect to the Kubernetes API server in order to enrich telemetry data with Kubernetes metadata. However, the default Service Account created by the Helm chart doesn't have these permissions: node agent (deployed as DaemonSet) retrieves pod information from the Kubelet
/pod
endpoints. This approach avoids overloading the Kubernetes API server in large clusters, but leads to errors in otel-agent when k8sattributes enabled: (Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User <serviceAccount> cannot list resource "pods" in API group "" at the cluster scope
)Proposed Solution
Create the necessary RBAC ClusterRole and ClusterRoleBinding for the
k8sattributes
processor. The creation logic will be tied to a new boolean parameter –datadog.otelCollector.rbac.create
, which, when set totrue
, enables the chart to inspect the OTel configuration. If anyk8sattributes
processor is set topassthrough: false
(i.e., it needs to call the Kubernetes API), the Helm chart will generate the required RBAC resources.Implementation Details
datadog.otelCollector.rbac.create
(true
|false
).agents.rbac.create: true
anddatadog.otelCollector.rbac.create: true
, the Helm chart checks the OTel Collector configuration for anyk8sattributes
processors configured withpassthrough
option disabled (either by omitting it or by explicitly setting itpassthrough: false
).datadog.otelCollector.rbac.rules: []
agents.rbac.create:true
anddatadog.otelCollector.rbac.create:true
, anddatadog.otelCollector.rbac.rules
provided, these rules are added to theotel-agent
ClusterRole.Special notes for your reviewer:
Why Cluster Role for additional RBAC permissions?
With just a role binding, the
k8sattributes
processor cannot query metadata such as labels and annotations from k8snodes
andnamespaces
which are cluster-scoped objects. This also means that the processor cannot set the value fork8s.cluster.uid
attribute if enabled, since thek8s.cluster.uid
attribute is set to the uid of the namespacekube-system
which is not queryable with namespaced rbac.Why introduce a separate
datadog.otelCollector.rbac
parameter vs reusingagents.rbac
?Note: technically, any ClusterRole and ClusterRoleBinding introduced under
datadog.otelCollector.rbac
will apply to the entire pod housing the Datadog Agent with Embedded OTel Collector. While this may not perfectly map to the container-based naming convention, it provides a convenient and explicit toggle for customers. Existing precedent – datadog.secretBackend.roles parameter: scoped to Secret Backend feature; changes affect the default Service Account, and thus additional permissions available to all containers in the node agent pod.Checklist
.github/helm-docs.sh
)CHANGELOG.md
has been updatedREADME.md