Add an OVN Observability API enhancement #1693

npinaeva · 2024-10-07T13:52:15Z

OVN Observability is a feature that provides visibility for ovn-kubernetes-managed network by using sampling mechanism.
That means, that network packets generate samples with user-friendly description to give more information about what and why is happening in the network.
In openshift, this feature is integrated with the Network Observability Operator.

The purpose of this enhancement is to define an API to configure OVN observability in an openshift cluster.

cc @dceara @msherif1234 @jotak

jotak · 2024-10-08T06:53:05Z

/cc

enhancements/network/ovn-observability-api.md

dceara

FWIW, this looks good to me, especially for the Debugging use case (just a small nit).

Thanks!

enhancements/network/ovn-observability-api.md

dceara · 2024-11-07T15:30:56Z

enhancements/network/ovn-observability-api.md

+
+Filtering for debugging use case may be tricky as explained in "Observability filtering for debugging" section.
+But it is important to implement, as OVN implementation works in a way, where performance optimizations can only be
+applied when there is only 1 `ObservabilityConfig` for a given feature. That means, adding a second `ObservabilityConfig`


Unfortunately core OVN documentation doesn't mention the performance impact with multiple collectors attached to a single Sample/ACL but it should. I'll try to post a patch in ovn-org/ovn for that. If that lands in time it would be good to link to the core OVN documentation from here.

abhat · 2024-11-21T19:09:50Z

enhancements/network/ovn-observability-api.md

+In the future more observability features, like egressIP and services, but also networking-specific features 
+like logical port ingress/egress will be added.
+
+Every feature may set a sampling probability, that defines how often a packet will be sampled, in percent. 100% means every packet


Is per feature granularity sufficient? Should it be per feature per namespace instead? If we were to extend this to UDNs, I can imagine we want to have some per tenant control on the observability config.

do you expect an admin to disable observability in some namespaces, but enable in others?
We will have more granularity for debugging, but I am not sure if it is useful for regular cluster observability

jotak

Looks good to me,
FWIW with the tech preview feature, netobserv can generate fine grained metrics such as per-netpol /namespace / workload violations , so users are able to create alerts based on that

openshift-ci · 2024-11-27T14:05:49Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dceara, jotak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~enhancements/network/OWNERS~~ [jotak]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

openshift-ci · 2024-12-02T15:57:31Z

@npinaeva: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

trozet · 2024-12-10T15:28:32Z

enhancements/network/ovn-observability-api.md

+	// +kubebuilder:validation:Required
+	// +kubebuilder:validation:Minimum=1
+	// +required
+	CollectorID int32 `json:"collectorID"`


what is the value in the user providing the collector ID? Why not let OVNK choose a random value and insert it? Is it due to coordination with netobserv?

this ID is used by netobserv and the debug script, to say which observConfig we want to collect. Sometimes you may want 2 different ObservabilityConfigs (e.g. for netobserv and for debug script), then you need to distinguish them. As debug script doesn't have kube-api access it needs to know the ID

So when this CR is created, couldn't OVNK assign an ID, rather than the user having to provide one?

we could do that, but it will require a centralized (aka cluster-manager) watcher to assign IDs. Then we will need to report that ID back via status, so that it can be used with debug script.
I am not sure if this is worth an extra centralized component to automate this part, wdyt?

trozet · 2024-12-10T15:29:49Z

enhancements/network/ovn-observability-api.md

+// Currently, it supports node and namespace based filtering.
+// If both node and namespace filters are specifies, they are logically ANDed.
+// +kubebuilder:validation:MinProperties=1
+type Filter struct {


would it make sense to make the filter be on a per feature (that supports it) level?

yeah potentially. The idea is that if you have a problem on a given node or in a given namespace(s), you want to enable 100% sampling for all features on that node and/or namespace. So it is potentially more convenient to specify it once per config vs per every feature.

hmm but the sampling configuration is per feature. So I guess with this design if I want to sample a specific feature on 1 node, and another feature on another node, I would make 2 CRs?

yep, the problem here is that I don't know what kind of filtering will actually be useful, and I would like to avoid over-complicating the API if possible.
I could move namespace-based filter under FeatureConfig, but I think nodeSelector still makes more sense on the config-level (we could add node-based filtering to per-feature filter later if needed). Do you think it will be good enough for the first version?

trozet · 2024-12-10T17:33:30Z

enhancements/network/ovn-observability-api.md

+	// It only applies to the namespaced features, currently that includes NetworkPolicy and EgressFirewall.
+	// +kubebuilder:MinItems=1
+	// +optional
+	Namespaces *[]string `json:"namespaces,omitempty"`


maybe this is more of a future topic for discussion, but I was just thinking...wouldn't another good filter be on UDN? That way i know my users in network A are having problems i just want to turn on sampling for netpol in network A, which would then translate to on the backend to every network policy for namespaces where that network is active?

Yeah I think it is technically doable, and also can be considered an extension for config- or feature-wide filter

openshift-ci bot requested review from dougbtv and trozet October 7, 2024 13:54

openshift-ci bot requested a review from jotak October 8, 2024 06:53

npinaeva force-pushed the ovn-observ branch from e4c1b97 to 1c10806 Compare October 8, 2024 10:23

jotak reviewed Oct 16, 2024

View reviewed changes

enhancements/network/ovn-observability-api.md Show resolved Hide resolved

jotak reviewed Oct 16, 2024

View reviewed changes

enhancements/network/ovn-observability-api.md Outdated Show resolved Hide resolved

dceara approved these changes Nov 7, 2024

View reviewed changes

npinaeva force-pushed the ovn-observ branch from 1c10806 to cf7e598 Compare November 19, 2024 10:03

abhat reviewed Nov 21, 2024

View reviewed changes

jotak approved these changes Nov 27, 2024

View reviewed changes

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 27, 2024

Add an OVN Observability API enhancement

08b2f70

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>

npinaeva force-pushed the ovn-observ branch from cf7e598 to 08b2f70 Compare December 2, 2024 15:41

trozet reviewed Dec 10, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add an OVN Observability API enhancement #1693

Add an OVN Observability API enhancement #1693

npinaeva commented Oct 7, 2024

jotak commented Oct 8, 2024

dceara left a comment

dceara Nov 7, 2024

abhat Nov 21, 2024

npinaeva Nov 26, 2024

jotak left a comment

openshift-ci bot commented Nov 27, 2024

openshift-ci bot commented Dec 2, 2024

trozet Dec 10, 2024

npinaeva Dec 10, 2024

trozet Dec 10, 2024

npinaeva Dec 11, 2024

trozet Dec 10, 2024

npinaeva Dec 10, 2024

trozet Dec 10, 2024

npinaeva Dec 11, 2024

trozet Dec 10, 2024

npinaeva Dec 11, 2024

Add an OVN Observability API enhancement #1693

Are you sure you want to change the base?

Add an OVN Observability API enhancement #1693

Conversation

npinaeva commented Oct 7, 2024

jotak commented Oct 8, 2024

dceara left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jotak left a comment

Choose a reason for hiding this comment

openshift-ci bot commented Nov 27, 2024

openshift-ci bot commented Dec 2, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment