metrics: expose number of cached namespaces per subject #91

Open · wants to merge 3 commits into main

@sadlerap (Collaborator) commented Mar 3, 2025

This series exposes, per subject, the number of namespaces held in our cache. This gives us better visibility into how the cache is being used in live deployments.

These metrics make our dashboards more informative. For instance, we can now tell whether any subjects have a large number of cached namespaces, which indicates broad access across the cluster. We can also estimate how large the cache is by summing the metric across all label values.

This series also does a little refactoring of the unit tests associated with the cache.
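
To illustrate the two dashboard uses described above, here is a minimal sketch that models the per-subject gauge values as a plain map. It does not use the Prometheus client API; the `cachedNamespaces` type, the helper names, and the sample subjects are hypothetical. On the dashboard side, the total would be comparable to a PromQL `sum(...)` aggregation over the gauge.

```go
package main

import (
	"fmt"
	"sort"
)

// cachedNamespaces is a hypothetical stand-in for the per-subject gauge:
// the key identifies a subject, the value is its number of cached namespaces.
type cachedNamespaces map[string]int

// total sums the gauge over every subject, ignoring labels,
// which approximates the overall size of the cache.
func total(c cachedNamespaces) int {
	sum := 0
	for _, n := range c {
		sum += n
	}
	return sum
}

// broadAccess returns the subjects whose cached namespace count is at
// least threshold, sorted by name for stable output.
func broadAccess(c cachedNamespaces, threshold int) []string {
	var out []string
	for subject, n := range c {
		if n >= threshold {
			out = append(out, subject)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	c := cachedNamespaces{"alice": 3, "bob": 120, "builder": 45}
	fmt.Println("total cached namespaces:", total(c))
	fmt.Println("subjects with broad access:", broadAccess(c, 40))
}
```

The threshold of 40 is arbitrary; a real dashboard would pick a value that fits the cluster.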

@sadlerap force-pushed the subject-cache-metrics branch 2 times, most recently from 8a55083 to 2d0c512, on March 3, 2025 22:27
sadlerap added 3 commits March 4, 2025 13:21
Our testing cache is our source of truth for how many subjects we're
using in this test case.  Rather than maintaining it separately in table
declarations, calculate it from the cache during tests.

Signed-off-by: Andy Sadler <ansadler@redhat.com>
IMO it's better coding style, and gopls has an easier time reasoning
about these constants if they have explicit types.

Signed-off-by: Andy Sadler <ansadler@redhat.com>
By adding these metrics, we can have better insights into
namespace-lister's deployment in our dashboards.  For instance, we can
now get an idea of users that have broad access to namespaces.

To preserve subject information, stuff the subject information into
labels.  This will allow us to view our cache both as individual entries
and as an aggregate cache using promql queries.

During cache synchronization, we purge all existing entries and rebuild
so that stale subjects who no longer live in our cache do not get
reported in our metrics.

Signed-off-by: Andy Sadler <ansadler@redhat.com>
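
The purge-and-rebuild step described in the last commit message can be sketched as follows. This is a simplified model under stated assumptions, not the code in this PR: `gauges` is a hypothetical map standing in for the labeled gauge, and `resync` and the snapshot shape are invented for illustration. With the Prometheus Go client, the purge would likely correspond to calling `Reset()` on the `GaugeVec` before setting fresh values, though the diff shown here does not confirm how it is done.

```go
package main

import "fmt"

// gauges is a hypothetical stand-in for a labeled gauge:
// subject identifier -> number of cached namespaces.
type gauges map[string]int

// resync drops every existing series and then rebuilds them from a fresh
// cache snapshot, so subjects absent from the snapshot stop being reported.
func resync(g gauges, snapshot map[string][]string) {
	// Purge first: without this, a subject removed from the cache
	// would keep reporting its last known value.
	for subject := range g {
		delete(g, subject)
	}
	// Rebuild from the snapshot.
	for subject, namespaces := range snapshot {
		g[subject] = len(namespaces)
	}
}

func main() {
	g := gauges{"alice": 2, "bob": 5}
	// bob is no longer in the cache after this synchronization.
	resync(g, map[string][]string{"alice": {"ns-a", "ns-b", "ns-c"}})
	fmt.Println(g)
}
```

Updating entries in place without the purge would be cheaper, but it would leave stale subjects in the exported metrics, which is the problem the commit message calls out.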
@sadlerap force-pushed the subject-cache-metrics branch from 2d0c512 to 8be49fa on March 4, 2025 19:22
Subsystem: "accesscache",
Name: "cached_namespace_count",
Help: "number of cached namespaces",
}, []string{"apiGroup", "kind", "name", "namespace"}),
Member

Suggested change
- }, []string{"apiGroup", "kind", "name", "namespace"}),
+ }, []string{"apiGroup", "subject"}),

We could reduce the number of labels by encoding kind, name, and namespace into a single `subject` label:

  • user: user:<name>
  • serviceaccount: system:serviceaccount:<namespace>:<name>
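
A minimal sketch of how the single `subject` label value could be built from the two formats listed above. The `Subject` struct is a hypothetical, simplified stand-in for an RBAC subject, and `subjectLabel` is an invented helper. The review comment does not say how other kinds (for example groups) should be encoded, so the fallback branch is purely an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Subject is a hypothetical, simplified stand-in for an RBAC subject.
type Subject struct {
	Kind      string // assumed to be "User" or "ServiceAccount" here
	Name      string
	Namespace string // only meaningful for service accounts
}

// subjectLabel folds kind, name, and namespace into one label value,
// following the two formats proposed in the review comment.
func subjectLabel(s Subject) string {
	switch s.Kind {
	case "User":
		return "user:" + s.Name
	case "ServiceAccount":
		return "system:serviceaccount:" + s.Namespace + ":" + s.Name
	default:
		// Assumption: the review comment does not cover other kinds,
		// so fall back to a lowercase "<kind>:<name>" form.
		return strings.ToLower(s.Kind) + ":" + s.Name
	}
}

func main() {
	fmt.Println(subjectLabel(Subject{Kind: "User", Name: "alice"}))
	fmt.Println(subjectLabel(Subject{Kind: "ServiceAccount", Name: "builder", Namespace: "ci"}))
}
```

The trade-off is that dashboards lose the ability to filter on kind or namespace directly and would need to match on the label's prefix instead.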


2 participants