collect lustre rpc stats #1538

Open
baallan opened this issue Nov 26, 2024 · 1 comment

Comments

@baallan
Collaborator

baallan commented Nov 26, 2024

Lustre exposes RPC data in these files:

/proc/fs/lustre/mdc/*-mdc-*/rpc_stats
/proc/fs/lustre/osc/*-osc-*/rpc_stats

Of particular interest are:

read RPCs in flight
write RPCs in flight
pending write pages
pending read pages
modify_RPCs_in_flight

The histograms in the same files are not of interest for periodic collection.
The in-flight counts give some indication of which servers are heavily loaded, even if (or especially if?) token bucket filters are active and keep high byte or operation counts from being visible.
We need to work out the best way to build a sampler for these metrics.
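
As a starting point for discussion, here is a rough Python sketch of the parsing such a sampler would need. It is not LDMS sampler code. It assumes each gauge appears in rpc_stats as a `name: value` line ahead of the histograms, which should be verified against the Lustre versions in use. `parse_rpc_stats` and `collect` are hypothetical names.

```python
import glob

# Paths from the issue; newer Lustre releases may expose these elsewhere.
PATTERNS = (
    "/proc/fs/lustre/mdc/*-mdc-*/rpc_stats",
    "/proc/fs/lustre/osc/*-osc-*/rpc_stats",
)

# Scalar gauges named in the issue; histogram rows are ignored.
WANTED = (
    "read RPCs in flight",
    "write RPCs in flight",
    "pending write pages",
    "pending read pages",
    "modify_RPCs_in_flight",
)


def parse_rpc_stats(text):
    """Return {metric name: int} for the gauges found in one rpc_stats file.

    Assumes 'name: value' lines; anything else (including the histogram
    tables and snapshot_time) is skipped.
    """
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in WANTED:
            continue
        try:
            metrics[key] = int(value.split()[0])
        except (IndexError, ValueError):
            continue  # unexpected layout; leave this metric out
    return metrics


def collect(patterns=PATTERNS):
    """Read every matching rpc_stats file and return {path: metrics}."""
    samples = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            try:
                with open(path) as f:
                    samples[path] = parse_rpc_stats(f.read())
            except OSError:
                continue  # target may have gone away between glob and open
    return samples
```

Keying the result by path keeps one metric set per mdc/osc target, which is what we would want in order to see which servers the in-flight counts point at.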

@baallan
Collaborator Author

baallan commented Dec 2, 2024

@morrone do you have any thoughts on this issue? In particular, we're looking for metrics that would provide client-side evidence of Lustre delays induced by token bucket filter activity.
