nrt: cache: add support for dedicated informer #599
Conversation
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ffromani. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
@Huang-Wei @dankensk PTAL when you have time. Would adding a separate, optional, disabled-by-default informer be acceptable? (please check the related issue for context) I'm also investigating on the kubelet side if/how terminal pods should be reported.
/cc @swatisehgal @Tal-or
Force-pushed from 64e5752 to ed541c5.
✅ Deploy Preview for kubernetes-sigs-scheduler-plugins ready!
Force-pushed from ed541c5 to 303b02e.
@Huang-Wei @zwpaper since I mentioned this PR when reviewing #655, any design objections from you folks? Note this is still on hold as we're still checking whether the fix is better done in the kubelet or not. But would this PR raise concerns from the scheduling side?
Force-pushed from 303b02e to 0386051.
gentle ping :) no major objections I guess?
assuming no design blockers, @Tal-or @swatisehgal PTAL
Not yet, but we're heading in the direction that this is a bug on the kubelet side. It will need better docs (and agreement about the desired state) anyway; the discussion needs to be revamped.
That's a very good point. If everything goes well, the kubelet fix will land in 1.30. This change will ensure scheduler compatibility with older versions (any older supported version), which seems like a reasonable price to pay - assuming there are no other design concerns.
Force-pushed from 0386051 to b0e265f.
hi @ffromani sorry for the late reply, I was really busy in those days. I was not able to find time to look into the changes; it would be better for other reviewers to take it.
gentle ping @swatisehgal @Tal-or |
Thanks for your patience and your work on this!
I am happy to proceed with the current approach, as the introduction of the informer mode is a nice addition to the current implementation and gives us a way to resolve the inconsistency between the pod data obtained from the podresources API and the data visible to the scheduler.
But I do feel that, in the long term, resolving this on the kubelet side would be a more appropriate fix, as exposing resources of pods that are terminal does not seem appropriate to me. Irrespective of how we go about things from the kubelet point of view, I think we have a way forward to handle this for now as well as in the future, with some minor adjustments to the default mode if needed.
I left comments suggesting a minor addition to the tests and explaining why, IMO, we should reevaluate the current default value of the mode.
Thank you for working on this, LGTM (not adding the official label since I see there are other comments)
Thank you for the valuable feedback. You raise (and raised, now that I remember) a great point that makes me still torn about this approach. The problem is indeed the representation of completed pods. The core issue is: should a completed (either failed or succeeded) pod be accounted for? IOW, should resources accounted to completed pods be reused if the need arises? There are valid points for both approaches. What is more and more evident, anyway, is the fact that I'm still ruminating about the merit of the approach. This in itself is a strong indicator that we should NOT bake this workaround into the main codebase. Should the workaround be needed, there are ways to add it in 3rd-party codebases. At the same time, out of practicality, the issue we're trying to fix here is real and affects real-world deployments.
IMO the answer to your question should be NO (in the long term), but I understand that we are not in a perfect world and we can't achieve that immediately. Solving this in the scheduler obviously brings us to a better state than we are currently in. To be a bit more explicit, the way I see this working out in my head is as follows:
Just want to state that I am happy with the PR in its current form with some minor updates, but we can totally explore a more streamlined PR if you think that would be better.
Thanks for sharing your insights! I was contemplating the best course of action based on your feedback. All of this is bad UX at best, and a crippling blow to the plugin functionality in all the scenarios. The best course of action would be to agree on a standard in the kubernetes, or at least node, community, but this is unlikely to happen anytime soon - kubernetes/kubernetes#119423 got little traction to begin with, and in all fairness this is a nontrivial effort. This will leave us in a state in which the plugin functionality is severely diminished until the oldest supported version (current-2 if I'm not mistaken) is fixed, which will take at least months from now. For all of these reasons, and because reviewers and maintainers are at worst neutral, if not mildly positive out of practical considerations, about this change, I'm removing the hold and moving forward. Granted, we should, no, we must keep working toward a conclusive and clean fix. This change can't be the long term solution, even though "temporary" solutions tend to stick around forever. Let's steer the boat for once.
/unhold begrudgingly. It's a workaround. But we can't keep a crippled plugin until we have an N-2 fixed version. We are talking about at the very least kube 1.32.
Add an option to create and use a separate informer to get unfiltered pod listing from the apiserver. Due to a mismatched view with respect to the kubelet, the plugin needs access to pods in terminal phase which are not yet deleted, to make sure the reconciliation is done correctly. xref: kubernetes-sigs#598 xref: kubernetes/kubernetes#119423 Signed-off-by: Francesco Romani <fromani@redhat.com>
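For illustration, here is a minimal, hedged sketch of the idea in the commit message: a dedicated, unfiltered pod informer built with client-go so that terminal-but-not-deleted pods remain visible. This is not the PR's actual code; the function name startUnfilteredPodInformer and the log output are made up for this example, and only general client-go APIs are assumed.

```go
// Minimal sketch, not the PR's actual code: a dedicated, unfiltered pod informer so
// that pods in a terminal phase which are not yet deleted stay visible to the plugin.
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

// startUnfilteredPodInformer builds its own informer factory instead of reusing the
// scheduler-provided pod view, which (per the motivation of this PR) does not surface
// terminal pods to the plugin.
func startUnfilteredPodInformer(ctx context.Context, client kubernetes.Interface) (cache.SharedIndexInformer, error) {
	factory := informers.NewSharedInformerFactory(client, 0)
	podInformer := factory.Core().V1().Pods().Informer()

	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(_, newObj interface{}) {
			pod, ok := newObj.(*corev1.Pod)
			if !ok {
				return
			}
			// This is exactly the class of pods the default, filtered view hides:
			// finished running, but still present in the apiserver.
			if (pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed) &&
				pod.DeletionTimestamp == nil {
				fmt.Printf("terminal pod still present: %s/%s\n", pod.Namespace, pod.Name)
			}
		},
	})

	factory.Start(ctx.Done())
	if !cache.WaitForCacheSync(ctx.Done(), podInformer.HasSynced) {
		return nil, fmt.Errorf("pod informer cache failed to sync")
	}
	return podInformer, nil
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()
	if _, err := startUnfilteredPodInformer(ctx, client); err != nil {
		panic(err)
	}
	<-ctx.Done()
}
```

In the actual plugin the informer would presumably feed the reconciliation cache rather than print, and would be gated behind the new, disabled-by-default option discussed above.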
Force-pushed from b0e265f to e466bc4.
Addressed all comments and fixed a subtle bug where the pods were not filtered according to the new option value. Now they are.
/hold
Looks like the bot was faster. We can revert if needed.
Ah! Didn't expect that! Next time I will be more careful. @Huang-Wei please let us know if you have concerns here.
No concerns from my side.
What type of PR is this?
/kind bug
/kind api-change
What this PR does / why we need it:
Add an optional dedicated informer for the NodeResourceTopology scheduler plugin.
For its reconciliation mechanism to converge, the plugin needs to list pods in terminal phase which are not deleted yet.
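To make that requirement concrete, here is a hedged sketch of the kind of check it implies; the helper name isTerminalButNotDeleted is hypothetical and not taken from the PR.

```go
// Hedged sketch; the helper name isTerminalButNotDeleted is hypothetical and not
// taken from this PR.
package example

import corev1 "k8s.io/api/core/v1"

// isTerminalButNotDeleted reports whether a pod has finished running (Succeeded or
// Failed) but is still present in the apiserver, i.e. the pods the reconciliation
// needs to see to stay consistent with the kubelet's view.
func isTerminalButNotDeleted(pod *corev1.Pod) bool {
	terminal := pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed
	return terminal && pod.DeletionTimestamp == nil
}
```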
Which issue(s) this PR fixes:
Fixes #598
Special notes for your reviewer:
Technically an API change, but it is backward compatible (optional fields) and done like many others we did in the past. Still labeled for transparency.
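As a generic illustration of why an optional field keeps such a change backward compatible (the type name CacheArgs, the field InformerMode, and the value "shared" below are hypothetical, not the identifiers introduced by this PR):

```go
// Generic illustration only: names here are hypothetical, not the ones from this PR.
package example

// CacheArgs stands in for the plugin's configuration arguments.
type CacheArgs struct {
	// InformerMode selects how the plugin watches pods. Because the field is
	// optional (a pointer with omitempty), configurations written before this
	// field existed still decode unchanged.
	InformerMode *string `json:"informerMode,omitempty"`
}

// setDefaults preserves the pre-change behavior whenever the new field is unset.
func setDefaults(args *CacheArgs) {
	if args.InformerMode == nil {
		mode := "shared" // hypothetical name for the previous, filtered behavior
		args.InformerMode = &mode
	}
}
```

Existing configurations that never set the field keep decoding and default to the previous behavior, which is what makes the addition backward compatible.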
Does this PR introduce a user-facing change?