-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-11966. Add recon support #16
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR @fraochan. It mostly looks good. I have a few comments.
- name: WAITFOR | ||
value: {{ $.Release.Name }}-scm-0.{{ $.Release.Name }}-scm-headless:{{ .Values.scm.service.port }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider if waiting for another pod is strictly necessary.
If we do have to wait then we should do it in an initcontainer.
Also, we should use the SCM service instead of addressing the pod directly. At any point we have to consider the pod -scm-0
being down and should instead rely on other SCM pods (if any in HA mode) by using the SCM service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was inspired by the Apache Ozone Kubernetes examples of the source repository. But indeed, I agree with you.
Recon runs in a loop until it can communicate with the SCM headless service. In my opinion, deleting this part is enough.
Otherwise, I'll do an initContainer
(although I'm not sure yet what I'll do as a check to make Recon wait, probably not too hard but I haven't thought about it yet).
{{- $tolerations := or .Values.recon.tolerations .Values.tolerations }} | ||
{{- $securityContext := or .Values.recon.securityContext .Values.securityContext }} | ||
apiVersion: apps/v1 | ||
kind: StatefulSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider making this pod stateless by using a Deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have been more confident, that was my initial suggestion. But in line with the rest of the Helm Chart, I've made it a Statefulset.
So I completely agree with you and have made the changes.
By the way, I'd say the S3 gateway is also stateless and could be a deployment, couldn't it?
charts/ozone/templates/_helpers.tpl
Outdated
@@ -73,4 +78,8 @@ app.kubernetes.io/instance: {{ .Release.Name }} | |||
value: "1" | |||
- name: OZONE-SITE.XML_dfs.datanode.use.datanode.hostname | |||
value: "true" | |||
{{- if .Values.recon.enabled }} | |||
- name: OZONE-SITE.XML_ozone.recon.address | |||
value: {{ include "ozone.recon.pod" . }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should point to the Recon service name instead of the pod.
We should be able to refer to the recon pod without specifying the type of the pod. This is where the service comes into the picture.
ozone.recon.pod
defined in this file above can be removed if not needed elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just had to add the rpc
port to the service as well, but yes,we might use the service.
84eb444
to
6ce95ba
Compare
@ptlrs, thanks for the constructive review, I've modified the code accordingly and I'm ready for another one :) |
What changes were proposed in this pull request?
This PR adds the files required for an optional Recon.
Recon does not require a persistent volume. I hesitated to do a deployment, but the Apache Ozone Git project suggests a statefulset as an example. I don't mind converting it to a deployment if you find it necessary.
What is the link to the Apache JIRA
HDDS-11966
How was this patch tested?
I created a local cluster with kind, then :
Recon
:aws-cli
:And I made sure that the bucket appeared here: localhost:9888/#/Buckets
And finally, I've deleted Recon