Add liveness and readiness probes to API deployment #256
Thanks @droctothorpe! I added the health check route, but completely forgot to add it to the chart. Good catch.
LGTM. Thanks!
I'm not an expert (by any means), but I've read that you have to be cautious about using liveness probes. Not sure if it's relevant here (prob depends on the implementation of the
Yeah, the important thing is that checks for the liveness probe don't rely on any downstream services; they should only check the health of the pod itself. Otherwise you run the risk of cascading failures. We're fine here in this regard.
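To make the distinction concrete, here is a rough sketch of how the two probes might be split in a container spec. The paths, port, and timings are hypothetical and not taken from this chart. The point is only that the liveness route should answer from the process alone, while a readiness route can afford to look at dependencies.

```yaml
# Sketch only: paths, port, and timings are hypothetical, not from this chart.
livenessProbe:
  httpGet:
    path: /healthz        # hypothetical route: answers from the process alone, no downstream calls
    port: 8000            # hypothetical container port
  periodSeconds: 10
  failureThreshold: 3     # the kubelet restarts the container after 3 consecutive failures
readinessProbe:
  httpGet:
    path: /ready          # hypothetical route: may check dependencies
    port: 8000
  periodSeconds: 10       # failing readiness removes the pod from Service endpoints, without a restart
```

If a liveness route called a downstream service, an outage of that service would restart every pod at once, which is the cascading failure mentioned above.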
initialDelaySeconds: {{ .Values.gateway.livenessProbe.initialDelaySeconds }}
periodSeconds: {{ .Values.gateway.livenessProbe.periodSeconds }}
timeoutSeconds: {{ .Values.gateway.livenessProbe.timeoutSeconds }}
failureThreshold: {{ .Values.gateway.livenessProbe.failureThreshold }}
To reduce the friction of working with Helm's templates, I suggest doing something more like this: move httpGet to values.yaml and remove enabled: true, letting the presence or absence of the livenessProbe value decide whether the probe is included.
Hmm... Or maybe not, if making it easy to switch on and off is an important matter. But I wouldn't say it is.
If everything in a Helm chart is to be made configurable, templating each field individually gets messy, and if not everything is configurable, it becomes a big driver of new issues from users.
{{- with .Values.gateway.livenessProbe }}
{{- /* Inside "with", "." is bound to .Values.gateway.livenessProbe. */}}
livenessProbe:
{{- . | toYaml | nindent 12 }}
{{- end }}
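For illustration, the values.yaml side of the with/toYaml suggestion might look roughly like this. The health route path, port, and all timing values are assumptions made for the sketch, not the chart's actual defaults.

```yaml
# values.yaml (sketch; path, port, and timings are illustrative assumptions)
gateway:
  livenessProbe:
    httpGet:
      path: /api/health     # hypothetical health check route
      port: 8000            # hypothetical container port
    initialDelaySeconds: 5
    periodSeconds: 10
    timeoutSeconds: 2
    failureThreshold: 6
```

Because `with` skips empty values, a user could drop the probe entirely by overriding `gateway.livenessProbe` with `null` or `{}`, so no separate `enabled` flag is needed. Any other probe field Kubernetes supports can be added in values.yaml without touching the template.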
Production pods should generally define liveness probes. Liveness probes ensure that pods are restarted if the application is unresponsive but the main process is still running.
Dask Gateway Server includes a health check endpoint, but the chart does not configure the kubelet to check it.
This PR updates the Helm deployment for Dask Gateway Server to include liveness and readiness probes.
I tested this on my local machine using Docker Desktop Kubernetes and confirmed that everything works.
Major kudos to @suchitpandya for identifying this discrepancy.