Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gRPC Health Check must be implemented in the distributor #4657

Open
kworkbee opened this issue Feb 4, 2025 · 12 comments
Open

gRPC Health Check must be implemented in the distributor #4657

kworkbee opened this issue Feb 4, 2025 · 12 comments

Comments

@kworkbee
Copy link

kworkbee commented Feb 4, 2025

Is your feature request related to a problem? Please describe.
I want to pass traces (coming in from outside K8s cluster) to the distributor through AWS ALB with gRPC as backend protocol. In this case, ALB Health Check related annotations should be specified in Ingress but if the backend protocol is gRPC, health check method should also return the status code based on gRPC.
However, the status code currently returned by the /ready HTTP endpoint is an HTTP status code, so there is a problem that the gRPC-based ALB cannot be configured at the front-end of the distributor.

Image Image

Describe the solution you'd like
A standard gRPC Health Endpoint (which conforms to standard proto) should be added into distributor.

@luvpreetsingh
Copy link

I was looking for the same and landed at this issue.

@luvpreetsingh
Copy link

So, is there no way of doing it as of now?

I am facing the same issue but on GCP.

Image

@joe-elliott
Copy link
Member

The gRPC service running on port 4317 is here: https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver/otlpreceiver

I'm not sure we can add a healthcheck endpoint to it. I have been wanting to write our own otel receiver native into tempo for awhile now which would allow us to do this, but that's on the backburner.

@joe-elliott
Copy link
Member

We do register gRPC health checks to the standard gRPC server in the distributor. Unsure if you can point the ALB to that:

tempo/cmd/tempo/app/app.go

Lines 197 to 201 in 3d50193

grpc_health_v1.RegisterHealthServer(t.Server.GRPC(),
grpcutil.NewHealthCheckFrom(
grpcutil.WithShutdownRequested(shutdownRequested),
grpcutil.WithManager(sm),
))

@joe-elliott
Copy link
Member

This issue appears relevant: open-telemetry/opentelemetry-collector#3040

@luvpreetsingh
Copy link

We do register gRPC health checks to the standard gRPC server in the distributor

@joe-elliott Would you be kind enough to share grpcurl for the distributor health check call?

@mapno
Copy link
Member

mapno commented Feb 5, 2025

We do register gRPC health checks to the standard gRPC server in the distributor

@joe-elliott Would you be kind enough to share grpcurl for the distributor health check call?

The server does not support reflection, so you'll need to have the healthcheck proto somewhere locally if you want to query it with grpcurl

grpcurl -v -plaintext -import-path <path> -proto google/grpc/health/v1/health.proto localhost:9095 grpc.health.v1.Health/Check

@luvpreetsingh
Copy link

Looks like GCP doesn't support GRPC health checks

https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-configuration#direct_health

@luvpreetsingh
Copy link

This is confusing to me. We can't use GRPC ingestion on Grafana Tempo with the current implementation. Am I missing something really basic or is tempo missing something?

@luvpreetsingh
Copy link

luvpreetsingh commented Feb 5, 2025

For the OTLP HTTP endpoint, is there any health check endpoint available in the distributor?

I can give the HTTP health check endpoint and trick the LB.

I tried using the readinessProbe as health check, which is https://3100/ready, but the issue is that doing HTTPS health check with GCP LB is tricky

@kworkbee
Copy link
Author

kworkbee commented Feb 7, 2025

We do register gRPC health checks to the standard gRPC server in the distributor. Unsure if you can point the ALB to that:

tempo/cmd/tempo/app/app.go

Lines 197 to 201 in 3d50193

grpc_health_v1.RegisterHealthServer(t.Server.GRPC(),
grpcutil.NewHealthCheckFrom(
grpcutil.WithShutdownRequested(shutdownRequested),
grpcutil.WithManager(sm),
))

Hi, @joe-elliott Thanks for your response. I missed the implementation of app.go, so I tried to port-forward traffic port (4317) and call grpcurl as below.

grpcurl -v -plaintext -import-path ./ -d '{"service":""}' -proto health.proto localhost:4317 grpc.health.v1.Health/Check                                                         

I saw that it's coming out as 'UNIMPLEMENTED', but I don't know if I called it correctly. (Sorry but I'm not used to gRPC and golang enough yet)

Resolved method descriptor:
// If the requested service is unknown, the call will fail with status
// NOT_FOUND.
rpc Check ( .grpc.health.v1.HealthCheckRequest ) returns ( .grpc.health.v1.HealthCheckResponse );

Request metadata to send:
(empty)

Response headers received:
(empty)

Response trailers received:
content-type: application/grpc
Sent 1 request and received 0 responses
ERROR:
  Code: Unimplemented
  Message: unknown service grpc.health.v1.Health

@joe-elliott
Copy link
Member

joe-elliott commented Feb 7, 2025

Port 4317 is the OTel gRPC receiver. This service is not managed by us and I don't believe we can add a health check here.

By default port 9950 is the native Tempo gRPC receiver. That should have a health check server registered.

The linked OTEL issue above contains work arounds folks have used to get the OTel collector working with an ALB which may help you as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants