Provide a way to listen to specific services instead of the whole catalog #566

Open
fuleow opened this issue Jun 13, 2019 · 17 comments

@fuleow

fuleow commented Jun 13, 2019

spring-cloud-consul provides a Consul Catalog Watch that publishes heartbeat events on catalog changes. In an environment with many services, the catalog can change rapidly (multiple times per second), causing heartbeat events to fire for services that the application is not interested in.

For example, Spring Cloud Config Client uses this mechanism when discovery is enabled (link).

In practice, the config client is only interested in updates to spring-cloud-config-server, but the heartbeat fires every time the catalog updates.

The catalog services watch can already be disabled. It would be very useful to have an alternative heartbeat event producer that takes a list of relevant services and publishes heartbeat events only when those specific services are updated.

Currently, any microservice in our organization starts generating many requests to Consul just by adding spring-cloud-starter-consul-discovery. While the watch-delay is configurable, that is less than ideal if your application is only interested in a subset of services: if the watch-delay is too high, you risk not getting an immediate update when those services change, and if it's too low, you get flooded with events.
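
A minimal sketch of the kind of producer requested above, under stated assumptions: `SelectiveHeartbeat`, `onIndex`, and the `Consumer<String>` publisher are hypothetical names that stand in for whatever event-publishing mechanism the application uses, and none of them are part of spring-cloud-consul. It only shows the filtering idea: remember the last index seen per watched service and publish when it moves.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical helper: publishes a heartbeat only for services the application listed. */
class SelectiveHeartbeat {
    private final Set<String> watched;
    private final Consumer<String> publisher; // stand-in for an event publisher
    private final Map<String, Long> lastIndex = new HashMap<>();

    SelectiveHeartbeat(Set<String> watched, Consumer<String> publisher) {
        this.watched = Set.copyOf(watched);
        this.publisher = publisher;
    }

    /** Records the index observed for a service; returns true if a heartbeat was published. */
    synchronized boolean onIndex(String service, long index) {
        if (!watched.contains(service)) {
            return false; // not a service this application cares about
        }
        Long previous = lastIndex.put(service, index);
        if (previous != null && previous == index) {
            return false; // index unchanged since the last observation
        }
        publisher.accept(service); // first observation or a changed index
        return true;
    }
}
```

With `Set.of("spring-cloud-config-server")` as the watched set, index updates for any other service are ignored, and repeated observations of the same index do not publish again.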

@spencergibb
Member

The consul api we use does not provide filtering. https://www.consul.io/api/catalog.html#list-services

@fuleow
Author

fuleow commented Jun 13, 2019

> The consul api we use does not provide filtering. https://www.consul.io/api/catalog.html#list-services

Yes, it doesn't provide filtering on the entire catalog. However, a long-polling watch can be created for each service that the application is interested in. The mechanism can be very similar to the current one, with a background thread for each service.

https://www.consul.io/api/catalog.html#list-nodes-for-service
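
To make the per-service long poll concrete, here is a rough sketch using only the JDK's `java.net.http` client rather than the library spring-cloud-consul uses. The class and method names (`ServiceNodesWatch`, `watchUri`, `nextIndex`, `pollOnce`) are made up for illustration, the 55-second wait and 70-second timeout are arbitrary choices, and the service name is assumed to need no URL encoding. The `index` and `wait` query parameters and the `X-Consul-Index` response header are the ones Consul's blocking queries use.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical sketch of a blocking query against a single service. */
class ServiceNodesWatch {

    /** Builds the long-poll URI for one service (service name assumed URL-safe). */
    static URI watchUri(String consulBase, String service, long index, int waitSeconds) {
        return URI.create(consulBase + "/v1/catalog/service/" + service
                + "?index=" + index + "&wait=" + waitSeconds + "s");
    }

    /** Picks the index for the next request from the X-Consul-Index header value. */
    static long nextIndex(long current, String headerValue) {
        if (headerValue == null) {
            return current; // no header returned: keep the index we have
        }
        long returned = Long.parseLong(headerValue.trim());
        // If the index goes backwards, start over from 0.
        return returned < current ? 0 : returned;
    }

    /** One long poll: returns when the service changes or the wait elapses. */
    static long pollOnce(HttpClient client, String consulBase, String service, long index)
            throws Exception {
        HttpRequest request = HttpRequest.newBuilder(watchUri(consulBase, service, index, 55))
                .timeout(Duration.ofSeconds(70)) // longer than the server-side wait
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        String header = response.headers().firstValue("X-Consul-Index").orElse(null);
        return nextIndex(index, header);
    }
}
```

A background thread per service could call `pollOnce` in a loop and treat any returned index that differs from the one it sent as a change to that service.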

@spencergibb
Member

Having to explicitly list every service you want to watch doesn't seem scalable to me.

@fuleow
Author

fuleow commented Jun 13, 2019

I think that applications with many upstream services can continue using the existing catalog watch mechanism. However, it would be nice if microservices that have only a couple of interfaces could watch just the ones they are interested in instead of the entire catalog.

Here is some data on our own Consul index and how quickly it changes:

while true; do curl -s -v http://consul:8500/v1/catalog/services 2>&1> /dev/null | grep Index; sleep 5; done
< X-Consul-Index: xxxx59768
< X-Consul-Index: xxxx59896
< X-Consul-Index: xxxx59961
< X-Consul-Index: xxxx60067
< X-Consul-Index: xxxx60243
< X-Consul-Index: xxxx60415
< X-Consul-Index: xxxx60548
< X-Consul-Index: xxxx60743
< X-Consul-Index: xxxx60774
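
As a rough worked calculation from the samples above: the snippet below assumes the masked `xxxx` prefix is the same on every line and that the samples are exactly 5 seconds apart (the real spacing is slightly longer, since each `curl` call takes time, so this is an approximate figure). `IndexRate` and `perSecond` are illustrative names only.

```java
/** Hypothetical helper: average index increase per second across evenly spaced samples. */
class IndexRate {

    static double perSecond(long[] samples, double intervalSeconds) {
        long delta = samples[samples.length - 1] - samples[0];
        double elapsed = (samples.length - 1) * intervalSeconds;
        return delta / elapsed;
    }

    public static void main(String[] args) {
        // Trailing digits of the X-Consul-Index values listed above.
        long[] samples = {59768, 59896, 59961, 60067, 60243, 60415, 60548, 60743, 60774};
        System.out.println(perSecond(samples, 5.0) + " index increments per second");
    }
}
```

The index advanced by 1006 over eight 5-second intervals, which works out to roughly 25 increments per second.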

@spencergibb
Member

I think that is a rarity. As you mentioned, you can disable it and roll your own. We'll wait to see if there are more folks who want this.

@alex-dubrouski

Good afternoon,
Not sure what you mean by rarity. It is very common for huge organizations to have hundreds of microservices and use continuous deployment strategies. Maybe you are operating at a different scale, but we have multiple datacenters and our Consul catalog contains thousands of tags.
To be precise, we started investigating high CPU, memory, and network usage in some idle microservices running in the cloud. I profiled a couple of them and found that they are all affected by the same problem: Spring schedules a task that continuously polls Consul and fetches updates (because of the high rate of changes, this happens every couple of seconds), which results in heavy memory consumption by the underlying "com.ecwid.consul" library. This in turn causes frequent garbage collections and high CPU usage.
[Screenshot: jprofiler]
Here is the JProfiler hot spot allocation report. In reality, an absolutely idle service continuously uses around 400MB of memory. I am sorry, but I think this implementation does not scale. We have currently suspended our use of it and are looking for other ways to implement service discovery.

@spencergibb
Member

@alex-dubrouski please open a separate issue as that does not seem related to ConsulCatalogWatch. We'd be happy to entertain ways to make things more efficient as no one has reported anything similar.

What doesn't seem scalable is having to manually add each service that needs to be watched for the OP's use case.

@alex-dubrouski

Spencer,
Please review the attached screenshot again. ConsulCatalogWatch schedules a task that fetches the catalog here:
https://github.com/spring-cloud/spring-cloud-consul/blob/master/spring-cloud-consul-discovery/src/main/java/org/springframework/cloud/consul/discovery/ConsulCatalogWatch.java#L129
When the rate of change of the Consul catalog is high and the catalog itself is huge, this results in high memory allocation and CPU usage. The memory allocated for CatalogConsulClient.getCatalogServices() in the attached picture is 289MB.

@spencergibb
Member

Yes, but the image shows the trace coming thru the /health actuator endpoint. While they both call the same method, I think it's still a different situation.

Might be worth setting management.health.consul.enabled=false.

@alex-dubrouski

Only a small percentage of the calls come from health services; most of them are a result of discovery.

@spencergibb
Member

If that's the case, you can use tags to limit what is returned.

@alex-dubrouski

Yes, this is what we actually asked for. Currently ConsulCatalogWatch uses the /catalog/services endpoint, which does not provide filtering and does not accept tags (you said it yourself here: #566 (comment)).
Fu asked for support for watching specific services as a second possible path.

@spencergibb
Member

spencergibb commented Jun 14, 2019

Do you both work together? For ConsulCatalogWatch there hasn't been a request for this except for you. It may be worth implementing something on your own, seeing how it works and submitting a PR. #475 may be interesting to you as well.

@varnson
Contributor

varnson commented Jun 15, 2019

At my company, I added waitTime and index to getServers() in ConsulServerList.java.
This changed getHealthServices() into a blocking query.
I also want to add the cached parameter to it.
I think the catalog watch is useless, and you should use getHealthServices() to watch the server list. @fuleow
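
For illustration only, this is roughly what such a health-based watch looks like at the HTTP level, sketched with plain JDK types instead of the `ConsulServerList` change described above (that code is not shown in this thread). `HealthWatchUri` and `build` are hypothetical names. `passing`, `index`, and `wait` are query parameters of Consul's `/v1/health/service/<name>` endpoint; how `cached` interacts with blocking queries is an assumption worth checking against the Consul documentation for your version.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: builds a blocking-query URI for the health endpoint of one service. */
class HealthWatchUri {

    static URI build(String consulBase, String service, long index, int waitSeconds,
                     boolean passingOnly, boolean cached) {
        List<String> params = new ArrayList<>();
        if (passingOnly) {
            params.add("passing"); // only instances whose checks are all passing
        }
        if (cached) {
            params.add("cached"); // assumption: agent caching is available for this endpoint
        }
        params.add("index=" + index);           // last index seen; 0 on the first call
        params.add("wait=" + waitSeconds + "s"); // how long the server may hold the request
        return URI.create(consulBase + "/v1/health/service/" + service
                + "?" + String.join("&", params));
    }
}
```

A request built this way returns early only when the health or membership of that one service changes, which is the property the catalog-wide watch lacks.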

@alex-dubrouski

I am sorry for the delay. Yes, we work together on one team. We will try to incorporate this research into our plans. BTW, Fu already has an open PR for this repo.

@fuleow
Author

fuleow commented Jun 17, 2019

> Do you both work together? For ConsulCatalogWatch there hasn't been a request for this except for you. It may be worth implementing something on your own, seeing how it works and submitting a PR. #475 may be interesting to you as well.

Thanks @spencergibb, I'll try to explore implementing something custom when I have some time. As for the PR @alex-dubrouski mentioned, I have it running on our Spring Boot Admin server, and with catalog-services-watch-delay set to 10000 it's been working OK so far.

@rutuls

rutuls commented Mar 5, 2021

In service1, I need to find out the health of service2 and, if it is up and running, take action accordingly. I want to check the health of the service on a separate thread, or basically in a non-blocking way. How do I implement this using the catalog watch? Is there any other way to implement it? Are there any examples available?


6 participants