Feature Request: support ranges of shards in VTOrc --clusters_to_watch #17537

Open · timvaillancourt opened this issue Jan 15, 2025 · 7 comments
Labels: Component: VTorc (Vitess Orchestrator integration) · Type: Enhancement (logical improvement, somewhere between a bug and a feature)

timvaillancourt (Contributor) commented Jan 15, 2025

Feature Description

As a VTOrc user with many large keyspaces and often-splitting shards, I would like the ability to watch a range of keyspace-shards without having to define each one individually in --clusters_to_watch and restart every VTOrc whenever they change.

Or, as a TL;DR: I would like to tell one VTOrc instance to "watch the first half of the foo keyspace" (-80 in Vitess terms) and another to "watch the second half of the foo keyspace" (80- in Vitess terms).

As some background, VTOrc's --clusters_to_watch flag allows VTOrc to watch a subset of keyspaces or shards. Changes to this flag require a restart of VTOrc to be applied. If VTOrc is watching a single keyspace (let's say --clusters_to_watch foo) with more tablets than can be reliably watched, seemingly the only option to scale is to partition/distribute the watched shards across many VTOrc instances. This approach creates a problem, however: now every shard needs to be statically defined in --clusters_to_watch. What if you add, split, or merge a shard? More updates/deploys 👎. And what if you forget to add one?

This feature request proposes that --clusters_to_watch (or a new flag) allow the user to watch a range of keyspaces/shards; for example, specifying foo/-80 would allow VTOrc to watch "half of the foo keyspace" (i.e., all shards from - to 80).
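To make that concrete, here's a minimal standalone sketch of the containment check this implies (hypothetical names throughout; real VTOrc code would more likely reuse the key-range helpers in Vitess's go/vt/key package): a shard is watched if its key range falls entirely inside the configured range.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"strings"
)

// keyRange holds decoded [start, end) bounds of a shard or watch range.
// An empty start means the beginning of the keyspace; an empty end means its end.
type keyRange struct {
	start, end []byte
}

// parseKeyRange decodes a shard name like "-80", "80-c0", or "c0-".
func parseKeyRange(shard string) (keyRange, error) {
	parts := strings.SplitN(shard, "-", 2)
	if len(parts) != 2 {
		return keyRange{}, fmt.Errorf("not a range: %q", shard)
	}
	start, err := hex.DecodeString(parts[0])
	if err != nil {
		return keyRange{}, err
	}
	end, err := hex.DecodeString(parts[1])
	if err != nil {
		return keyRange{}, err
	}
	return keyRange{start: start, end: end}, nil
}

// contains reports whether shard kr lies entirely inside watch range w.
func (w keyRange) contains(kr keyRange) bool {
	startOK := bytes.Compare(w.start, kr.start) <= 0
	endOK := len(w.end) == 0 || (len(kr.end) != 0 && bytes.Compare(kr.end, w.end) <= 0)
	return startOK && endOK
}

func main() {
	watch, _ := parseKeyRange("-80") // e.g. --clusters_to_watch foo/-80
	for _, shard := range []string{"-40", "40-80", "80-c0", "c0-"} {
		kr, _ := parseKeyRange(shard)
		fmt.Printf("foo/%s watched: %v\n", shard, watch.contains(kr))
	}
	// foo/-40 watched: true
	// foo/40-80 watched: true
	// foo/80-c0 watched: false
	// foo/c0- watched: false
}
```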

cc @GuptaManan100 for 💡s 🙇

Use Case(s)

A VTOrc user with many large keyspaces and often-splitting shards who would like to avoid statically defining (and redeploying) --clusters_to_watch for every shard.

timvaillancourt added the Type: Enhancement and Component: VTorc labels Jan 15, 2025
timvaillancourt self-assigned this Jan 15, 2025
timvaillancourt (Contributor, Author) commented Jan 15, 2025

This feature request proposes that --clusters_to_watch (or a new flag) allow the user to watch a range of keyspaces/shards; for example, specifying foo/-80 would allow VTOrc to watch "half of the foo keyspace" (i.e., all shards from - to 80).

To play devil's advocate with myself here, range support in the existing --clusters_to_watch flag could become confusing. Is foo/-80 specifying a single shard or a range of shards? It would not be clear at face value 🤔

Also, what if there WAS a -80 shard but it is later split and removed? This would add some unpredictability, as VTOrc would unexpectedly go from watching a single shard to watching many.

A new flag for specifying ranges, or a boolean flag to enable range support, might make this more predictable for users.

deepthi (Member) commented Jan 15, 2025

How about making this flag dynamic and providing it in a config file? It would then not require a restart.
Some potential complications:

  • It's not just a matter of changing some constant parameter: when the config value changes, we have to throw away all data for the old shards from the SQLite database as well.
  • @GuptaManan100 is working on a watch-based discovery mechanism (vs. the poll-based one), so we'd have to stop watches on old shards and add them on new ones.

derekperkins (Member) commented

Messaging has a KeyRange option for subsetting shards. Maybe that's useful prior art?
https://vitess.io/docs/21.0/reference/features/messaging/#subsetting

GuptaManan100 (Member) commented Jan 16, 2025

Here are my thoughts:
I think this would be a definite improvement over what we have now. If someone has 128 shards and decides to reshard to 256, it is a little annoying to have to list all the shards out explicitly.

I don't think we'd have any trouble with ambiguity, because we can keep the custom of accepting a list of shards, but internally change VTOrc to also watch any shard that is a subset of what's specified. What I mean is, in the example I took above, if the user just says -80, VTOrc would watch the entire first half, since each of those shards falls within the specified shard range.

The only perceivable problem I think can happen is if users misconfigure their VTOrcs with shard ranges that don't exactly match the shard splits. For example, if there are only 8 shards and the users give something like -70 to one VTOrc and 70- to another: from the outset, it would look like we are covering the entire range, but the shard 60-80 won't fall in either VTOrc's scope.
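To spell that out with numbers (a standalone sketch comparing decoded hex bounds under the subset semantics described above; subset is a made-up helper, not VTOrc code):

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

// subset reports whether shard [s1, e1) lies inside watch range [s2, e2);
// bounds are hex strings, and "" means the edge of the keyspace.
func subset(s1, e1, s2, e2 string) bool {
	a, _ := hex.DecodeString(s1)
	b, _ := hex.DecodeString(e1)
	c, _ := hex.DecodeString(s2)
	d, _ := hex.DecodeString(e2)
	return bytes.Compare(c, a) <= 0 && (e2 == "" || (e1 != "" && bytes.Compare(b, d) <= 0))
}

func main() {
	// Shard 60-80 against the two misconfigured watch ranges:
	fmt.Println(subset("60", "80", "", "70")) // false: its end 80 is past 70
	fmt.Println(subset("60", "80", "70", "")) // false: its start 60 is before 70
}
```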

And I agree with @deepthi: if we make the flag dynamic, it won't be all that easy to figure out what we need to invalidate versus keep, and I think requiring a restart when this value changes is fine, because reshards and the like aren't operations that happen too often. If we make the shard-range change being discussed, this won't be much of an issue anyway.

Also, @timvaillancourt, as Deepthi pointed out, I am working on a change that uses a Watch on the topo server instead of polling. I have picked up the poll-model item listed in #17330. I am making progress on this front and should have a PR by the end of the week 🤞. We should make this change after that one has landed, because it will change how we implement this.

EDIT: I have created a draft PR with the changes I have so far, so that you can see what I am trying to do and where I'm at: #17553

timvaillancourt (Contributor, Author) commented Jan 16, 2025

@deepthi / @derekperkins / @GuptaManan100 thanks for the great feedback! It's exciting to hear about this other work going on re: a watch model 🎉

How about making this flag dynamic and providing it in a config file? It would then not require a restart.

For our use case, the need to restart isn't much of a problem if we have support for ranges, which avoids a restart every time a shard is added/merged/split. The cases where a restart is required are reduced, at least, but not eliminated.

So maybe making the config value dynamic is a bit out of scope for this issue, but I have no concerns with tackling both at the same time.

It's not just a matter of changing some constant parameter: when the config value changes, we have to throw away all data for the old shards from the SQLite database as well.

Good point. I think starting small, with just range support in the existing static --clusters_to_watch, would unblock a lot and avoid some of the risks of a dynamic value.

The only perceivable problem I think can happen is if users misconfigure their VTOrcs with shard ranges that don't exactly match the shard splits. For example, if there are only 8 shards and the users give something like -70 to one VTOrc and 70- to another: from the outset, it would look like we are covering the entire range, but the shard 60-80 won't fall in either VTOrc's scope.

Yes, we consider this an assumed risk of using --clusters_to_watch in general. We've mitigated it by writing a periodic task to check that all shards are "watched by someone".
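That task isn't shown in this thread, but its shape might look something like the sketch below, where listAllShards and shardIsWatched are hypothetical stand-ins for a topo-server listing and for matching a shard against every VTOrc's configured --clusters_to_watch ranges:

```go
package main

import (
	"fmt"
	"time"
)

// listAllShards is a stand-in for listing keyspace/shard names from the topo server.
func listAllShards() map[string][]string {
	return map[string][]string{"foo": {"-40", "40-80", "80-c0", "c0-"}}
}

// shardIsWatched is a stand-in for checking a shard against every VTOrc's
// configured ranges (e.g. with a containment check like the one sketched above).
func shardIsWatched(keyspace, shard string) bool {
	return shard != "80-c0" // pretend one shard slipped through the configs
}

func main() {
	ticker := time.NewTicker(5 * time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		for keyspace, shards := range listAllShards() {
			for _, shard := range shards {
				if !shardIsWatched(keyspace, shard) {
					// The real task would emit a metric or page here.
					fmt.Printf("ALERT: %s/%s is not watched by any VTOrc\n", keyspace, shard)
				}
			}
		}
	}
}
```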

We should make this change after that one has landed, because it will change how we implement this.

@GuptaManan100 sounds good! We're already running a successful POC of this idea, so I'll work on making it more upstreamable, and the PR can come after yours 👍.

EDIT: I have created a draft PR with the changes I have so far, so that you can see what I am trying to do and where I'm at: #17553

👀 🙇

EDIT: @GuptaManan100 here is the POC logic I'm working on for range support in --clusters_to_watch: https://github.com/slackhq/vitess/blob/448291cd8a95f95542fadd29aa03dfb019ad2f71/go/vt/vtorc/logic/tablet_discovery.go#L85-L161

GuptaManan100 (Member) commented Jan 16, 2025

Hello @timvaillancourt! That sounds wonderful! Just one issue: the link to the POC logic leads to a 404 error. Maybe it's not accessible to anyone without access to slackhq?

timvaillancourt (Contributor, Author) commented

Hello @timvaillancourt! That sounds wonderful! Just one issue: the link to the POC logic leads to a 404 error. Maybe it's not accessible to anyone without access to slackhq?

@GuptaManan100 ack, 404 resolved I believe: https://github.com/slackhq/vitess/blob/448291cd8a95f95542fadd29aa03dfb019ad2f71/go/vt/vtorc/logic/tablet_discovery.go#L85-L161
