[POC] OmniPaxos and Raft based embedded metadata store #2376
Draft
tillrohrmann wants to merge 22 commits into restatedev:main from tillrohrmann:omnipaxos
+5,510 −1,756
tillrohrmann force-pushed the omnipaxos branch from 11a6fdb to 7f3be27 on December 6, 2024 15:29
After starting the metadata store service and the gRPC server, the node tries to initialize itself by joining an existing cluster. Additionally, each node exposes a provision-cluster gRPC call with which a cluster can be provisioned (writing the initial NodesConfiguration, PartitionTable and Logs). Nodes can only join after the cluster is provisioned. This fixes restatedev#2409.
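The provision-then-join rule described above can be sketched as a small state machine. This is only an illustration and not the code from this PR: `ClusterState`, `ProvisionOutcome`, `JoinError` and the node names are hypothetical, and treating a repeated provision call as a no-op is an assumption.

```rust
/// Hypothetical cluster state; the real implementation writes the initial
/// NodesConfiguration, PartitionTable and Logs when provisioning.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ClusterState {
    Unprovisioned,
    Provisioned { members: Vec<String> },
}

#[derive(Debug, PartialEq, Eq)]
pub enum ProvisionOutcome {
    NewlyProvisioned,
    AlreadyProvisioned,
}

#[derive(Debug, PartialEq, Eq)]
pub enum JoinError {
    /// Joining is rejected until someone has provisioned the cluster.
    NotProvisioned,
}

impl ClusterState {
    /// Provisions the cluster with the calling node as its first member.
    /// Assumption: provisioning an already provisioned cluster is a no-op.
    pub fn provision(&mut self, first_node: &str) -> ProvisionOutcome {
        match self {
            ClusterState::Unprovisioned => {
                *self = ClusterState::Provisioned {
                    members: vec![first_node.to_string()],
                };
                ProvisionOutcome::NewlyProvisioned
            }
            ClusterState::Provisioned { .. } => ProvisionOutcome::AlreadyProvisioned,
        }
    }

    /// Adds a node to the cluster; fails while the cluster is unprovisioned.
    pub fn join(&mut self, node: &str) -> Result<(), JoinError> {
        match self {
            ClusterState::Unprovisioned => Err(JoinError::NotProvisioned),
            ClusterState::Provisioned { members } => {
                // Joining twice does not duplicate the member entry.
                if !members.iter().any(|m| m == node) {
                    members.push(node.to_string());
                }
                Ok(())
            }
        }
    }
}
```

A node that gets `JoinError::NotProvisioned` would simply retry the join later, which matches the "try to initialize itself by joining" behavior in the commit message.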
tillrohrmann force-pushed the omnipaxos branch 2 times, most recently from d246934 to b10c0b3 on January 2, 2025 19:47
This commit makes it configurable which metadata store the node runs when starting the Restate server.
This commit adds the skeleton of the Raft metadata store. At the moment only a single node with memory storage is supported. This fixes restatedev#1785.
The Raft metadata store does not accept new proposals if there is no known leader. In this situation, requests failed with an internal ProposalDropped error. This commit changes the behavior so that a ProposalDropped error is translated into an unavailable Tonic status. That way, the request gets retried automatically.
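A minimal sketch of this error translation, kept free of external crates so it compiles on its own. `Code` and `Status` are local stand-ins for `tonic::Code` and `tonic::Status`, and `RequestError` and `is_retryable` are hypothetical names, not the types used in this PR.

```rust
use std::fmt;

/// Local stand-in for the two gRPC status codes relevant here
/// (the real code would use `tonic::Code`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    Unavailable,
    Internal,
}

/// Local stand-in for `tonic::Status`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Status {
    pub code: Code,
    pub message: String,
}

/// Hypothetical error type of the metadata store request path.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RequestError {
    /// Raft dropped the proposal, e.g. because no leader is known.
    ProposalDropped,
    /// Any other failure.
    Internal(String),
}

impl fmt::Display for RequestError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RequestError::ProposalDropped => write!(f, "proposal dropped: no known leader"),
            RequestError::Internal(msg) => write!(f, "internal error: {msg}"),
        }
    }
}

impl From<RequestError> for Status {
    fn from(err: RequestError) -> Self {
        // A dropped proposal is a transient condition, so it maps to
        // `Unavailable`; everything else stays `Internal`.
        let code = match &err {
            RequestError::ProposalDropped => Code::Unavailable,
            RequestError::Internal(_) => Code::Internal,
        };
        Status {
            code,
            message: err.to_string(),
        }
    }
}

/// Client-side view: only `Unavailable` responses are worth retrying.
pub fn is_retryable(status: &Status) -> bool {
    status.code == Code::Unavailable
}
```

Mapping the transient case to a dedicated status code lets a generic retry policy on the client decide what to retry without knowing anything about Raft.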
This commit adds RocksDbStorage, which implements raft::Storage. The RaftMetadataStore uses it to persist the Raft state durably. This fixes restatedev#1791.
The OmniPaxos metadata store stores its state in memory.
This commit introduces the ability to specify multiple addresses for the metadata store endpoint. On error, the GrpcMetadataStoreClient randomly switches to another endpoint. Moreover, this commit makes the OmniPaxosMetadataStore only accept requests if it is the leader. Additionally, it fails all pending callbacks if it loses leadership to avoid hanging requests if the request was not decided.
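The leader-side half of this commit (accept requests only as leader, fail all pending callbacks when leadership is lost) can be sketched roughly as follows. This is an illustration under assumptions: `Store`, `RequestError`, the `u64` request ids and the use of `std::sync::mpsc` channels as callbacks are all hypothetical and not taken from the PR.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RequestError {
    /// The request was rejected because this node is not the leader.
    NotLeader,
    /// The request was pending when this node lost leadership.
    LostLeadership,
}

pub type Response = Result<String, RequestError>;

/// Hypothetical, heavily simplified metadata store front end.
pub struct Store {
    is_leader: bool,
    next_id: u64,
    /// Callbacks of requests that were proposed but not yet decided.
    pending: HashMap<u64, Sender<Response>>,
}

impl Store {
    pub fn new(is_leader: bool) -> Self {
        Store {
            is_leader,
            next_id: 0,
            pending: HashMap::new(),
        }
    }

    /// Accepts a request only while leader and returns the request id
    /// plus a receiver on which the eventual response arrives.
    pub fn propose(&mut self, _payload: &str) -> Result<(u64, Receiver<Response>), RequestError> {
        if !self.is_leader {
            return Err(RequestError::NotLeader);
        }
        let id = self.next_id;
        self.next_id += 1;
        let (tx, rx) = channel();
        self.pending.insert(id, tx);
        Ok((id, rx))
    }

    /// Completes the callback of a request once it has been decided.
    pub fn on_decided(&mut self, id: u64, value: String) {
        if let Some(tx) = self.pending.remove(&id) {
            // The caller may have gone away; ignoring the send error is fine.
            let _ = tx.send(Ok(value));
        }
    }

    /// Fails every pending callback so that no caller waits forever
    /// for a request that may never be decided.
    pub fn on_leadership_lost(&mut self) {
        self.is_leader = false;
        for (_, tx) in self.pending.drain() {
            let _ = tx.send(Err(RequestError::LostLeadership));
        }
    }
}
```

On the client side, an error such as `NotLeader` or `LostLeadership` would be the trigger for switching to another configured endpoint and retrying.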
The Restate version enables OmniPaxos to run with a single peer.
tillrohrmann force-pushed the omnipaxos branch from b10c0b3 to 14ee410 on January 6, 2025 10:47
This WIP PR adds support for an OmniPaxos-based, embedded, highly available metadata store. We are using Harald's OmniPaxos library. This PR is based on an older WIP PR that added support for a Raft-based metadata store, because both need a networking component and a state machine, which I could reuse. The implementations of both variants are quite similar.
If you want to try it out, you can use these configuration files. To try the Raft-based metadata store instead, change the metadata-store type to "raft".
Both implementations have a persistent log storage implementation based on RocksDB. This should allow them to be killed and restarted without losing data.
A few notable things which are missing:
What is quite ugly is how the participating peers (Raft as well as OmniPaxos) need to be explicitly configured atm via
In the future, we can probably add tooling to start a cluster with a single metadata peer and then extend the set of peers via restatectl to reach the required number of metadata peers.
Ideally, I would have loved to reuse Restate's networking component. However, because of the node validation w.r.t. the NodesConfiguration, this wasn't possible. That's why I added a very simple one.