Avoid admin roles in local cluster runner #2026
Conversation
For the time being we only want to run admin on the metadata node, which will also be allowed to bootstrap.
```diff
@@ -25,7 +25,7 @@ async fn main() {
     let nodes = Node::new_test_nodes_with_metadata(
         base_config,
         BinarySource::CargoTest,
-        enum_set!(Role::Admin | Role::Worker),
+        enum_set!(Role::Worker),
```
Do you think we should have something in the cluster builder to specify which node is admin?
Currently my thinking is that it's easier to have one singleton node that has all the singleton roles (metadata, admin), and then the other N nodes can all be more similar. It's always possible to specify nodes in whatever setup you like, but the goal of new_test_nodes_with_metadata is to create a list of nodes with some sensible defaults.
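The "one singleton node plus N similar workers" layout could be sketched as follows. Note that `Role` and the node struct here are simplified stand-ins for the real restate types, and the constructor signature is illustrative, not the actual `new_test_nodes_with_metadata` API.

```rust
// Hypothetical sketch: the first node carries the singleton roles
// (metadata store, admin) and is the only one allowed to bootstrap;
// every other node is a plain worker.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    MetadataStore,
    Admin,
    Worker,
}

#[derive(Debug)]
struct TestNode {
    id: usize,
    roles: Vec<Role>,
}

fn new_test_nodes_with_metadata(n: usize) -> Vec<TestNode> {
    (1..=n)
        .map(|id| TestNode {
            id,
            roles: if id == 1 {
                // the singleton node gets all singleton roles
                vec![Role::MetadataStore, Role::Admin, Role::Worker]
            } else {
                vec![Role::Worker]
            },
        })
        .collect()
}

fn main() {
    for node in new_test_nodes_with_metadata(3) {
        println!("node {}: {:?}", node.id, node.roles);
    }
}
```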
Hmm, but I can see now that non-bootstrap nodes will fail to start if there is no nodes config yet, which is a slightly unpleasant race condition. I wonder if the cluster construct does indeed need to know which node is the admin, and make sure it's started and healthy before moving on.
Would a slightly longer timeout help prevent this situation? I could see someone deploying Restate running into the same race condition when some nodes start a bit earlier than others.
Only if your environment can restart the process. Currently the non-bootstrap nodes will shut down if they reach the metadata service and find no nodes config. This is fine with systemd or Kubernetes, which restart on failure, but my local cluster runner does not do this.
We could avoid failing immediately and instead wait a bit in order to mitigate the race condition at start-up. Alternatively, all nodes could be started with the bootstrap option, assuming they have identical configurations.
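The "wait a bit instead of failing" idea amounts to bounded retries on the node side. A minimal sketch, where `fetch_nodes_config` is a hypothetical stand-in for the real metadata lookup (here simulated to succeed on the third attempt):

```rust
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for asking the metadata service for the nodes
// config; simulated so the config only becomes visible on attempt 3.
fn fetch_nodes_config(attempt: u32) -> Option<String> {
    if attempt >= 3 {
        Some("nodes-config-v1".to_string())
    } else {
        None
    }
}

// Instead of shutting down on the first miss, retry with a small
// backoff before giving up for good.
fn join_with_retries(max_attempts: u32, backoff: Duration) -> Result<String, ()> {
    for attempt in 1..=max_attempts {
        if let Some(config) = fetch_nodes_config(attempt) {
            return Ok(config);
        }
        // no config yet: back off rather than failing the node
        thread::sleep(backoff);
    }
    Err(())
}

fn main() {
    let config = join_with_retries(5, Duration::from_millis(10)).unwrap();
    println!("joined with {config}");
}
```

With retries, a node that starts slightly before the bootstrap node has written the config simply joins a moment later instead of dying and relying on systemd/Kubernetes to restart it.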
My current thinking for the runner is to wait for admins to be ready on port 9070 before progressing to other nodes. But I agree, it would be good if non-admin nodes waited a bit instead of bailing.
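On the runner side, "wait for the admin to be ready" can be as simple as polling the admin port until something accepts a TCP connection. A sketch using only the standard library; the address and timeouts are illustrative, and a real check would likely hit an HTTP health endpoint rather than just the socket:

```rust
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

// Poll an address until a connection succeeds or the deadline passes.
// Returns true once the endpoint (e.g. the admin node on port 9070)
// is accepting connections.
fn wait_for_ready(addr: &str, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        if TcpStream::connect(addr).is_ok() {
            return true;
        }
        thread::sleep(Duration::from_millis(50));
    }
    false
}

fn main() {
    // bind an ephemeral local port to stand in for the admin endpoint
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap().to_string();
    assert!(wait_for_ready(&addr, Duration::from_secs(1)));
    println!("admin endpoint ready at {addr}");
}
```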