Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore!: separate internal and CLI configurations #3357

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fryorcraken
Copy link
Collaborator

@fryorcraken fryorcraken commented Apr 4, 2025

Description

Split WakuNodeConfig object for better separation of concerns and to introduce a tree-like structure to configuration.

Notes:

  • decoupling does increase the codebase, but it seems ultimately necessary, keen to hear opinion of WakuNodeConf usage in c-bindings for examples
  • I am trying to NOT implement added improvements as I go along, because the PR is already huge. But I expect to do after that:
    • get all configs closer to their users (ie relay conf defined by relay protocol)
    • tidy up the sharding/autosharding conf logic (avoid 1024 default, make it clearer what content-topics do, etc)
    • some tidy up on types (usage of uint instead of int)
    • all the TODO introduced in this PR and my previous one feat: introduce preset option #3346
    • nest or use enum on RlnRelayConf so that it's clear what parameters are needed for dynamic or static modes
    • use --preset in nwaku-compose

Changes

  1. Use of Option to avoid dummy defaults. Also, nesting configurations such as RlnRelayConf so that parameters need to be set only if the parent service is enabled; later, it opens the option for each protocol to define their own configuration and then be importer by WakuConf
  2. WakuConf validation logic is better contained; also see (1) where each protocol defining their config parameters can define their own logic; currently, validation logic are scattered between node_factory.nim, waku.nim, etc
  3. the WakuConfBuilder becomes to stop gap shop to build a valid config. It is lenient/flexible (everything is an option). This will enable WakuNodeConf to be translated to a config.
  4. all the above mean that when creating a new libwaku based binary, a minimal CLI Args can be created, and WakuConfBuilder can be used; no need to use the giant WakuNodeConf anymore.
  5. Decoupling WakuNodeConf and WakuConf can also simplify things. for example discv5Only can be a WakuNodeConf only property, that translates to the right parameters on WakuConf (e.g. relay: false, store: false, etc)
  6. Tests! Tests being added to confirm config behaviour (more to be added)
  7. This also enables the creation of edge mode or service mode as per message API, as it will be easy to create an easy CLI/builder option for those, and convert it to the right WakuConf setup.
  8. WakuConfBuilder will provide sane defaults, so that it's easy to build binaries using nwaku (see how chat2.nim is cluttered with config irrelevant to chat). What does not have a default because dev has to choose (eg clusterId), should be provided by a preset/ClusterConf
  9. Removed some CLI args that were not used internally

How to test

  1. use wakunode2 with various CLI args and confirm expected behaviour.
  2. confirm nwaku-compose would work on TWN with this

@fryorcraken fryorcraken force-pushed the separate-node-config branch 3 times, most recently from 0d18e4f to b09f19b Compare April 8, 2025 05:49
wsExtAddress = some(
ip4TcpEndPoint(extIp.get(), wsBindPort.get(DefaultWsBindPort)) &
wsFlag(wssEnabled)
)
except CatchableError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we just catch that exception in one place in wider try-catch? Seems this function distracted with this exception handling.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll review that for a follow-up PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like an improvement unrelated to my changes so I'll keep it for a follow-up PR

Copy link
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is getting awesome! Thanks so much for it! 💯
I'm just adding some nitpick comments that I hope you find useful.
Also, not approving yet as I guess there will be more changes in the future.

Comment on lines 69 to 72
# applyPresetConfiguration(wakuNodeConf, confBuilder)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd avoid adding this comment

Suggested change
# applyPresetConfiguration(wakuNodeConf, confBuilder)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that's a WIP


## Then
let conf = res.get()
assert conf.validate().isOk()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? :)

Suggested change
assert conf.validate().isOk()
let resValidate = conf.validate()
assert resValidate.isOk(), $resValidate.error

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes definitely. Thanks, corrected.

) == clusterConf.discv5BootstrapNodes

if clusterConf.rlnRelay:
assert conf.rlnRelayConf.isSome
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick comment: in "verbs" statements the use of parenthesis is the tendency
Only in "nouns" we avoid parenthesis, such as in len

Suggested change
assert conf.rlnRelayConf.isSome
assert conf.rlnRelayConf.isSome()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to follow this :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do note that std/options doc do not use () so you may be at odd with the ecosystem: https://nim-lang.org/docs/options.html#isSome%2COption%5BT%5D

## Then
let conf = res.get()
assert conf.validate().isOk()
assert conf.clusterId == clusterConf.clusterId
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When comparing two values, I think is better to use check

Suggested change
assert conf.clusterId == clusterConf.clusterId
check conf.clusterId == clusterConf.clusterId

assert is useful when validating a Result and then we can give feedback from the result error.
In this particular case, something like that can also be done with assert:

assert conf.clusterId == clusterConf.clusterId, "failed comparing clusterId"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused, are you suggesting I use check or suggesting I add the comment to assert?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho check is better for such comparison as when it fails it will print the actual and the expected value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did that, thanks!

Comment on lines 142 to 145
dns4DomainName = conf.dns4DomainName.map(
proc(dn: DomainName): string =
dn.string
),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we also change the dns4DomainName init's parameter's type to DomainName?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not yet sure what is the best usage of the distinct types.

It's best if the sub-protocol define its own distinct types from its own config object. But I don't think I'll do that in this PR. I think I will want to tidy up and have the "one config object per protocol, defined in the protocol module, and imported + nested in WakuConf" approach done in a follow-up PR. When I do that, then we can review distinct type usatge.

Comment on lines 3 to 4
# TODO: this file should be called cluster_conf.nim

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of renaming the file I think is better to rename the type because we are used to that file/module name when looking for in on VScode

@@ -86,17 +86,24 @@ proc initNode(
else:
peerStore.get()

let (secureKey, secureCert) =
if conf.webSocketConf.isSome and conf.webSocketConf.get().secureConf.isSome:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if conf.webSocketConf.isSome and conf.webSocketConf.get().secureConf.isSome:
if conf.webSocketConf.isSome() and conf.webSocketConf.get().secureConf.isSome():

@@ -86,17 +86,24 @@ proc initNode(
else:
peerStore.get()

let (secureKey, secureCert) =
if conf.webSocketConf.isSome and conf.webSocketConf.get().secureConf.isSome:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if conf.webSocketConf.isSome and conf.webSocketConf.get().secureConf.isSome:
if conf.webSocketConf.isSome() and conf.webSocketConf.get().secureConf.isSome():

relayShardedPeerManagement*: bool
relayServiceRatio*: string

proc log*(conf: WakuConf) =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is much better to have a more especific proc name. Having too generic proc names makes it more difficult to lookup in VSCode.

Suggested change
proc log*(conf: WakuConf) =
proc logWakuConf*(conf: WakuConf) =

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is much better to have a more especific proc name. Having too generic proc names makes it more difficult to lookup in VSCode.

that sucks (lookup in VSCode).

I am a big proponent of "don't repeat context" but I see your point. I'll think on how I can cope with it :)

@fryorcraken fryorcraken force-pushed the separate-node-config branch from b09f19b to 63f0f20 Compare April 10, 2025 11:13
@fryorcraken
Copy link
Collaborator Author

One note on defaults.

I think that the builder should apply agnostic defaults, that work for both wakunode2, libwaku in a desktop app, libwaku in a mobile app, etc.
And that then, the CLI WakuNodeConf can apply/override defaults that are good for wakunode2, probably as a service node.
This raise the whole problem of wakunode2 being an everything binary. My intent is to split wakunode2 in more specialised binaries (I will leave wakunode2 as it is but I hope we can use my new binaries and deprecate it over time).

The first point means that most likely, the config builders should be near their config, in the module of the logic (eg rln relay), because this module is best to know/define good agnostic defaults. And then force setting the parameters that don't have a good generalized/agnostic default (ie, they can change widely from an edge node to a service node).
I'll do that in a follow-up PR.

cc @gabrielmer

Finally, regarding wakunode2 and binaries. It is interesting to have a "everyting" binary for simulation and research, so one can tweak parameters etc.

But I wonder if a better approach would be either:

  1. A new protocol is likely to be contextual. Eg store sync is for store nodes. hence using a more specialized service-node is adapted
  2. Or, encouraging research team to just create a new binary to PoC new protocols.

For (2) to happen, messaging API, and the work I am doing here, should useful so it's super easy to create a new binary, without having to copy-past WakuNodeConf etc, and just have the parameters you need.

WDYT @jm-clius @Ivansete-status @AlbertoSoutullo

Copy link

github-actions bot commented Apr 11, 2025

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:3357

Built from 152e1f4

@fryorcraken fryorcraken force-pushed the separate-node-config branch from 8c587c9 to a3addba Compare April 11, 2025 11:35
rlnRelayTreePath* {.
desc: "Path to the RLN merkle tree sled db (https://github.com/spacejam/sled)",
defaultValue: "",
name: "rln-relay-tree-path"
.}: string

rlnRelayBandwidthThreshold* {.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the one deleted were actually not used

@jm-clius
Copy link
Contributor

For (2) to happen, messaging API, and the work I am doing here, should useful so it's super easy to create a new binary, without having to copy-past WakuNodeConf etc, and just have the parameters you need.

For research dev testing, it could be workable - presumably a researcher developing e.g. a new protocol for lightpush incentivisation, would then have to create new binaries based on the existing edge and service specialised binaries? Regression testing could be more difficult, as we will constrain the use case to what is set up for that specific binary.

What would the impact be on QA/DST - they might want to start testing new protocols in parallel (as with Waku Sync). Would they now need to run simulation frameworks for each of n binaries? Also, what are your thoughts on heterogenous use cases? I know for now we often have clear service nodes and clear edge nodes, but I think that we'd see nodes act as both service and edge nodes in any combination or gradient if Waku is to work (e.g. a Store service node for a specific application would act as filter client on that application's content topics). The advantage of an everything binary is that we can see adaptive nodes in practice.

@AlbertoSoutullo
Copy link

AlbertoSoutullo commented Apr 11, 2025

I might have a huge lack of context here, but I can share the PoV from DST.

Nwaku CLI arguments is something that I talked with @Ivansete-status sometimes already.
I have worked in the past with binaries that could easily have more than 150 arguments. Having a lot of arguments is of course something that scares at the beginning, but I can say that there are 3 major things that decreases that difficulty substantially:

  1. Modular configurations. Basically, arguments that depend on other ones. I imagine this like a tree, where the bottom ones are "refinements", and the higher you go, the most important ones you find. This allows you to work with only the most important ones, unless you want to have something very specific, then you can add more arguments to reach the behaviour you want.
  2. Following closely to point 1, the default values are also something extremely important. In this case I think nwaku does a good work, because follows the conservative approach as in not adding functionalities.
  3. Hidden behaviours / related arguments. The perfect example here is --max-connections.
    Description in the Waku documentation: Maximum allowed number of libp2p connections.
    Questions that I might have when I read this: Does this mean full-peers? full-peers + metadata connections? Is it per stream?
    Then, what impact do we have with --max-connections?:
  • Are max-relay-peers affected? Can I have more relay peers than --max-connections? (It should be obvious that no, but there is no harm on clarifying it.). Also, I see that this parameter was deprecated on v0.35.0.
  • New parameter comes in play, relay-service-ratio (it does not appear in the documentation page yet). Very coupled to max-connections.

Another example that come to my mind is --cluster-id, --shard, --content-topic and --num-shards-in-network.
This one is actually crazy imo. It is very confusing and you have to dig a lot (or directly ask the developers) to understand what is going on here. I cannot even remember right now what the latest one could affect, and I have been using nwaku quite a bit.

--filter and --lightpush can also be confusing when you are using them for the first time. You tell the user that they can use waku as a service node, using "light protocols". Then what do I do? I go to my binary and set up filter to true, because I want my node to "use that protocol". The reality is literally the other way around, you want to use filternode to connect to another node that HAS to have filter flag enabled. So yeah, you are using "filter", but it is the node you are connecting to the one enabling that protocol. I am not saying this is correct or not, but I can confidently say this is confusing.

What do I want to say with all of this? I can see the point on (if I understood correctly) separate functionalities in different binaries, but I don't still know why, but this doesn't feel quire right to me. I guess it would depend on where you draw the line.
Do you want a binary for relay protocol and another one for filter? Does that mean that you want to run 2 binaries? If relay filter depends on relay, then you need to interconnect somehow your 2 local binaries.

To me, the current approach of nWaku looks OK, the problem that I see is how the information is delivered to the user, and the problems that there are in the argument themselves as in coupled behaviours.

Of course, again, I might not have the entire context here, and my opinion could change a bit depending on the information that I have. Because indeed this topic is quite complicated on its own, and then there is Waku itself, that it is not precisely an easy project.

Edit: Also, we have to take into account the importance of the previous points. Sadly, in Waku's case, we are not talking about minor optimization issues, like the node is a bit slower, consumes more RAM, or things like that.
Something as confusing as sharding and relay-service-ratio could literally imply the node to not connect properly in the network. Those are the ones (imo) that should be more carefully designed.

@fryorcraken fryorcraken force-pushed the separate-node-config branch from a3addba to aca1512 Compare April 12, 2025 00:53
Copy link

This PR may contain changes to configuration options of one of the apps.

If you are introducing a breaking change (i.e. the set of options in latest release would no longer be applicable) make sure the original option is preserved with a deprecation note for 2 following releases before it is actually removed.

Please also make sure the label release-notes is added to make sure any changes to the user interface are properly announced in changelog and release notes.

@fryorcraken
Copy link
Collaborator Author

  1. Modular configurations. Basically, arguments that depend on other ones. I imagine this like a tree, where the bottom ones are "refinements", and the higher you go, the most important ones you find. This allows you to work with only the most important ones, unless you want to have something very specific, then you can add more arguments to reach the behaviour you want.

This is something I am doing internally (see WakuConf), where indeed, it's a tree.

However, the tooling nim-confutils stops us to do something like that. We would need to do something different.
One option would be a new toml config file that matches the WakuConf tree structure and not use CLI args or nim-confutils.

  1. Hidden behaviours / related arguments. The perfect example here is --max-connections.

Yes, I have also thought the same , filter has a maxPeersToServe, but then there is also a maxConnections. I am guessing they all act as threshold? meaning having maxPeersToServe > maxConnections is equivalent to not setting the maxPeersToServer? Maybe @NagyZoltanPeter can highlight his thought on the matter and the ideal configuration for this

Another example that come to my mind is --cluster-id, --shard, --content-topic and --num-shards-in-network.
This one is actually crazy imo. It is very confusing and you have to dig a lot (or directly ask the developers) to understand what is going on here. I cannot even remember right now what the latest one could affect, and I have been using nwaku quite a bit.

Yep, I totally agree. I first fixed that by adding a --preset option, and then I am fixing the rest with a new draft PR (literally doing it now).
Again, the problem is the flat config. The (internal) tree structure fixes that.

--filter and --lightpush can also be confusing when you are using them for the first time.

Absolutely, in this PR, I am chaing the name to FilterService, etc to help.

What do I want to say with all of this? I can see the point on (if I understood correctly) separate functionalities in different binaries, but I don't still know why, but this doesn't feel quire right to me. I guess it would depend on where you draw the line.
Do you want a binary for relay protocol and another one for filter? Does that mean that you want to run 2 binaries? If relay filter depends on relay, then you need to interconnect somehow your 2 local binaries.

and

Also, what are your thoughts on heterogenous use cases?

My opinion is that the adaptive concept as a full gradient creates both engineer overload and UX/DevEx issues. @AlbertoSoutullo's witness statement reinforcing the case.

While Waku nodes are adaptive, I believe we should provide discreet configurations that we expect 80% (80/20 rule) of users and developers to use.
If developers/users want something outside these discreet configurations, we can always review, or they can always build their own using the library. The idea is to make those configuration easy to deploy (and leave the harder stuff for research or expert users).

I believe this actually matters a lot for Vac-DST. Because at the end of the day, we want simulation to match what users have in their hands. Namely, either a js-waku based web app, or a status app.

My intent is to provide several modes:

  • edge
  • relay
  • service
  • bootstrap

These should end up defined in the Messaging API. These are the mode that must be used by Status app, hence libwaku/c-binding should use those modes instead of enabling individual protocols.

For example relay is the relay mode as of today in Status Desktop (really happy if you have a better name):

  • uses relay to receive and send messages
  • uses p2p reliability protocol to send and receive messages
  • provide filter, light push and peer exchange services
  • uses rendezvous, discv5 and peer exchange for discovery

Now, when running simulations, this is what DST should be using. Because running a bandwidth simulation with only relay, is not really meaning full as Status Desktop also provide those services. So if we want to tell Status users that their bandwidth will average to X, then it needs to be done with filter and light push service enabled (as they are enabled in desktop)

Now, wakunode2 could remain for very specific simulations, such as "relay only", etc. But even tho, it should be possible to take the relay mode and disable discv5 and re-run a simulation. So that the default/baseline parameters are those of relay mode, but we tweak things to test simulations.

WDYT?

--

For research dev testing, it could be workable - presumably a researcher developing e.g. a new protocol for lightpush incentivisation, would then have to create new binaries based on the existing edge and service specialised binaries? Regression testing could be more difficult, as we will constrain the use case to what is set up for that specific binary.

Looking at chat2.nim, I can see that currently this would be a big ask. But the intent is that a researcher should be able to load a preset and a mode, and then just configured whatever they want for their new protocols.

I am not too sure what do you mean regarding regression testing. The researcher's code should still be in nwaku codebase. It just changing the way it's consumed:

  • old: add a config on wakunode2 and use wakunode2
  • new: add a config to WakuConf and build one or two new binaries

So I'd expect the tests to still run as expected.

@fryorcraken fryorcraken changed the title chore: split internal and CLI Waku Configs chore!: split internal and CLI Waku Configs Apr 12, 2025
@fryorcraken fryorcraken changed the title chore!: split internal and CLI Waku Configs chore!: separate internal and CLI configurations Apr 12, 2025
@fryorcraken fryorcraken force-pushed the separate-node-config branch 4 times, most recently from ac62667 to e051473 Compare April 13, 2025 14:48
Split `WakuNodeConfig` object for better separation of concerns and to introduce a tree-like structure to configuration.
@fryorcraken fryorcraken force-pushed the separate-node-config branch from e051473 to 06444ad Compare April 13, 2025 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants