Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Qbft new messages #111

Open
wants to merge 26 commits into
base: unstable
Choose a base branch
from
Open

Conversation

Zacholme7
Copy link
Member

@Zacholme7 Zacholme7 commented Jan 21, 2025

Issue Addressed

This PR attempts to align the QBFT logic closer to the specs.

Proposed Changes

There are a few changes

  • Introduction of a MessageContainer for storing all of the messages processed and received.
  • Move the types from the networking crate into ssv_types so that they can be used in both qbft and the network. Going to be a separate PR to remove these types from the networking crate and redirect logic to ssv_types
  • Adjust the QBFT manager to work with WrappedQbftMessage. This is analogous to ProcessingMessage in the spec. The manger will send WrappedQbftMessages to the qbft instance and the instance will send out UnsignedSSVMessage.
  • Remove EthSpec from the manager
  • Rework the types to work with serialized versions of EthSpec dependent types. Ex: DataSsz
  • Implement SSZ serialization and de-serialization for qbft data types. A qbft instance can come to consensus on any type that implements Data + Encode. This may have to be futher restricted to include Decode. This ensures that we are able to work in byte representation, like the spec.
  • Add message validation and other misc features to qbft

@dknopik
Copy link
Member

dknopik commented Jan 22, 2025

edit: sorry, toned down this comment a bit.

I am not happy with the approach to decide over Vec<u8> instead of a generic type.

  • Where will data validation occur? Do we need to deserialize for it?
  • I disagree that this is "closer to the spec". Effectively, the use of byte[] in their spec means "the SSZ representation of one of certain possible types" - the spec just does not explicitly express this though types, but through comments and through the values stored in the field.
  • Direct disadvantages include additional Serialization/Deserialization overhead, easier programming mistakes through type confusion (i.e. which type of data do we have in this Vec<u8>) and harder to read code.

@Zacholme7
Copy link
Member Author

Hey @dknopik those are valid points. Ill walk through my thought process and we can go from there

  1. I have been thinking of qbft as more of a data-agnostic process. It just cares about reaching consensus but does not inherently care about what that value is. It just needs to be able to identify if two pieces of data are the same (via the hash) without understanding their meaning. Under this assumption, data validation and de-serialization will happen outside of the qbft process. Right before we decide an instance, we ssz serialize the data and run a new instance with it. When this instance is complete, we retrieve the agreed upon value which gets deserialized back into the original type and perform the relevant validation to ensure it is our original value. This is the boundary where the specific type is being used.
  2. Yep I get that. By closer to the spec I was referring to more of the logic such as message structure validation, message containers, etc. While type safety is importnat, we maintain it where it matters which is at the boundaries of the qbft system where we are actually interpreting the data.
  3. Im a bit confused at this point. Im not sure why there is additional serialization/de-serialization I think we have to work out a explicit process of where we are going to be serializing/de-serializing so we are all on the same page. Here is an example of my thought process to make it concrete. Say we have a BeaconVote we need to agree on. We serialize this to Vec<u8> and then start a QBFT instance with this. When the qbft instance is complete, it will give us back a Completed<Vec<u8>> which we know is a BeaconVote, so we can de-serialize it back and perform any validation needed. Futhermore, since network messages must be serialized bytes anyway, working with Vec<u8> during consensus matches how the data exists during tranport. With generic types, we would need to deserialize on receipt and then re-serialize for the next network send, even when the qbft instance itself never needs to inspect the actual data.

Hopefully this makes sense, would like to hear your thoughts.

@AgeManning
Copy link
Member

AgeManning commented Jan 23, 2025

Disclaimer: I haven't gone through this PR, but have some familiarity with the QBFT code.

I agree that the QBFT process should be data agnostic, which is why I think its a good candidate for a generic. The generic D, allows us to come to consensus on anything.

In the case that we get something from the network that is SSZ serialized, i.e, raw bytes and we want to come to consensus on that without any kind of checks or validation, we could run a QBFT Consensus with D: being Vec<u8>. I think a more likely scenario will be that we have to decode the ssz (to make sure its not just junk), then likely do some validation on the resulting object. Since we have that object, i.e BeaconVote, we then run a QBFT instance with D being BeaconVote or some derived struct (because we might need to group validators and strip out public keys etc). Using straight Vec<u8> bypasses the type safety and would be nice to avoid for all the reasons @dknopik highlighted.

I'm not sure about the difficulty of keeping the D generic, which is maybe why @Zacholme7 wants to move from it.

The way I would imagine a message going in and out of anchor to look like is something like:
Message -> Network Stack (Gets decoded into a specific consensus type) -> Process/Manager/Validation -> QBFT Instance of that Type -> Network (Gets encoded and sent back). We can carry around the SSZ encoded version if we want to avoid the re-encoding and in fact I think we need to keep QBFT "payloads" of other operators around for round-changes anyway. But I think the SSZ encoding of a small struct takes very little overhead and might not be worth optimizing at this stage.

In terms of the types and where they live, I think it depends on the dependency graph of our crates.

For example in lighthouse, we have a network crate which depends on lighthouse_network (think anchor network) which depends on consensus types.
The consensus types is the lowest dependency in most crates, so every crate can import it. The next lowest crate is lighthouse_network. The lighthouse_network crate can therefore import consensus types. Then all other crates, including higher-level crates, now can import any consensus types and any network types.
If we were to follow this model for Anchor, i'd imagine a similar dependency tree, where the network types are separate from consensus types, but both live at the base of the dependency tree. Then QBFT could implement some of its own types, import network types and also consensus/anchor types.

Perhaps we can't do this for some reason, but this model has served us well in Lighthouse, so thought I'd mention it.

@Zacholme7
Copy link
Member Author

Zacholme7 commented Jan 23, 2025

Moving to generic D just within qbt instead of Vec<u8> is not very difficult. The difficult part is determining how generic we want to make this entire process. The entire process is very intertwined with SSV specific types. For example, operator signatures are used, verified, and included with messages which is ssv specific.

To make this fully generic and importable, we would have to work on defining some internal, ssv independent spec/type system that can map onto the SSV types. This is possible, but it does require work to get it to a place in which is is both fully generic and a perfect interop with go-ssv.

Talked about this a little bit with kingy too but even if we made this fully generic and importable it does not have much use without surrounding infrastructure in place like networking and at that point someone could just write their own.

with D: being Vec<u8>. I think a more likely scenario will be that we have to decode the ssz (to make sure its not just junk)

D cant be Vec<u8> which is why I opted to just go straight to Vec<u8>. D implements Data + Encode so we would just have to put this in a wrapper and impl the trait for the wrapper. We can definitely decode it to make sure that it is not junk by further restricting to Decode.

I think the best approach would be to keep some consensus specific generic ConsensusData and make sure this defines a hash() and a validate(). We can then utilize the generic as D: ConsensusData + Encode + Decode and have access to hashing, validations, encoding/decoding.

@AgeManning
Copy link
Member

Oh yeah. I think you're talking about maintaining the QBFT crate for others in the space. This was the original intent, but when we were building it, we decided to take that as far as reasonably possible. As soon as it got too hard ditch it.
Sounds like it's too hard, and i'm all for ditching total generics and make it just SSV specific.

Assuming its just for Anchor and SSV, I think D is useful.

D can't be vec<u8>

There must be a reason why D requires the trait bounds (I think i put them in there, but forgot why :p). If you go straight to Vec<u8> you must remove these trait bounds in the QBFT instance. If you can just remove these bounds, you should be able to remove them for D also. I might be missing something here, but I'm assuming anything you can do with a Vec<u8> you can also do with a generic D.

@AgeManning
Copy link
Member

Sorry my responses are vague. I'm in meetings all day. I will go over this code when I get a chance hopefully give some more helpful/informed responses :).

@dknopik
Copy link
Member

dknopik commented Jan 23, 2025

@Zacholme7 thanks for your detailed thought process in your response!

I think the best approach would be to keep some consensus specific generic ConsensusData and make sure this defines a hash() and a validate(). We can then utilize the generic as D: ConsensusData + Encode + Decode and have access to hashing, validations, encoding/decoding.

This sounds very good to me, and is similar to what I had in mind when I introduced the first version of the Data trait to the crate. What do you think about making Encode and Decode supertraits of ConsensusData? This way we reduce noise a bit in the rest of the crate.

@Zacholme7 Zacholme7 marked this pull request as ready for review January 23, 2025 22:15
@Zacholme7 Zacholme7 added ready-for-review This PR is ready to be reviewed QBFT labels Jan 23, 2025
@jking-aus
Copy link
Contributor

I like this implementation and agree we only take the make generic approach as far as practical. Defining a generic with the required traits is plenty imo. Gimmie the weekend to look at this but from first read I think we're good to go on the approach. Thanks for the good discourse fellas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
QBFT ready-for-review This PR is ready to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants