feat: peerdas - ensure there are at least n peers per sampling column subnet #7274

Open
wants to merge 11 commits into
base: peerDAS

Conversation

twoeths
Contributor

@twoeths twoeths commented Dec 3, 2024

Motivation

  • Lodestar follows the subnet sampling strategy: for each block, it requires at least 8 data columns from its sampling subnets
  • sampling subnet peers are critical for PeerDAS, but for now we don't have a mechanism to maintain at least n peers per subnet
  • in the worst case, our beacon node may lose all peers on a column subnet and fall out of sync, and we have no way to deal with that for now. If we have too few peers, we may not process blocks in time to maintain good attestation performance
  • this PR makes sure we have at least n peers per column sampling subnet

Description

  • store custodySubnets in peersData so that we don't have to recompute it every time
  • use our subnet request mechanism for column sampling subnets
    • if a peer does not have any of the column subnets we need, do not dial it
    • on average, a column subnet has 3.125 peers. I set the default targetColumnSubnetPeers = 6, which is easy to reach after a few heartbeats; it can be changed with a newly added CLI flag of the same name
    • this mechanism replaces the onlyConnect* flags
  • more metrics, which answer some questions:
    • which column subnets are we sampling, and how many peers does each have?
    • how many column subnets were requested, and how many peers were requested?
    • why was a peer not dialed?
    • what is the csc value of each ENR?
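
As a rough illustration of the points above (not the PR's actual code): the 3.125 figure can be reproduced by assuming 100 connected peers that each custody 4 of 128 column subnets, and the request mechanism can be pictured as a map from each under-served sampling subnet to the number of peers still missing. The function names and the three parameter values are assumptions made for this sketch.

```typescript
// Sketch only: names and parameter values are assumptions, not Lodestar's API.
type ColumnSubnetId = number;

/** Expected peers per column subnet if custody subnets are spread uniformly. */
function averagePeersPerSubnet(
  peerCount: number,
  custodySubnetsPerPeer: number,
  subnetCount: number
): number {
  return (peerCount * custodySubnetsPerPeer) / subnetCount;
}

/** For each sampling subnet below target, how many more peers to look for. */
function computeColumnSubnetQueries(
  samplingSubnets: ColumnSubnetId[],
  connectedPeerSubnets: ColumnSubnetId[][],
  targetPeersPerSubnet: number
): Map<ColumnSubnetId, number> {
  // Count connected peers per subnet, counting each peer once per subnet.
  const peersPerSubnet = new Map<ColumnSubnetId, number>();
  for (const subnets of connectedPeerSubnets) {
    for (const subnet of Array.from(new Set(subnets))) {
      peersPerSubnet.set(subnet, (peersPerSubnet.get(subnet) ?? 0) + 1);
    }
  }

  const queries = new Map<ColumnSubnetId, number>();
  for (const subnet of samplingSubnets) {
    const missing = targetPeersPerSubnet - (peersPerSubnet.get(subnet) ?? 0);
    if (missing > 0) queries.set(subnet, missing);
  }
  return queries;
}

/** Dial a candidate only if it custodies at least one subnet we still need. */
function shouldDial(
  candidateSubnets: ColumnSubnetId[],
  queries: Map<ColumnSubnetId, number>
): boolean {
  return candidateSubnets.some((subnet) => queries.has(subnet));
}
```

With an average of about 3 peers per subnet under these assumptions, a target of 6 means most sampling subnets start out with a deficit, which is why candidates that cover none of the needed subnets are skipped.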

@twoeths twoeths changed the title from "Te/peerdas prioritize peers" to "feat: use subnet request mechanism for subnet sampling strategy" Dec 3, 2024
@twoeths twoeths marked this pull request as ready for review December 10, 2024 10:04
@twoeths twoeths requested a review from a team as a code owner December 10, 2024 10:04
@matthewkeil
Member

review note: look at packages/beacon-node/src/sync/range/utils/peerBalancer.ts and talk to @twoeths about sync strategy and how this PR affects it (or whether it should, if it doesn't)

type CachedENR = {
peerId: PeerId;
multiaddrTCP: Multiaddr;
subnets: Record<SubnetType, boolean[]>;
addedUnixMs: number;
custodySubnetCount: number;
peerCustodySubnets: number[];
Contributor

@g11tech g11tech Jan 12, 2025

this isn't in the spec, right? If not, are you proposing a spec change in line with this PR?

Contributor Author

this is implementation-specific and not part of the spec
here we compute peerCustodySubnets once and store it for later reuse

if (oldMetadata === null || oldMetadata.csc !== peerData.metadata.csc) {
  void this.requestStatus(peer, this.statusCache.get());
}
// TODO: why request status again?
Contributor

should request metadata if the metadata is old or nonexistent

Contributor Author

we need to request status again in order to update the network's data columns for this peer; will add that to the comment

@twoeths twoeths changed the title from "feat: use subnet request mechanism for subnet sampling strategy" to "feat: peerdas - use subnet request mechanism for subnet sampling strategy" Jan 13, 2025
@twoeths twoeths changed the title from "feat: peerdas - use subnet request mechanism for subnet sampling strategy" to "feat: peerdas - ensure there are at least n peers per column subnet" Jan 13, 2025
@twoeths twoeths changed the title from "feat: peerdas - ensure there are at least n peers per column subnet" to "feat: peerdas - ensure there are at least n peers per sampling column subnet" Jan 13, 2025
@twoeths
Contributor Author

twoeths commented Jan 13, 2025

as requested by @g11tech, I added back the onlyConnect* flags for debugging

Comment on lines +461 to 462
// TODO: could be optimized by directly using the previously calculated subnet
const dataColumns = getDataColumns(nodeId, peerCustodySubnetCount);
Member

just remove in favor of peerCustodySubnets

Contributor Author

peerCustodySubnets are the network subnets that carry data columns, typed as number[]
here we compute the data columns themselves, which are a different thing even though the type is also number[]
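
To make the distinction concrete, here is a minimal sketch assuming a column belongs to subnet `columnIndex % subnetCount`. The helper name `columnsForSubnets` and the value of `NUMBER_OF_COLUMNS` are assumptions for illustration and are not taken from this PR.

```typescript
// Assumed parameter for illustration; check the active config before relying on it.
const NUMBER_OF_COLUMNS = 128;

/**
 * Expand column subnet ids into the data column indices they carry,
 * assuming column c is published on subnet c % subnetCount.
 */
function columnsForSubnets(subnetIds: number[], subnetCount: number): number[] {
  const columnsPerSubnet = Math.floor(NUMBER_OF_COLUMNS / subnetCount);
  const columns: number[] = [];
  for (let i = 0; i < columnsPerSubnet; i++) {
    for (const subnetId of subnetIds) {
      columns.push(subnetCount * i + subnetId);
    }
  }
  return columns.sort((a, b) => a - b);
}
```

When the subnet count equals the column count, the two arrays hold the same numbers, which makes them easy to confuse; with fewer subnets than columns, each subnet id expands to several column indices.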

/**
 * A map of column subnet id to maxPeersToDiscover
 */
type ColumnSubnetQueries = Map<ColumnSubnetId, number>;
Member

this is defined in several places; maybe pick one place?

@@ -332,15 +332,21 @@ export class PeerManager {
});
if (peerData) {
const oldMetadata = peerData.metadata;
const csc = (metadata as Partial<peerdas.Metadata>).csc ?? this.config.CUSTODY_REQUIREMENT;
const nodeId = peerData?.nodeId ?? computeNodeId(peer);
const custodySubnets = (csc !== oldMetadata?.csc || oldMetadata?.custodySubnets == null)? getDataColumnSubnets(nodeId, csc) : oldMetadata?.custodySubnets;
Member

would oldMetadata?.custodySubnets ever be null?
Maybe clearer to do:

const custodySubnets = oldMetadata == null || csc !== oldMetadata.csc
  ? getDataColumnSubnets(nodeId, csc)
  : oldMetadata.custodySubnets;
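
A self-contained sketch of that suggestion, assuming custodySubnets is always populated whenever metadata has been stored. The `CachedMetadata` shape, `resolveCustodySubnets`, and the injected `computeSubnets` callback are hypothetical stand-ins, not the types or helpers used in this PR.

```typescript
// Hypothetical shapes for illustration; not Lodestar's actual types.
type CachedMetadata = {csc: number; custodySubnets: number[]};

/**
 * Reuse the cached custody subnets unless there is no cached metadata
 * or the peer's csc has changed since they were computed.
 */
function resolveCustodySubnets(
  oldMetadata: CachedMetadata | null,
  csc: number,
  computeSubnets: (csc: number) => number[]
): number[] {
  return oldMetadata == null || csc !== oldMetadata.csc
    ? computeSubnets(csc)
    : oldMetadata.custodySubnets;
}
```

If stored metadata could ever lack custodySubnets, the extra null check from the original line would still be needed.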

4 participants