Poldercast Vicinity Layer Not Returning "Nearest" Neighbours #15

michaeljfazio · 2020-02-06T16:12:37Z

An interesting pattern has shown up with nodes allocating their own IDs. Many are choosing to use IDs prefixed with a large number of 0 characters.

Here is a lexicographically ordered dump of peers gossiped on the network from yesterday.

After running a node with a similar number of 0 prefixed characters, one observes an extremely large increase in the number of block announcements received.

The reason is that the Vicinity layer is not taking the "nearest nodes" as described in the PolderCast Paper. Rather it is ordering them, according to the proximity function, but then simply taking the top N entries:

poldercast/src/poldercast/vicinity.rs

Lines 85 to 90 in f882186

    
           profiles 
        
               .into_iter() 
        
               .take(max) 
        
               .map(|v| v.id()) 
        
               .copied() 
        
               .collect()

Nearest neighbours should be taken as a window around the nodes own ID after having been sorted by the proximity function.

The text was updated successfully, but these errors were encountered:

NicolasDP · 2020-02-06T16:29:35Z

Yes and this is why the ID needs to be removed. This was done to facilitate management of the network protocols utilising poldercast. This was wrong. There should not be any ID propagated on the network.

michaeljfazio · 2020-02-06T16:33:05Z

Yes and this is why the ID needs to be removed. This was done to facilitate management of the network protocols utilising poldercast. This was wrong. There should not be any ID propagated on the network.

The IDs are not the problem though? In Jormungandr we can just generate the IDs by hashing IP addresses. The problem here is where the layer fetches the nodes from the sorted list.

NicolasDP · 2020-02-06T16:34:36Z

The list is sorted by the proximity function which is based on topic preferences, not on IDs.which is exactly what vicinity is about.

michaeljfazio · 2020-02-06T16:36:17Z

But the available nodes are stored in a BTreeSet keyed by ID. The layer fetches all the nodes from this set and then applies the proximity sort order. Given that all pools subscribe to the same topic with the same priority, the order is basically guaranteed to be lexicographically ordered by ID.

Then the layer just takes the top 20. Which are always the same nodes.

NicolasDP · 2020-02-06T16:52:31Z

It is a Vec not a BTreeMap:

poldercast/src/poldercast/vicinity.rs

Line 76 in f882186

mut profiles: Vec<&Node>,

EDIT

okay I answered too quickly indeed that all node selected are ordered by ID, the introduction of the new unstable sort you have added in your last PR #9 does not guarantee the original order is kept, so this is not guaranteed to be true.

The the proximity function will still reorder things by topic/preferences which is what we expect from the paper. Now we can still improve things with not allowing the users to set their own ID is necessary, this was an unfortunate mistake and allows a user to trick the game.

However we are still doing what we were supposed to do, and this is one of the limitation of poldercast in the context of few topics:

candidates subscribed to topics annotated with higher priority by the target node are ranked closer compared to candidates of lower priority topics. Among candidate nodes that rank equally in terms of topic priorities, proximity is determined by the number of topics shared with the target node: the more shared topics, the closer their ranking.

Also the vicinity is not supposed to change that often. This is the role of Cylon to provide random nodes. Vicinity is well defined as to be arbitrarily deterministic. Which is what is being done below:

poldercast/src/topic.rs

Lines 75 to 91 in f882186

    
               pub fn proximity_to(&self, other: &Self) -> Proximity { 
        
                   let mut priority_score = 0; 
        
                   let mut proximity_score = 0; 
        
                   for subscription in self.iter() { 
        
                       if let Some(other_subscription) = other.0.get(subscription) { 
        
                           proximity_score += 1; 
        
                           priority_score += subscription 
        
                               .interest() 
        
                               .priority_score(other_subscription.interest()); 
        
                       } 
        
                   } 
        
                   Proximity { 
        
                       proximity: ProximityScore(proximity_score), 
        
                       priority: PriorityScore(priority_score), 
        
                   } 
        
               } 
        
           }

proximity_score: gives us the number of common subscriptions;
priority_score: gives us the score by priority level on the topic.

BUT I see where you are coming from and in the context of being used in Jörmungandr there is only a few topics to use (blocks and block contents), so yes your analysis is correct in this context, nodes are likely to be the same for everyone and this is likely going to make propagations of events less performant. So we could do something to improve the proximity function a little bit. We could do a couple of things to improve things:

remove the node id from the gossips and have it entirely created locally;
leverage the node's logs and prioritise nodes least recently used for a same ProximityScore, we could use Logs:: last_use_of(_: Topic) so at equal ProximityScore we still get the least recently used one to be prioritised.

function available here:

poldercast/src/logs.rs

Lines 29 to 31 in f882186

    
           pub fn last_use_of(&self, topic: Topic) -> Option<&SystemTime> { 
        
               self.last_use_of.get(&topic) 
        
           }

and we would need to update:

poldercast/src/node.rs

Lines 120 to 122 in f882186

    
               pub fn proximity(&self, other: &Self) -> Proximity { 
        
                   self.subscriptions.proximity_to(&other.subscriptions) 
        
               }

This will make us deviate a bit more from the original paper, but would improve the vicinity selection by adding a bit more randomness.

michaeljfazio · 2020-02-06T16:55:51Z

That vector originates from this BTreeSet:

poldercast/src/nodes.rs

Line 10 in f882186

available: BTreeSet<Id>,

Which is returned by this method:

poldercast/src/nodes.rs

Line 62 in f882186

pub fn all_available_nodes(&self) -> Vec<&Node> {

That method collects the set in sorted order (by key value, i.e. ID) and transforms it to a vector. Which is in the same order.

NicolasDP · 2020-02-06T16:56:10Z

I will edit my comment accordingly.

michaeljfazio · 2020-02-06T16:58:12Z

I will edit my comment accordingly.

Does that mean the analysis is correct?

NicolasDP · 2020-02-06T17:22:07Z

Does that mean the analysis is correct?

yes, in the context of Jörmungandr you are right. It is likely the nodes selected by vicinity will be the same. I have updated my comment. Thanks a lot for spotting this issue.

michaeljfazio · 2020-02-06T17:26:37Z

Go team Jormungandr! 🎉

NicolasDP · 2020-02-06T17:33:30Z

However do you agree the title of the issue is no longer correct?. The proximity function is correct as per the paper. We are actually returning the nearest ones. Just that in the case of IOHK’s product this is not enough as it risks to return always the same kind of nodes due to the ID and the few numbers of topic.

I am thinking that having an Id that is randomly generated would be better instead of a hash. I don’t recall the paper being explicit about using the IP address. Also in the case of jormungandr there are nodes that don’t have IP addresses.

michaeljfazio · 2020-02-06T17:48:53Z

However do you agree the title of the issue is no longer correct?. The proximity function is correct as per the paper. We are actually returning the nearest ones. Just that in the case of IOHK’s product this is not enough as it risks to return always the same kind of nodes due to the ID and the few numbers of topic.

No not really. The sorting function I believe to be correct. But the last part (fetching the actual neighbours of the node) is not correct.

You have 10 of the letters in the alphabet and you are trying to find the closest neighbours to M. So you sort the letters alphabetically. Then search around M to find JKL[M]NOP. At the moment, ABCDEFG (or whatever letters you have in your bucket) are always returned. This clearly cannot be the nearest neighbour as it lacks an "anchor" point (e.g. M).

I am thinking that having an Id that is randomly generated would be better instead of a hash. I don’t recall the paper being explicit about using the IP address. Also in the case of jormungandr there are nodes that don’t have IP addresses.

Personally I'd just take the ip (and maybe the port), hash it with BLAKE and then take the first 24 characters from that. That way, any node can always validate the IP address and port against the hash.

I guess one problem however is, some nodes (i.e. wallet nodes) don't necessarily know their IP address! So, they need to be random.... Thinking....

michaeljfazio · 2020-02-06T18:26:56Z

okay I answered too quickly indeed that all node selected are ordered by ID, the introduction of the new unstable sort you have added in your last PR #9 does not guarantee the original order is kept, so this is not guaranteed to be true.

Node IDs are unique. So unstable sort is not a problem as there will never be two nodes in the set of nodes that have equality (only because indirectly the nodes are first sorted by ID, then by proximity, thus pool 000000000000... with topics/priorities (BLOCKS:HIGH + MESSAGES:HIGH) always at the top).

BUT I see where you are coming from and in the context of being used in Jörmungandr there is only a few topics to use (blocks and block contents), so yes your analysis is correct in this context, nodes are likely to be the same for everyone and this is likely going to make propagations of events less performant.

I think you nailed it there. Regardless of the implementation, i cant really see much point in the business of topics, or priority for that matter. Not unless there is a long game here where it should become important? As you say, most nodes (certainly all stake pools), are interested in the same topics (blocks and fragments) and at the same level of priority (high).

Also the vicinity is not supposed to change that often. This is the role of Cylon to provide random nodes.

Agreed. Also why #16 is relevant to this issue.

remove the node id from the gossips and have it entirely created locally;

Good start. At some point in the future will have to discuss the security of this scheme still. As its a trustless p2p system I can think of a handful of issues. For example, I can just generate the same ID as another node and enter the network, thus disrupting that node.

leverage the node's logs and prioritise nodes least recently used for a same ProximityScore, we could use Logs:: last_use_of(_: Topic) so at equal ProximityScore we still get the least recently used one to be prioritised.

I think this might also be a good step in the right direction. But there is this from the PolderCast paper, which I quite like:

With respect to ridding the system from outdated links, PolderCast employs a proactive mechanism for removing dead neighbors from node views.
Whenever a node p gossips with a neighbor q, it temporarily removes q from the
respective module’s view, anticipating that q will respond and will be inserted
anew in p’s view. This way, dead neighbors are silently discarded, while alive
ones are refreshed. To prevent dead neighbors from remaining indefinitely in a
view, a node always selects to gossip with its least recently refreshed neighbor.

I'd like to think we could move in that direction long term as well.

michaeljfazio · 2020-02-06T18:44:21Z

Reading my last comment back to myself, I'm realising that this issue is also partially resolved if #16 is addressed. Because, if vicinity is populated by the smaller random subset managed by the Cyclon layer, then indeed sorting as-is will be perfectly fine because you are always sorting on a smaller random sample set - and within that subset, all you care about is subscription+priority sort order. I see where you are coming from now. Tell me if you agree with the following:

Each layer must maintain its own view / set of views.
Cyclon randomly samples the list of known peers at a deterministic cadence (say every 10 seconds). It updates its view, which still needs to have an upper bound.
Vicinity, also operating asynchronously, is periodically sampling Cyclons layer at some interval. It sorts Cyclons view in subscription+priority order (note sorting by ID is no longer relevant because its always a subset of Cyclons layer which is a subset of the entire network). Then we take the top N number of nodes for each topic. Where N should be equal to the number of nodes that we wish to maintain a relationship with per topic.
Rings layer, then takes Vicinity layers output as its input. It maintains the predecessor/successor relationship (based on the node IDs sort order!).

The key thing here is this sentence from the PolderCast paper:

The Vicinity module is responsible for feeding the Rings module with a few neighbors for each topic t ∈ Tp, of arbitrary ids

I guess Vicinity could maintain a bounded binary heap sorted by priority for each topic?

Also... In the meantime while this and #16 is addressed - we could just disable the vicinity layer which will remove all gaming of the system and should be significantly better than what is currently going on (because of all that traffic being sent to the top N of all stake nodes ordered lexicographically by node ID).

michaeljfazio · 2020-02-07T10:48:41Z

Another relevant piece of information that isn't currently implemented correctly:

A node’s profile contains (i) its IP address and port number, (ii) its (unique) node id, and (iii) the ids of topics the node is subscribed to, each annotated with a priority that node assigns to finding neighbors of that topic. The priority of a topic is determined by the number of neighbors it has in the Rings module: topics with fewer Rings neighbors are assigned higher priority.

Current implementation assumes the node specifies both topic and priority. But in-fact priority is a calculated and dynamic value used to ensure that overlay rings that are less well established have a higher priority of being processed.

NicolasDP · 2020-02-07T19:34:17Z

I have made a new release 0.12 that prevents the gaming as you are describing. I have talked with the dev team at IOHK to see what is the highest priority in term of items to implements:

the changes

I have made a breaking change release of poldercast. To verify though (there is still an `Id` in the `NodeProfile` to allow compatibility with previous versions (there is an `Id` randomly generated in the `NodeProfile` (Gossip)).

However code using the latest version will need to:

remove the use of Id ;
use Address instead when receiving gossips from / with (either the peer_addr() if this is a publicly non reachable node or the gossiped Address

Improvements are:

one can now set a Topology DB limits (limiting the number of nodes in the Topology). Internally it uses a LRU and the least used ones will be removed from the node when the capacity is reached; Using this will prevent the collection to overgrow itself and to slow the users of this library;
using Address (and HashSet<Address>) it is no longer possible to game easily how the nodes are organised within the vicinity;
some performance improvements by using less clones;

These addresses the issues with vicinity where users where able to game the implementation too easily: now it is based on the address and it is hashed so the order is somewhat less predictable.

follow up improvements will be to leverage the LRU too in the cyclon to prioritise the least used nodes, I will make a new issue for that so it gets easier to follow on the path I will take this crate.

NicolasDP added the enhancement New feature or request label Feb 6, 2020

rinor mentioned this issue Feb 7, 2020

Jormungandr v0.8.7 is not propagating blocks to the network input-output-hk/jormungandr#1625

Open

NicolasDP mentioned this issue Feb 7, 2020

Improve Cyclon implementations by favouring least recently used entries instead of random ones #17

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Poldercast Vicinity Layer Not Returning "Nearest" Neighbours #15

Poldercast Vicinity Layer Not Returning "Nearest" Neighbours #15

michaeljfazio commented Feb 6, 2020 •

edited

Loading

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020 •

edited

Loading

NicolasDP commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 7, 2020

NicolasDP commented Feb 7, 2020

Poldercast Vicinity Layer Not Returning "Nearest" Neighbours #15

Poldercast Vicinity Layer Not Returning "Nearest" Neighbours #15

Comments

michaeljfazio commented Feb 6, 2020 • edited Loading

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020 • edited Loading

NicolasDP commented Feb 6, 2020 • edited Loading

EDIT

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020

michaeljfazio commented Feb 6, 2020

NicolasDP commented Feb 6, 2020 • edited Loading

michaeljfazio commented Feb 6, 2020 • edited Loading

michaeljfazio commented Feb 6, 2020 • edited Loading

michaeljfazio commented Feb 6, 2020 • edited Loading

michaeljfazio commented Feb 7, 2020

NicolasDP commented Feb 7, 2020

michaeljfazio commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020 •

edited

Loading

NicolasDP commented Feb 6, 2020 •

edited

Loading

NicolasDP commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020 •

edited

Loading

michaeljfazio commented Feb 6, 2020 •

edited

Loading