-
Notifications
You must be signed in to change notification settings - Fork 491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
p2p: fan in incoming txns into backlog worker #6126
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6126 +/- ##
==========================================
- Coverage 51.73% 50.77% -0.97%
==========================================
Files 644 644
Lines 86519 86529 +10
==========================================
- Hits 44763 43937 -826
- Misses 38889 39683 +794
- Partials 2867 2909 +42 ☔ View full report in Codecov by Sentry. |
a65acca
to
2e055ca
Compare
5759b58
to
5635798
Compare
Merged master in. |
@@ -59,9 +61,18 @@ type SignedTxnWithAD struct { | |||
|
|||
// ID returns the Txid (i.e., hash) of the underlying transaction. | |||
func (s SignedTxn) ID() Txid { | |||
if s.cachedID != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious: did we test performance both with and without the cachedID
change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not recall but can perf tests
} | ||
|
||
if handler.checkAlreadyCommitted(wi) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we are cutting down on duplicated logic between what the backlogWorker is doing vs incoming message validation.
|
||
transactionMessagesRemember.Inc(nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relying on backlog worker for this?
@@ -2774,6 +2774,8 @@ func TestTxHandlerValidateIncomingTxMessage(t *testing.T) { | |||
|
|||
handler, err := makeTestTxHandler(ledger, cfg) | |||
require.NoError(t, err) | |||
handler.Start() | |||
defer handler.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The handler variable gets set to a new test handler on line 2810 - which handler will this refer to when stop()
is called (I would think latest referenced object).
if err != nil { | ||
if err != pubsub.ErrSubscriptionCancelled && err != context.Canceled { | ||
n.log.Errorf("Error reading from subscription %v, peerId %s", err, n.service.ID()) | ||
const threads = incomingThreads / 2 // perf tests showed that 10 (half of incomingThreads) was optimal in terms of TPS (attempted 1, 5, 10, 20) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was on our default specced hardware?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put a few comments in, I think I follow the syncCh changes, assume perf testing was to isolate that change specifically (vs other ones in the PR) was done as well?
@@ -37,6 +37,8 @@ type SignedTxn struct { | |||
Lsig LogicSig `codec:"lsig"` | |||
Txn Transaction `codec:"txn"` | |||
AuthAddr basics.Address `codec:"sgnr"` | |||
|
|||
cachedID *Txid `codec:"-"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this done as a pointer so there would be a clear nil
test? Or to save space until it was computed? I would have expected an embedded [32]byte
ID and using the zero value as nil test. That will probably avoid an allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think though, if we want to carry cached IDs around (which sounds pretty reasonable), they should be in the Transaction, not in the SignedTransaction.
case handler.backlogQueue <- wi: | ||
action = <-wi.syncCh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the real meat of the PR then? You want to be able to treat the backlog handling go routine as a synchronous thing? Could we put out the synchronous code of the backlog handler and call it here instead?
Summary
While investigating p2p TX traffic and performance, I found transaction pool mutex congestion. This PR is a PoC to use
backlogWorker
as a pool only accessor similarly to wsnet.Implementation summary:
syncCh
channel to work itemwi
syncCh
is initiated and awaited making validation synchronous as before.backlogWorker
checks ifsyncCh
is set and responds with validation result to the channel.Additionally, there are couple more fixes that helped with TPS as well:
sub.Next()
faster.Test Plan