Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Integration test improvements
The objective of this PR was to update the integration tests with some improvements (mentioned in #506). The changes were meant to speed up the integration tests, mainly to reduce CI time but the results haven't been as significative.
Commits
I'll explain more in depth the changes made in each commit:
72df199
No improvements were made here, I just added a way to benchmark the node wait times of the
wait_for_tx
function. If an env varLOG_WAIT_TIMES=true
is set, then each timewait_for_tx
is called, a unique file calledwait_time_...
will be created with the millisecond wait time of that specific call. Thescripts/integration-test-bench.sh
script, sets the env var and runs the integration tests. It sums up all the wait times logged, prints the information and deletes the files.a0cb9e7
While running
cargo flamegraph
I noticed that most of the CPU time of integration tests was spent onSecretKey::new_with_rng
calls insideClient::new_transaction
. This wasn't necessary (as discussed here). This commit changes theAuthSecretKey
to an enum which indicates which authentication schema the account implements (without the need to create a new secret key which isn't used). This fix may be removed once this issue gets implemented.d596d26
A lot of tests called the
setup
function which always creted 3 accounts (2 regular and a faucet). Some tests only needed 2 accounts (even some only used the faucet), which meant unnecessary account ID grinding. This commit reviewed each test and removed unused accounts.ea87dff and 78ab276
This commits look to merge situations like this:
The idea is to remove the intermediate waits by consuming unauthenticated transactions and only wait at the end.
These changes were made both in the rust and web integration tests.
Using the
integration-test-bench.sh
script I was able to measure the before and after sum of wait times and, altough the number of waits was reduced, the overall sum didn't change much which was surprising.83e0a30
This small commit introduced a sleep in a case where we looped and synced, something like:
This generated a lot of sync requests which slowed the node.
The effect of node request saturation can be viewed by running the following tests:
test-threads = 1
.test-threads = 14
.d9308a8
Here, the cloned instance of the node is modified to reduce the block and batch times (by a factor of 10). This was by far the biggest improvement in wait time (at least in the local runs of integration tests).
When testing on the CI, the total time didn't change which is unfortunate. My current theory is that the runner resources are very limited and reducing block time doesn't affect it much.
Given that this change didn't improve the CI wait time I would remove it, as the lower block time makes some tests less reliable.
c7dcf55
Here the idea was to use a separate rust prover for the web tests (proving in wasm is very slow at the moment).
On the CI the web tests were ~7 minutes faster but the rust prover took ~7 minutes to build so the net time was almost the same.
I would also remove these changes before merging as they don't have much effect.
Conclusions
The CI test time variance is somewhat high so it's hard to measure the improvement without running a lot of instances and getting an average.
It's not clear if these changes make a difference in CI duration but most of the remaining issues in #506 are addressed here. I would remove the last two commits before merging as they were somewhat hacky attempts to improve test times without much of an effect (I'm leaving them here so that they can be discussed if needed).
Other possible improvements