Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: faster integration tests #763

Open
wants to merge 8 commits into
base: next
Choose a base branch
from

Conversation

tomyrd
Copy link
Collaborator

@tomyrd tomyrd commented Feb 25, 2025

Integration test improvements

The objective of this PR was to update the integration tests with some improvements (mentioned in #506). The changes were meant to speed up the integration tests, mainly to reduce CI time but the results haven't been as significative.

Commits

I'll explain more in depth the changes made in each commit:

72df199

No improvements were made here, I just added a way to benchmark the node wait times of the wait_for_tx function. If an env var LOG_WAIT_TIMES=true is set, then each time wait_for_tx is called, a unique file called wait_time_... will be created with the millisecond wait time of that specific call. The scripts/integration-test-bench.sh script, sets the env var and runs the integration tests. It sums up all the wait times logged, prints the information and deletes the files.

a0cb9e7

While running cargo flamegraph I noticed that most of the CPU time of integration tests was spent on SecretKey::new_with_rng calls inside Client::new_transaction. This wasn't necessary (as discussed here). This commit changes the AuthSecretKey to an enum which indicates which authentication schema the account implements (without the need to create a new secret key which isn't used). This fix may be removed once this issue gets implemented.

d596d26

A lot of tests called the setup function which always creted 3 accounts (2 regular and a faucet). Some tests only needed 2 accounts (even some only used the faucet), which meant unnecessary account ID grinding. This commit reviewed each test and removed unused accounts.

ea87dff and 78ab276

This commits look to merge situations like this:

  • Execute a mint transaction
  • Wait for node
  • Consume authenticated note
  • Wait for node
  • (optionally) Execute p2id transaction
  • Wait for node
  • Consume authenticated note

The idea is to remove the intermediate waits by consuming unauthenticated transactions and only wait at the end.

These changes were made both in the rust and web integration tests.

Using the integration-test-bench.sh script I was able to measure the before and after sum of wait times and, altough the number of waits was reduced, the overall sum didn't change much which was surprising.

83e0a30

This small commit introduced a sleep in a case where we looped and synced, something like:

while ...{
    client.sync_state()
}

This generated a lot of sync requests which slowed the node.

The effect of node request saturation can be viewed by running the following tests:

  1. Run the integration tests with test-threads = 1.
  2. Run the integration tests with test-threads = 14.

d9308a8

Here, the cloned instance of the node is modified to reduce the block and batch times (by a factor of 10). This was by far the biggest improvement in wait time (at least in the local runs of integration tests).

When testing on the CI, the total time didn't change which is unfortunate. My current theory is that the runner resources are very limited and reducing block time doesn't affect it much.

Given that this change didn't improve the CI wait time I would remove it, as the lower block time makes some tests less reliable.

c7dcf55

Here the idea was to use a separate rust prover for the web tests (proving in wasm is very slow at the moment).

On the CI the web tests were ~7 minutes faster but the rust prover took ~7 minutes to build so the net time was almost the same.

I would also remove these changes before merging as they don't have much effect.

Conclusions

The CI test time variance is somewhat high so it's hard to measure the improvement without running a lot of instances and getting an average.

It's not clear if these changes make a difference in CI duration but most of the remaining issues in #506 are addressed here. I would remove the last two commits before merging as they were somewhat hacky attempts to improve test times without much of an effect (I'm leaving them here so that they can be discussed if needed).

Other possible improvements

  • Find a way to run the tests with a shared node. Each integration test job spends ~5 min building the node. If we could use an existing node in the tests then this time would be skipped. We can't use the deployed devnet node as it's common to change node versions between branches and it would loose compatibility rather quickly. The same could go for a deployed remote prover for web tests.
  • Remove some tests. Some tests could be merged so the number of created accounts and transactions is reduced which could lead to lower durations.

@tomyrd tomyrd force-pushed the tomyrd-faster-integration-tests branch 2 times, most recently from ef00631 to 923f70e Compare February 27, 2025 18:07
@tomyrd tomyrd force-pushed the tomyrd-faster-integration-tests branch 3 times, most recently from 4064932 to 6eb757f Compare February 27, 2025 22:06
@tomyrd tomyrd force-pushed the tomyrd-faster-integration-tests branch from 6eb757f to 710b43b Compare February 28, 2025 15:11
@tomyrd tomyrd force-pushed the tomyrd-faster-integration-tests branch from 710b43b to c7dcf55 Compare February 28, 2025 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant