Reduce e2e flakiness by changing protocol parameters #2301

Merged


sfauvel
Collaborator

@sfauvel sfauvel commented Feb 11, 2025

Content

Change protocol parameters to reduce e2e test flakiness.

We decrease the k parameter of the protocol to 145 to avoid an epoch gap caused by too few signatures (fewer than 150).
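
As a rough illustration of why lowering k helps, here is a minimal sketch. It assumes k acts as a quorum, meaning a certificate can only be produced once at least k signatures have been collected. The function name `quorum_reached` and the signature count of 148 are hypothetical and only chosen to sit just under the 150 mentioned above; this is not the aggregator's code.

```python
def quorum_reached(signature_count: int, k: int) -> bool:
    """Assumed rule: a certificate needs at least k collected signatures."""
    return signature_count >= k


# Hypothetical run in which slightly fewer than 150 signatures are collected.
collected = 148

# With a threshold of 150 the quorum is missed, so no certificate is
# produced for that epoch, which can eventually show up as an epoch gap.
print(quorum_reached(collected, k=150))

# With k lowered to 145 the same run reaches the quorum.
print(quorum_reached(collected, k=145))
```
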
This PR also prints the last aggregator error to the console to help investigate test failures.
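
The description does not show how the error is extracted, so the following is only a sketch of one possible approach, not the code in this PR. It assumes the aggregator log holds one JSON object per line, that `"level": 50` marks an error (as in the log excerpt in this description), and that each error is printed with a few preceding lines of context. The names `is_error` and `last_errors` are hypothetical.

```python
import json

ERROR_LEVEL = 50  # level of the error entry in the log excerpt in this description


def is_error(line: str) -> bool:
    """Return True when the line is a JSON log entry at error level or above."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False  # not a JSON log line
    return isinstance(entry, dict) and entry.get("level", 0) >= ERROR_LEVEL


def last_errors(lines: list[str], count: int = 1, context: int = 5) -> list[str]:
    """Return the last `count` error lines, each preceded by `context` lines.

    Context lines are rendered as "<n>-<line>" and error lines as
    "<n>:<line>", where <n> is the 1-based line number.
    """
    if count <= 0:
        return []
    error_indexes = [i for i, line in enumerate(lines) if is_error(line)][-count:]
    output = []
    for index in error_indexes:
        start = max(0, index - context)
        for i in range(start, index):
            output.append(f"{i + 1}-{lines[i]}")
        output.append(f"{index + 1}:{lines[index]}")
    return output
```

Calling `last_errors(log_text.splitlines(), count=1, context=5)` on the aggregator log would give one error entry with its five preceding lines. If two errors are close together their context windows overlap and some lines are repeated, which a fuller version would need to handle.
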

Example of the error shown in the log when a test fails:

----------------------------------------------------------------------------------------------------
MITHRIL-AGGREGATOR LOGS - LAST 1 ERROR(S):
----------------------------------------------------------------------------------------------------
10293-{"msg":">> update_epoch_settings","v":0,"name":"mithril-aggregator","level":20,"time":"2025-02-12T11:57:08.013719446+01:00","hostname":"sfauvel-XPS-15-9560","pid":238456,"src":"MithrilEpochService"}
10294-{"msg":"Inserting epoch settings in epoch 37","v":0,"name":"mithril-aggregator","level":20,"time":"2025-02-12T11:57:08.01372586+01:00","hostname":"sfauvel-XPS-15-9560","pid":238456,"src":"MithrilEpochService","epoch_settings":"AggregatorEpochSettings { protocol_parameters: ProtocolParameters { k: 250, m: 210, phi_f: 0.8 }, cardano_transactions_signing_config: CardanoTransactionsSigningConfig { security_parameter: BlockNumber(1), step: BlockNumber(15) } }"}
10295-{"msg":">> precompute_epoch_data","v":0,"name":"mithril-aggregator","level":20,"time":"2025-02-12T11:57:08.017775711+01:00","hostname":"sfauvel-XPS-15-9560","pid":238456,"src":"AggregatorRunner"}
10296-{"msg":">> precompute_epoch_data","v":0,"name":"mithril-aggregator","level":20,"time":"2025-02-12T11:57:08.017797842+01:00","hostname":"sfauvel-XPS-15-9560","pid":238456,"src":"MithrilEpochService"}
10297-{"msg":">> is_certificate_chain_valid","v":0,"name":"mithril-aggregator","level":20,"time":"2025-02-12T11:57:08.033187445+01:00","hostname":"sfauvel-XPS-15-9560","pid":238456,"src":"AggregatorRunner"}
10298:{"msg":"An error occurred, runtime state kept. message = 'certificate chain is invalid'","v":0,"name":"mithril-aggregator","level":50,"time":"2025-02-12T11:57:08.03436876+01:00","hostname":"sfauvel-XPS-15-9560","pid":238456,"src":"AggregatorRuntime","nested_error":"There is an epoch gap between the last certificate epoch (Epoch(31)) and current epoch (Epoch(35))

Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: <mithril_aggregator::services::certifier::certifier_service::MithrilCertifierService as mithril_aggregator::services::certifier::interface::CertifierService>::verify_certificate_chain::{{closure}}
   2: <mithril_aggregator::services::certifier::buffered_certifier::BufferedCertifierService as mithril_aggregator::services::certifier::interface::CertifierService>::verify_certificate_chain::{{closure}}
   3: <mithril_aggregator::runtime::runner::AggregatorRunner as mithril_aggregator::runtime::runner::AggregatorRunnerTrait>::is_certificate_chain_valid::{{closure}}
   4: mithril_aggregator::runtime::state_machine::AggregatorRuntime::cycle::{{closure}}
   5: mithril_aggregator::commands::serve_command::ServeCommand::execute::{{closure}}::{{closure}}
   6: tokio::runtime::task::core::Core<T,S>::poll
   7: tokio::runtime::task::harness::Harness<T,S>::poll
   8: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   9: tokio::runtime::scheduler::multi_thread::worker::Context::run
  10: tokio::runtime::context::runtime::enter_runtime
  11: tokio::runtime::scheduler::multi_thread::worker::run
  12: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
  13: tokio::runtime::task::core::Core<T,S>::poll
  14: tokio::runtime::task::harness::Harness<T,S>::poll
  15: tokio::runtime::blocking::pool::Inner::run
  16: std::sys::backtrace::__rust_begin_short_backtrace
  17: core::ops::function::FnOnce::call_once{{vtable.shim}}
  18: std::sys::pal::unix::thread::Thread::new::thread_start
  19: start_thread
             at ./nptl/pthread_create.c:442:8
  20: __GI___clone3
             at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81"}

Feb 12 10:57:08.088 INFO Stopping Mithril infrastructure
Feb 12 10:57:08.088 INFO Stopping mithril-signer-1-pool1hu9q6gg7w4s2v52n7kuengae4a7p2ym9rj46juxjqpfa5dh5dm4
Feb 12 10:57:08.095 INFO Stopping mithril-signer-2-pool1gfav880r9mhu5l7t5ymxmd5d0g4vrtj6faszzmgeks847sg87ym
Feb 12 10:57:08.103 INFO Stopping aggregator
Feb 12 10:57:08.114 INFO Stopping the Devnet, script: /tmp/mithril/devnet/stop.sh
>> Stop Cardano network
>> Stop Mithril network
cardano-node: thread killed
cardano-node: thread killed
cardano-node: thread killed
 
----------------------------------------------------------------------------------------------------
Mithril End to End test outcome:
----------------------------------------------------------------------------------------------------
Error(Unretryable): Mithril End to End test failed

Caused by:
    Minimum expected mithril stake distribution epoch not reached : 31 < 32

Pre-submit checklist

  • Branch
    • Tests are provided (if possible)
    • Crates versions are updated (if relevant)
    • CHANGELOG file is updated (if relevant)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
  • PR
    • No clippy warnings in the CI
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested
  • Documentation
    • Update README file (if relevant)
    • Update documentation website (if relevant)
    • Add dev blog post (if relevant)

Issue(s)

Relates to #2222


github-actions bot commented Feb 11, 2025

Test Results

    4 files  ±0     56 suites  ±0   20m 54s ⏱️ +26s
1 597 tests +1  1 597 ✅ +1  0 💤 ±0  0 ❌ ±0 
1 901 runs  +1  1 901 ✅ +1  0 💤 ±0  0 ❌ ±0 

Results for commit 22ba641. ± Comparison against base commit f9b2d69.

♻️ This comment has been updated with latest results.

@sfauvel sfauvel temporarily deployed to testing-sanchonet February 11, 2025 16:14 — with GitHub Actions Inactive
@sfauvel sfauvel changed the title from "Sfa/2222/reduce flakiness by changing protocol parameters" to "Reduce e2e flakiness by changing protocol parameters" Feb 11, 2025
@sfauvel sfauvel force-pushed the sfa/2222/reduce_flakiness_by_changing_protocol_parameters branch from ca31781 to 1799104 February 11, 2025 17:31
@sfauvel sfauvel marked this pull request as ready for review February 11, 2025 17:32
@sfauvel sfauvel temporarily deployed to testing-sanchonet February 11, 2025 17:41 — with GitHub Actions Inactive
Member

@jpraynaud jpraynaud left a comment

LGTM 👍

Collaborator

@Alenar Alenar left a comment

LGTM

@sfauvel sfauvel force-pushed the sfa/2222/reduce_flakiness_by_changing_protocol_parameters branch from bfca3a4 to ca72570 February 12, 2025 15:01
@sfauvel sfauvel force-pushed the sfa/2222/reduce_flakiness_by_changing_protocol_parameters branch from ca72570 to 22ba641 February 12, 2025 15:07
@sfauvel sfauvel temporarily deployed to testing-sanchonet February 12, 2025 15:22 — with GitHub Actions Inactive
@sfauvel sfauvel merged commit 31caf35 into main Feb 12, 2025
37 of 41 checks passed
@sfauvel sfauvel deleted the sfa/2222/reduce_flakiness_by_changing_protocol_parameters branch February 12, 2025 15:24
4 participants