Setup metrics and alerting #421

Closed
mattiekat opened this issue May 11, 2022 · 2 comments

@mattiekat
Contributor

No description provided.

@mattiekat mattiekat self-assigned this May 11, 2022
@mattiekat mattiekat mentioned this issue May 26, 2022
@nambrot nambrot moved this from In Progress to In Review in Hyperlane Tasks Jun 1, 2022
@mattiekat
Contributor Author

An update on the state of this issue: the middleware for extracting metrics from ethers has been created, and it also gives us a useful framework for adding more in the future. Metrics are broken in the currently deployed agent version because many of them use the same label name in both the const labels and the dynamic labels. This has been fixed in main with #518.
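To illustrate the kind of collision described above, here is a minimal sketch, not the agent's actual code. The function `merge_labels` and the label names (`agent`, `chain`, `remote_chain`, `method`) are hypothetical and chosen only for illustration; the real change in #518 is not reproduced here.

```python
def merge_labels(const_labels, dynamic_label_names):
    """Return the combined label names, or raise if a name is used twice.

    const_labels: dict of label name -> fixed value (same for every sample).
    dynamic_label_names: list of label names whose values vary per sample.
    """
    overlap = set(const_labels) & set(dynamic_label_names)
    if overlap:
        raise ValueError(f"duplicate label names: {sorted(overlap)}")
    return sorted(const_labels) + list(dynamic_label_names)


# Hypothetical const labels; not taken from the agent code.
CONST_LABELS = {"agent": "relayer", "chain": "ethereum"}

# "chain" appears in both sets, so this metric definition is rejected.
try:
    merge_labels(CONST_LABELS, ["chain", "method"])
except ValueError as err:
    print(err)

# One possible fix (an assumption): give the dynamic label a distinct name.
print(merge_labels(CONST_LABELS, ["remote_chain", "method"]))
```

A check like this at metric-definition time would surface the problem at startup rather than as missing data on a dashboard.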

The dashboard for k8s metrics is more or less done (we can always add more later if needed, but it should be a really good start). The dashboard for agent metrics is partially done and is mostly waiting on the new metrics to be working.

@mattiekat mattiekat moved this from In Review to In Progress in Hyperlane Tasks Jun 6, 2022
@mattiekat
Contributor Author

After a brainstorming session, the list of metrics we came up with is:

Priority:

  • Block height
  • Count of messages in outboxes (Overall and per agent) by destination chain
  • Count of messages processed by inbox by source chain
  • Current outbox state (active/failed)
  • Relayer balances
  • Latest signed index by validator

Other:

  • Block rate by chain
  • Blocks each agent is behind the latest known block
  • Relayer in-flight message forwards
  • Relayer message forward time taken
  • Gelato operation metric to track successes/failures and time taken (a bundle of a few RPC calls)
  • ∆ between messages processed by the inboxes and by the outbox
  • ∆ between outbox index and latest signed index by validator
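The derived metrics in the list above (the ∆ values and "blocks behind") could be computed from the raw readings roughly as follows. This is only a sketch under assumptions: the `ChainSnapshot` type, its field names, and the sample values are hypothetical and do not come from the agent code.

```python
from dataclasses import dataclass, field


@dataclass
class ChainSnapshot:
    """Hypothetical point-in-time readings for one chain."""
    outbox_index: int          # latest message index in the outbox
    latest_known_block: int    # highest block height seen by any source
    signed_index: dict = field(default_factory=dict)  # validator -> latest signed index
    agent_block: dict = field(default_factory=dict)   # agent -> block height it has reached


def signed_index_lag(snap):
    """Delta between the outbox index and each validator's latest signed index."""
    return {v: snap.outbox_index - idx for v, idx in snap.signed_index.items()}


def blocks_behind(snap):
    """Blocks each agent is behind the latest known block (never negative)."""
    return {a: max(0, snap.latest_known_block - h) for a, h in snap.agent_block.items()}


# Made-up sample values, for illustration only.
snap = ChainSnapshot(
    outbox_index=120,
    latest_known_block=5000,
    signed_index={"validator-0": 118, "validator-1": 120},
    agent_block={"relayer": 4990, "validator-0": 5000},
)
print(signed_index_lag(snap))
print(blocks_behind(snap))
```

Whether these deltas are computed in the agents or in dashboard queries over the raw gauges is a design choice the list above leaves open.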

This was referenced Jun 10, 2022
Repository owner moved this from In Progress to Done in Hyperlane Tasks Jun 17, 2022