Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry if call function on backend fails #242

Merged
merged 2 commits into from
Nov 18, 2024
Merged

Retry if call function on backend fails #242

merged 2 commits into from
Nov 18, 2024

Conversation

hagenw
Copy link
Member

@hagenw hagenw commented Nov 11, 2024

When publishing large datasets that take longer than 24 hours, I encountered network related errors for both Artifactory and S3 servers. They seem to be related with loosing the connection while uploading a file, which will then raise an error, and stops the whole process. To avoid this, this pull request proposes to not directly fail, but try two times more to rerunning the requested job on the backend, with a short delay of 50 ms between the tries.

I tested it on a 27 hour job on S3, which failed three times before, and did now proceed without any error for two times.
As I don't see any unwanted consequences, I would propose to integrate the changes.

Summary by Sourcery

Enhancements:

  • Add retry mechanism to call_function_on_backend to attempt function execution multiple times before raising an error.

Summary by Sourcery

Enhancements:

  • Add a retry mechanism to the call_function_on_backend function to attempt execution multiple times before raising an error.

Copy link
Contributor

sourcery-ai bot commented Nov 11, 2024

Reviewer's Guide by Sourcery

The PR implements a retry mechanism for backend function calls to handle transient network errors during long-running operations. The implementation adds a retry loop with a 50ms delay between attempts, making up to 3 tries before failing.

Sequence diagram for retry mechanism in backend function calls

sequenceDiagram
    participant Client
    participant Backend
    Client->>Backend: call_function_on_backend()
    loop Retry up to 3 times
        alt Function succeeds
            Backend-->>Client: Return result
        else Function fails
            Backend-->>Client: Raise error
            Client->>Client: Wait 50ms
        end
    end
Loading

File-Level Changes

Change Details Files
Added retry mechanism to backend function calls
  • Added new 'retries' parameter with default value of 3
  • Implemented retry loop with 50ms delay between attempts
  • Added docstring explaining the retry mechanism and parameters
  • Preserved existing error suppression behavior with fallback values
  • Only raises BackendError after all retry attempts are exhausted
audbackend/core/utils.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

codecov bot commented Nov 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (1d0c713) to head (56d7758).
Report is 7 commits behind head on main.

Additional details and impacted files
Files with missing lines Coverage Δ
audbackend/core/utils.py 100.0% <100.0%> (ø)

... and 8 files with indirect coverage changes

@hagenw hagenw marked this pull request as ready for review November 15, 2024 19:22
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @hagenw - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider being more selective about which exceptions trigger a retry - catching all exceptions might mask programming errors that should fail immediately
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
@hagenw hagenw requested a review from ChristianGeng November 15, 2024 19:35
Copy link
Member

@ChristianGeng ChristianGeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

In discussion we have talked about the fact that we have simplistic requirements concerning retrying behavior. So for now we do not have to deal with exponential backoff and the like, but rather retry a thre times in an iteration.

Also, it is currently a deliberate action to have blocking behavior as seen in time.sleep(0.05).

Further, the argument to sleep which is hardcoded to 5ms does not have to be turned into a variable, at least not yet.

Therefore approval is given straight.

@hagenw hagenw merged commit 8da41a4 into main Nov 18, 2024
10 checks passed
@hagenw hagenw deleted the retries branch November 18, 2024 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants