Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retries when loading config #887

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

nikola-jokic
Copy link

This PR adds a retry mechanism with exponential backoff when fetching config returns a timeout error.

Fixes #506

@palantirtech
Copy link
Member

Thanks for your interest in palantir/policy-bot, @nikola-jokic! Before we can accept your pull request, you need to sign our contributor license agreement - just visit https://cla.palantir.com/ and follow the instructions. Once you sign, I'll automatically update this pull request.

Copy link
Member

@bluekeyes bluekeyes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this - had a few suggestions, but overall I think this approach looks good.

fc.ParseError = err
} else {
fc.Config = &pc
ticker.Reset(delay)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use time.After here instead of manipulating a ticker - I think it will be easier to understand and I don't think there are any performance concerns with allocating a new timer if we're about to sleep for multiple seconds. Go 1.23 also fixed the issue with timers not being GC'd until after they fire.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I guess I got used to using ticker before 1.23 ☺️

Path: c.Path,
}
if err != nil {
if !os.IsTimeout(err) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be worth re-trying other errors too, for instance if GitHub returns a 500 error or something.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this one, I guess the change needs to be done in palantir/go-githubapp since getFileContents receives a response and only propagates the error. In order to check for 500, there must either be an error type to check for on the caller side, or some other way to propagate 500 error somehow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The go-github library returns an error for any non-successful status code, so we have two options here without changing the upstream:

  • Retry on any error from loading the content and accept that maybe that will sometimes retry things that will still fail
  • Use errors.As to test for github.ErrorResponse and then check the status code in the response

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with error check.
lmkwyt ☺️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve error behavior when loading policies
3 participants