
API request returned HTTP 500: Internal Server Error #140

Open
rshad opened this issue Jan 21, 2020 · 4 comments

Comments

rshad commented Jan 21, 2020

Issue Description

I'm trying to back up all of my repositories with all their contents (PRs, issues, code, etc.). Some repos are huge (more than 5,000 issues, for example), and the github-backup binary sometimes fails with the error:

API request returned HTTP 500: Internal Server Error

I first thought it was related to hitting the maximum permitted request rate of 5,000 per hour, but that's not the case; the error occurred while I still had ~2,500 requests remaining.
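(For reference, the remaining quota can be checked with a GET to the GitHub REST API's /rate_limit endpoint, which itself does not count against the limit. A minimal sketch of reading its documented response shape, using an abbreviated sample body rather than a live request:)

```python
import json

def remaining_core_requests(body: str) -> int:
    """Parse a GitHub /rate_limit response body and return the remaining
    core-API request count (resources.core.remaining)."""
    return json.loads(body)["resources"]["core"]["remaining"]

# Abbreviated sample of the documented response shape:
sample = '{"resources": {"core": {"limit": 5000, "remaining": 2500, "reset": 1579600000}}}'
print(remaining_core_requests(sample))  # → 2500
```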

I also tried backing up only the issues, or only the PRs, and no error was produced.

I'm convinced it's related to the huge number of requests being made, but I couldn't determine the real reason.

Any ideas?

Kr,

Rshad

@einsteinx2 (Contributor)

I've identified some issues with the error handling functions that will cause the script to terminate early when it shouldn't. I'm going to be working on a PR to fix it when I have some time.

The script has some code in it to do an automatic backoff when it hits a rate-limiting error, but due to the above-mentioned issues with the error handling, I don't believe that part is working correctly.

Keep in mind that if you're trying to back up tens of thousands of items, or anything much beyond a total of 5,000, it will take a very long time due to the rate limiting. I'd suggest breaking it up into multiple calls, one every couple of hours: for example, first back up only repos, then only PRs, then only issues, etc. That should at least speed things up a bit. In fact, I wrote a script meant to be used with cron to easily allow backing up different things at different times without putting complicated commands into the crontab. I'm still tweaking it, but I'll probably submit it as a PR to include in the repo when I'm done with it.

@einsteinx2 (Contributor)

You can see my open issue to refactor the error handling here for more detailed information or to track the progress: #138

rshad commented Jan 22, 2020

Hi @einsteinx2 !

Thanks for answering.

Actually, I was thinking about the same approach: running the backup by category (PRs, issues, wikis, ...) for each of my repos, roughly like this:

import subprocess, time

# Flag names below are illustrative, not exact github-backup options.
for category in ["issues", "pulls", "wikis"]:
    for repo in repositories:  # repositories: list of my repo names
        subprocess.run(["github-backup", user, "--repository", repo, f"--{category}"])
    time.sleep(10)  # pause 10 seconds between categories

However, it could still fail when the number of issues or PRs in a single repo is very large.

I'll give it a try, and I'll look at your solution once it's ready.

Kr,

Rshad

@einsteinx2 (Contributor)

My solution is more or less what you're doing, but using cron to put a >1 hour delay between each run and running it overnight. That way your 5,000 allowed requests are reset before each run.
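Assuming this is the josegonzalez/python-github-backup CLI, such a crontab could look roughly like the sketch below; the user, token, output path, and times are all placeholders, and the exact flags should be checked against `github-backup --help`:

```shell
# Back up one category per nightly run, spaced two hours apart so each
# run starts against a fresh 5,000-request quota. All values illustrative.
0 1 * * * github-backup myuser -t MY_TOKEN -o /backups/github --repositories
0 3 * * * github-backup myuser -t MY_TOKEN -o /backups/github --issues
0 5 * * * github-backup myuser -t MY_TOKEN -o /backups/github --pulls
```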
