Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add FlareCleanupTask #947

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

add FlareCleanupTask #947

wants to merge 1 commit into from

Conversation

nora-codecov
Copy link
Contributor

New strategy for managing flare storage

depends on codecov/shared#450

When a pull is no longer OPEN, leave the flare in order to reduce locks and wait time.
During low traffic hours (4am UTC?), run this task to clear it out of our database and Archive storage.
So we won't pay for maintaining the flare field but also won't slow down other pull actions by checking or clearing the field 🎉

@nora-codecov nora-codecov requested a review from Swatinem December 6, 2024 07:36
Copy link

github-actions bot commented Dec 6, 2024

This PR includes changes to shared. Please review them here: codecov/shared@45252f7...9f5c4fd

Copy link

codecov bot commented Dec 6, 2024

Codecov Report

Attention: Patch coverage is 25.00000% with 63 lines in your changes missing coverage. Please review.

Project coverage is 26.14%. Comparing base (96bed1f) to head (8b57acf).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
tasks/tests/unit/test_flare_cleanup.py 13.79% 50 Missing ⚠️
tasks/flare_cleanup.py 50.00% 13 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (96bed1f) and HEAD (8b57acf). Click for more details.

HEAD has 3 uploads less than BASE
Flag BASE (96bed1f) HEAD (8b57acf)
unit 2 1
integration 2 0
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #947       +/-   ##
===========================================
- Coverage   97.99%   26.14%   -71.85%     
===========================================
  Files         445      447        +2     
  Lines       35754    35838       +84     
===========================================
- Hits        35036     9371    -25665     
- Misses        718    26467    +25749     
Flag Coverage Δ
integration ?
unit 26.14% <25.00%> (-64.48%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@codecov-notifications
Copy link

Codecov Report

Attention: Patch coverage is 25.00000% with 63 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
tasks/tests/unit/test_flare_cleanup.py 13.79% 50 Missing ⚠️
tasks/flare_cleanup.py 50.00% 13 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (96bed1f) and HEAD (8b57acf). Click for more details.

HEAD has 3 uploads less than BASE
Flag BASE (96bed1f) HEAD (8b57acf)
unit 2 1
integration 2 0

📢 Thoughts on this report? Let us know!

Copy link

github-actions bot commented Dec 6, 2024

✅ All tests successful. No failed tests were found.

📣 Thoughts on this report? Let Codecov know! | Powered by Codecov

Copy link
Contributor

@Swatinem Swatinem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of suggestions:

  • We should definitely limit / batch the query. This could yield millions of entries on the first run for all we know 🤷🏻‍♂️
  • Using the mock_storage fixture, you could assert the actual side effect of files being removed from the storage backend, instead of manually mocking very specific ArchiveService methods

Otherwise this looks good, and the following are just "personal preference" suggestions:

  • IMO, we should move core business logic out of the celery framework code. In this case this would mean having some free functions instead of all the logic being in FlareCleanupTask.run_cron_task. This means you don’t even need the db_session argument, as the code does not use sqlalchemy models, and it should make testing less awkward.
  • I’m personally not a fan of asserting log calls, but rather just the side effects we are aiming for: files being removed from storage, fields being cleared from the DB.

Comment on lines +49 to +50
# single query, loads all pulls and repos in qset into memory, deletes file in Archive 1 by 1
for pull in non_open_pulls_with_flare_in_archive:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, the "load all pulls … into memory" part can be potentially dangerous as it might overwhelm the task.

This should use a limit query and work in batches.

Comment on lines +51 to +52
archive_service = ArchiveService(repository=pull.repository)
archive_service.delete_file(pull._flare_storage_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it might be worth grouping these by repository(.id) on the python side, to deduplicate creation of per-repo ArchiveService. Plus, I believe there is a delete_files method to do batch deletions as well. Though I believe that depending on the storage backend, that will still delete each file individually.

_flare_storage_path__isnull=False
).select_related("repository")
log.info(
f"FlareCleanupTask will clear {non_open_pulls_with_flare_in_archive.count()} Archive flares"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this .count() is a query on its own, which seems a bit unreasonable just for a log line?

Comment on lines +20 to +22
mock_archive_service = mocker.patch(
"shared.django_apps.utils.model_utils.ArchiveService"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of mocking the ArchiveService, you could use the mock_storage fixture (I believe that should be hooked up to the shared archive service).

That way, you can actually store files (in memory), and assert that those files are truely being deleted from the storage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants