-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove audbackend.checksum() and use MD5 sum #255
Conversation
Reviewer's Guide by SourceryThis PR fixes an issue with parquet file uploads by reverting back to using MD5 checksums instead of parquet metadata checksums. The implementation removes the Sequence diagram for parquet file upload with MD5 checksumsequenceDiagram
actor User
participant Interface
participant Artifactory
participant Audeer
User->>Interface: Upload parquet file
Interface->>Audeer: Calculate MD5 checksum
Audeer-->>Interface: Return MD5 checksum
Interface->>Artifactory: Upload file with MD5 checksum
Artifactory-->>Interface: Confirm upload
Interface-->>User: Upload successful
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @hagenw - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
This reverts commit 1d0c713.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @hagenw - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only had a minor issue concerning possible developer confusion
relating the terminology of md5sum
and checksum
.
This is resolved as good as it can get. So, approval is being given.
Closes #254
This adds a test for #254 and fixes it by reversing the introduction of
audbackend.checksum()
from #245 as we shouldn't upload the checksum stored in the metadata of the parquet file, but as we did before always use the MD5 sum.This will have no negative effect on version tracking with
audb
(i was wrong when stating this in audeering/audb#459). The only thing affected by uploading the MD5 sum instead of the parquet file hash is: if a user cancels anaudb.publish()
job, deletes the build folder, creates the table files again and starts the upload again, it will overwrite the parquet table files instead of skipping them, as their MD5 sum has changed locally. But this should be fine. If we stay with the current implementation instead, we have no way to verify if a file was uploaded/downloaded correctly as we don't store information on the actual MD5 sum.Summary by Sourcery
Add a test for parquet file upload to Artifactory and fix the issue by reverting to using the MD5 checksum instead of the metadata checksum.
Bug Fixes:
Tests:
Summary by Sourcery
Revert the checksum calculation for parquet files to use the MD5 checksum instead of the metadata checksum to fix upload issues. Add a test to verify the correct checksum is used during the upload process.
Bug Fixes:
Tests: