Implement data set builders and the new entity/relationship model #610

Merged: mdekstrand merged 79 commits into lenskit:main from feature/data-set-builder on Jan 21, 2025

mdekstrand (Member) commented Jan 21, 2025

This adds a more thorough data model to LensKit, along with dataset builders for constructing more advanced data sets.

Not every feature is implemented yet, but what is here should be enough for the current code to keep working.

Closes #547. Closes #586. Closes #587.

mdekstrand added this to the 2025.1 milestone Jan 21, 2025

codecov bot commented Jan 21, 2025

Codecov Report

Attention: Patch coverage is 94.76615% with 47 lines in your changes missing coverage. Please review.

Project coverage is 90.32%. Comparing base (01907a8) to head (24046eb).
Report is 80 commits behind head on main.

Files with missing lines              Patch %   Lines
lenskit/lenskit/data/dataset.py       93.96%    23 Missing ⚠️
lenskit/lenskit/data/builder.py       95.47%    12 Missing ⚠️
lenskit/lenskit/data/schema.py        90.78%     7 Missing ⚠️
lenskit/lenskit/data/container.py     95.55%     2 Missing ⚠️
lenskit/lenskit/data/mtarray.py       75.00%     1 Missing ⚠️
lenskit/lenskit/data/vocab.py          0.00%     1 Missing ⚠️
lenskit/lenskit/splitting/split.py     0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #610      +/-   ##
==========================================
+ Coverage   90.04%   90.32%   +0.27%     
==========================================
  Files         108      109       +1     
  Lines        6814     7269     +455     
==========================================
+ Hits         6136     6566     +430     
- Misses        678      703      +25     
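
The totals in the coverage diff can be cross-checked with a little arithmetic. The sketch below is not part of Codecov or LensKit; the helper name `coverage_pct` is hypothetical, and it assumes the report truncates percentages to two decimals rather than rounding them (rounding would give 90.05% and 90.33% instead).

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated (not rounded) to two decimals.

    Truncation is an assumption made to match the figures in the report.
    """
    # Integer arithmetic avoids floating-point surprises in the truncation.
    return (hits * 10000 // lines) / 100


# (lines, hits, misses) as listed in the coverage diff
base = {"lines": 6814, "hits": 6136, "misses": 678}
head = {"lines": 7269, "hits": 6566, "misses": 703}

# Misses are simply the lines that were not hit.
assert base["lines"] - base["hits"] == base["misses"]
assert head["lines"] - head["hits"] == head["misses"]

# Project coverage before and after the change.
assert coverage_pct(base["hits"], base["lines"]) == 90.04
assert coverage_pct(head["hits"], head["lines"]) == 90.32

# Per-file missing lines from the table add up to the 47 in the summary.
missing = {
    "dataset.py": 23,
    "builder.py": 12,
    "schema.py": 7,
    "container.py": 2,
    "mtarray.py": 1,
    "vocab.py": 1,
    "split.py": 1,
}
assert sum(missing.values()) == 47
```

The deltas in the diff follow the same way: 455 more lines and 430 more hits leave 25 more misses.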


mdekstrand merged commit da224a4 into lenskit:main Jan 21, 2025
47 checks passed
mdekstrand deleted the feature/data-set-builder branch January 21, 2025 19:15
Labels: data (Data management support.)
Project status: Done
Participants: 1