-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Major refactor to use dataclasses that loosely follow DataPackage - Resource models #24
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will defer to @phargogh on the code formatting, etc. To me, it seems all of the fields we care about are here and the file formats are represented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a great refactor! I like the use of dataclasses, and the class hierarchy makes sense to me. I left a couple of questions in the PR that I don't think should hold up merging, they're just asking about two edge cases that could easily be added whenever it makes sense to do so.
On the topic of python 3.10+, I think that restriction is just fine to keep. The use of kw-only arguments is more self-documenting and so feels very in line with the spirit of the package.
Anyways, I think I'll merge!
class ArchiveResource(Resource): | ||
"""Class for metadata for an archive resource.""" | ||
|
||
compression: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hypothetically, suppose we had an uncompressed tarfile (so the extension is .tar
, not .tar.gz
). Are we able to handle this case with ArchiveResource
?
class RasterResource(Resource): | ||
"""Class for metadata for a raster resource.""" | ||
|
||
schema: RasterSchema | ||
spatial: SpatialSchema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inheritance hierarchy for rasters and vectors makes sense to me given the database programming model of a vector. Raster Attribute Tables might be another interesting sort of challenge, though ... should we handle RATs in some way? Not necessarily advocating for handling a RAT in a different way than the normal table would be handled, just wanted to ask!
This PR moves us away from Metadata Control Files and
pygeometa
and towards defining our own data models to represent metadata for vectors, rasters, tables, and compressed files.The parent metadata model is
models.Resource
. It loosely follows https://datapackage.org/standard/data-resource/, but also includes some other properties that we liked from MCF.Defining our own models means we can include just we want, in a structure that makes sense to us. For example,
models.ContactSchema
)Is there anything missing from the new models?
@phargogh I'm using a
dataclass
feature (keyword-only args) only available since Python 3.10. It doesn't look like there's an easy workaround, so before trying to make this compatible with early versions, I'm submitting as-is and we can discuss if we think it's critical to support earlier.