-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle CSV datasets & better handle updates to existing MCF #12
Conversation
…not overwrite existing data.
…f passing **get_license() to set_license().
…a pre-existing MCF.
self.mcf['metadata']['datestamp'] = datetime.utcnow( | ||
).strftime('%Y-%m-%d') | ||
# fill all values that can be derived from the dataset | ||
self._set_spatial_info() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making sure to call this whether or not an MCF already exists, in case properties of the dataset have changed.
@@ -418,9 +432,12 @@ def set_band_description(self, band_number, name=None, title=None, abstract=None | |||
title (str): title for the raster band | |||
abstract (str): description of the raster band | |||
units (str): unit of measurement for the band's pixel values | |||
type (str): of the band's values, either 'integer' or 'number' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like using type
as the argname, since it's a builtin function name, but that is the name of the MCF property, and making the setter's argnames match makes it convenient to do things like
mc_a.set_band_description(1, **mc_b.get_band_description(1))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On this note, is it possible to automatically populate the band's min/max in the dimensions section? Dimensions correspond to this feature and seem to be required in the MCF, but not for ISO. I know there is a way to get this information in GDAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@empavia do you mean the min/max pixel values, or the number of rows and columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phargogh the min/max pixels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phargogh actually I believe I am referring to the min/max of the band measurement. For example, the min/max elevation in meters for a DEM. An example is found in this document on Oak: "Z:\global-dataset-cache\awc-isric-soilgrids\awc.tif.ovr.aux.xml"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from geometamaker import MetadataControl | ||
|
||
title = 'Title' | ||
keyword = 'foo' | ||
band_name = 'The Band' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎸
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I appreciate that we can just throw a CSV at GDAL and then use the same API.
Have you tested what happens if we pass GDAL a malformed CSV, like one with too many records on a line or something like that?
Good question! It seems like GDAL uses the header row as the source of truth for number of fields, and then it ignores extra records in other lines. That seems totally fine. I don't feel like we need a test to ensure this. |
This PR adds support for
.csv
files, by treating them like other vector datasets that GDAL supports.A couple other things here:
MetadataControl
is initialized. The goal is to preserve any metadata that might already exist, except for properties that are intrinsic to the dataset (i.e. CRS, bounding box, field names & datatypes).mc_a.set_license(**mc_b.get_license())