-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added hub validation #10
Conversation
…ks.json, and hub model-metadata-schema.json. renamed schema.py -> ptc_schema.py. removed failing test_hub_config_complex_scenario_hub(). removed default 'r' arg to open(). added tests/hubs/invalid-ptc-config-hub/hub-config/tasks.json so it would pass new checks. ditto for tests/hubs/example-complex-scenario-hub/hub-config/predtimechart-config.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see this coming along.
I have long-term concerns about putting in some guardrails against how this code might be affected by an upstream schema change, but since predtime chart is only designed for a very specific type of task, we're probably ok for the near future.
The the change I do think we should make is bunching together error messages in the validations, so people don't have to make one change at a time to get this working (i.e., they get a list of everything that didn't validate so they can fix it all up)
@@ -8,7 +8,7 @@ | |||
from jsonschema import validate | |||
from jsonschema.exceptions import ValidationError | |||
|
|||
from hub_predtimechart.schema import ptc_config_schema | |||
from hub_predtimechart.ptc_schema import ptc_config_schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this name change--explicitly distinguishing between various schemas alleviates some prior confusion!
@@ -31,41 +31,47 @@ def __init__(self, hub_dir: Path, ptc_config_file: Path): | |||
|
|||
self.hub_dir = hub_dir | |||
|
|||
# get tasks.json file content | |||
with open(self.hub_dir / 'hub-config' / 'tasks.json') as fp: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a today thing, but some of the code below made me realize we should think about how to use or port some of the hub schema reading and parsing functions in packages like hubAdmin
and hubUtil
.
If the Hubverse ever makes a breaking change to the tasks.json schema, any code based on assumptions of its structure could break.
For example:
model_tasks_ele = tasks['rounds'][self.rounds_idx]['model_tasks'][self.model_tasks_idx]
# validate: only `model_tasks` groups with `quantile` output_types will be considered | ||
output_type = the_model_task['output_type'] | ||
if 'quantile' not in output_type: | ||
raise ValidationError(f"no quantile output_type found. found types: {list(output_type.keys())}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are multiple validation errors (e.g., missing quantile levels and not a designated model), it would be user-friendly to return all of them at once (instead of raising an error on the first failure).
raise ValidationError(f"more than one unique `target_metadata` key found: {target_keys_keys}") | ||
|
||
|
||
def _validate_predtimechart_config(ptc_config: dict, tasks: dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same note here about returning a holistic list of errors so people don't have to solve them one by one.
with open('tests/hubs/example-complex-forecast-hub/hub-config/predtimechart-config.yml') as fp: | ||
valid_ptc_config = yaml.safe_load(fp) | ||
ecfh_ptc_config = yaml.safe_load(fp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious....what is ecfh
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
example-complex-forecast-hub
Thanks for your comments, Becky. I created this issue in response to them: #11 |
This PR implements the checks documented in the "Assumptions/limitations" README.md section.
The main code changes:
_validate_hub_ptc_compatibility()
: new function that validates the hub's files_validate_predtimechart_config()
: added validation that goes beyond the JSON Schema'svalidate()
callThe main test changes:
test_hub_config_complex_scenario_hub()
test__validate_predtimechart_config()
: added new tests. double underscores in the name b/c the function under test starts with an underscoretest__validate_hub_ptc_compatibility()
: new. same note re: double underscoresAdditionally:
schema.py
->ptc_schema.py
open()
tests/hubs/invalid-ptc-config-hub/hub-config/tasks.json
so it would pass new checkstests/hubs/example-complex-scenario-hub/hub-config/predtimechart-config.yml