Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix mlm:artifact_type check missing #52

Merged
15 commits merged into from
Nov 22, 2024
Merged

fix mlm:artifact_type check missing #52

15 commits merged into from
Nov 22, 2024

Conversation

fmigneault
Copy link
Collaborator

Description

  • Fix missing mlm:artifact_type property check for a Model Asset definition.
    The mlm:artifact_type is now mutually and exclusively required by the corresponding Asset with mlm:model role.
  • Fix the examples with the corresponding mlm:artifact_type check now enforced.
  • Removed "Enum" from "Artifact Type" title since it is not actually an enum.

Related Issue

Type of Change

  • 📚 Examples, docs, tutorials or dependencies update;
  • 🔧 Bug fix (non-breaking change which fixes an issue);
  • 🥂 Improvement (non-breaking change which improves an existing feature);
  • 🚀 New feature (non-breaking change which adds functionality);
  • 💥 Breaking change (fix or feature that would cause existing functionality to change);
  • 🔐 Security fix.

Note

"Potentially" breaking change if the MLM definition was not formed as described by the README, but not breaking if the worded requirement was applied although unchecked.

Checklist

  • I've read the CONTRIBUTING.md guide;
  • I've updated the code style using make check;
  • I've written tests for all new methods and classes that I created;
  • I've written the docstring in Google format for all the methods and classes that I used.

@fmigneault fmigneault requested a review from rbavery November 1, 2024 21:18
@fmigneault fmigneault self-assigned this Nov 1, 2024
@fmigneault
Copy link
Collaborator Author

fmigneault commented Nov 6, 2024

This seems to work correctly when doing item validations via pystac.
However, for some reason, the STAC-node-validator hangs with the referenced schema:

Fix found. Waiting on release to tag and use it...

@m-mohr
Copy link

m-mohr commented Nov 6, 2024

Commenting here due to the stac-node-validator issue.

I think all of the STAC or STAC extension schemas (except for MLM afaik) are using draft-07. I'd be careful with adding a new schema version to decrease chances of issues. You are facing them with ajv right now in stac-node validator and I'm not sure whether you have checked other validators than pystac? Like in C# or so? I think I'd try to avoid mixing schema versions, especially newer ones for the sake of avoiding potential issues downstream.

JSON Schema checks in many places throughout STAC are not really complete anyway (e.g. summaries) so not sure it's that important to use the unevaluatedProperties? I think I personally would rather remove the unevaluatedProperties in this case as it's ony for checking for typos / additional mlm: fields, but not actually needed for actual field validations.

Alternatively, is there a way you can dereference the schema before publishing so that you don't need unevaluatedProperties?

@fmigneault
Copy link
Collaborator Author

@m-mohr
Ideally, I would like to keep draft-07. AFAIK, there's no way to do an allOf of multiple schemas with a similar mlm: prefix and doing the check of unknown (?!mlm:) properties, like every extension does.

If you know a way, please let me know. I've tried for multiple hours, and I cannot find an alternative approach to unevaluatedProperties beside manually copy-pasting all fields involved in each allOf sub-object into a single one.

@m-mohr
Copy link

m-mohr commented Nov 6, 2024

This seems to work:

{89F7F8E6-4FEC-48D7-95E6-7CDAA49E2680}

It needs you to repeat the property names in the patternProperties though, but at least only once as you can also $ref the patternProperties itself in allOf.

But I think I'd just remove the check alltogether if you are not fancy to repeat the values.

Code:

{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "allOf": [
        {"$ref": "#/definitions/fields"},
        {"$ref": "#/definitions/mlm_field_not_found"}
    ],
    "definitions": {
    	"fields": {
            "properties": {
                "mlm:field1": {},
                "mlm:field2": {},
            }
        },
        "mlm_field_not_found": {
            "patternProperties": {
                "^mlm:(?!(field1|field2)).+$": { "not": {} }
            }
        }
    }
}

@fmigneault
Copy link
Collaborator Author

Yeah. Adding the fields in the regex this way is equivalent to duplicating the definition rather than using allOf to combine them. Not easily maintainable/scalable with dozens of fields.

@m-mohr
Copy link

m-mohr commented Nov 6, 2024

It's not duplicating the schema for the properties though. It's just the names, so comparably short. The same applies partially with the required fields. But yeah, all options are not ideal. I think I'd remove the check. There are more extension that don't check it.

@fmigneault
Copy link
Collaborator Author

fmigneault commented Nov 6, 2024

@m-mohr
Here's what I have to do to replicate the behavior: https://github.com/stac-extensions/mlm/pull/58/files

The check cannot be removed because I explicitly disallow certain fields combinations to redefine under assets/items as they would introduce misinterpretation of the contents.

@m-mohr
Copy link

m-mohr commented Nov 6, 2024

That's much more duplication compared to #52 (comment) though. Am I missing something?

@fmigneault
Copy link
Collaborator Author

Not really. I would have to copy every field name within the regex, and maintain both regexes with their separate combinations of fields. It makes it harder to read and more prone to errors. At least, schema references are flagged by IDEs if they are invalid.

@fmigneault
Copy link
Collaborator Author

fmigneault commented Nov 6, 2024

Just got an idea worth trying.
Maybe I can define all possible MLM fields in a single object, and then wrap that in an allOf that negates disallowed combination.

{
  "allOf": [
    {"$ref": "#/definitions/all-fields"},
    {"not": {"required": ["mlm:disallowed"]}}
  ],
  "definitions": {
    "all-fields": {
       "properties": {
          "mlm:allowed": {},
          "mlm:disallowed": {}
       },
       "patternProperties": {
          "^(?!mlm:)": {}
       },
       "additionalProperties": false
    }
  }
}

Seems to work...
{E8884F4F-7D6E-432B-AA7C-7956070DF181}

@fmigneault fmigneault closed this pull request by merging all changes into main in 42052ec Nov 22, 2024
@fmigneault fmigneault deleted the fix-mlm-artefact branch November 22, 2024 22:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix missing mlm:artifact_type in JSON schema
2 participants