fix mlm:artifact_type check missing #52

Merged: 15 commits, Nov 22, 2024
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -16,6 +16,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
for the entire input (but still possible).

### Changed
- Use JSON `$schema` version `2019-09` to allow use of `unevaluatedProperties` for stricter validation of MLM fields.
- Explicitly disallow `mlm:name`, `mlm:input`, `mlm:output` and `mlm:hyperparameters` at the Asset level.
These fields describe the model as a whole and should therefore be defined in Item properties.
- Moved `norm_type` to `value_scaling` object to better reflect the expected operation, which could be another
operation than what is typically known as "normalization" or "standardization" techniques in machine learning.
- Moved `statistics` to `value_scaling` object to better reflect their mutual `type` and additional
@@ -34,6 +37,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
Otherwise, the amount of `value_scaling` objects should match the number of bands or channels involved in the input.
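
The Asset-level restriction in the "Changed" entries above (no `mlm:name`, `mlm:input`, `mlm:output` or `mlm:hyperparameters` on an Asset, and no undefined `mlm:`-prefixed fields) can be approximated in plain Python. This is only a sketch of the intent, not the JSON Schema logic itself: `disallowed_asset_fields` and `ASSET_ALLOWED_SUBSET` are hypothetical names, and the allowed set lists only a few fields visible in this diff rather than the full extension.

```python
# Hypothetical helper illustrating the Asset-level rule; not part of the MLM
# extension or of stac-model.

# Assumption: a partial list of fields permitted on an Asset, limited to names
# that appear in this diff. The real schema defines more shared fields.
ASSET_ALLOWED_SUBSET = {"mlm:architecture", "mlm:accelerator_count", "mlm:artifact_type"}


def disallowed_asset_fields(asset, allowed_mlm_fields):
    """Return the sorted `mlm:`-prefixed keys of `asset` that are not allowed.

    Keys without the `mlm:` prefix are ignored, so that fields from other
    extensions (for example `eo:bands`) remain acceptable.
    """
    return sorted(
        key
        for key in asset
        if key.startswith("mlm:") and key not in allowed_mlm_fields
    )


asset = {
    "href": "https://example.com/model.pt",  # placeholder URL
    "roles": ["mlm:model"],
    "mlm:artifact_type": "torch.save",
    "mlm:name": "my-model",  # Item-only field, so not allowed on an Asset
    "eo:bands": [],
}
print(disallowed_asset_fields(asset, ASSET_ALLOWED_SUBSET))  # ['mlm:name']
```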

### Fixed
- Fix missing `mlm:artifact_type` property check for a Model Asset definition
(fixes <https://github.com/stac-extensions/mlm/issues/42>).
The `mlm:artifact_type` field is now required on any Asset with the `mlm:model` role and disallowed on Assets without it.
- Fix check of disallowed unknown/undefined `mlm:`-prefixed fields
(fixes [#41](https://github.com/stac-extensions/mlm/issues/41)).
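
The `mlm:artifact_type` rule from the first "Fixed" entry amounts to an if-and-only-if between the `mlm:model` role and the field. A minimal sketch in plain Python, assuming Assets are plain dictionaries keyed by name: `artifact_type_errors` is a hypothetical helper, and it checks only presence and a non-empty string value, which is narrower than full schema validation.

```python
MODEL_ROLE = "mlm:model"
ARTIFACT_FIELD = "mlm:artifact_type"


def artifact_type_errors(assets):
    """Map each offending Asset key to a short description of the problem."""
    errors = {}
    for key, asset in assets.items():
        roles = asset.get("roles")
        is_model = isinstance(roles, list) and MODEL_ROLE in roles
        value = asset.get(ARTIFACT_FIELD)
        has_artifact = isinstance(value, str) and len(value) >= 1
        if is_model and not has_artifact:
            # Model Asset without a usable artifact type.
            errors[key] = f"'{MODEL_ROLE}' role requires a non-empty '{ARTIFACT_FIELD}'"
        elif not is_model and ARTIFACT_FIELD in asset:
            # Artifact type on an Asset that is not a model.
            errors[key] = f"'{ARTIFACT_FIELD}' is only allowed with the '{MODEL_ROLE}' role"
    return errors


assets = {
    "model": {"roles": ["mlm:model", "mlm:weights"], "mlm:artifact_type": "torch.save"},
    "source_code": {"roles": ["code", "metadata"]},
}
print(artifact_type_errors(assets))  # {}
```

This is presumably also why the example Items in this PR drop the `mlm:model` role from the source code Asset: keeping the role there would make `mlm:artifact_type` mandatory on that Asset as well.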

86 changes: 50 additions & 36 deletions README.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions examples/item_bands_expression.json
@@ -150,6 +150,7 @@
"mlm:model",
"mlm:weights"
],
"mlm:artifact_type": "torch.save",
"$comment": "Following 'eo:bands' is required to fulfil schema validation of 'eo' extension.",
"eo:bands": [
{
3 changes: 2 additions & 1 deletion examples/item_basic.json
@@ -120,7 +120,8 @@
"type": "text/html",
"roles": [
"mlm:model"
]
],
"mlm:artifact_type": "torch.save"
}
},
"links": [
2 changes: 1 addition & 1 deletion examples/item_eo_and_raster_bands.json
@@ -508,6 +508,7 @@
"mlm:model",
"mlm:weights"
],
"mlm:artifact_type": "torch.save",
"$comment": "Following 'eo:bands' is required to fulfil schema validation of 'eo' extension.",
"eo:bands": [
{
@@ -557,7 +558,6 @@
"description": "Source code to run the model.",
"type": "text/x-python",
"roles": [
"mlm:model",
"code",
"metadata"
]
1 change: 1 addition & 0 deletions examples/item_eo_bands.json
@@ -285,6 +285,7 @@
"mlm:model",
"mlm:weights"
],
"mlm:artifact_type": "torch.save",
"$comment": "Following 'eo:bands' is required to fulfil schema validation of 'eo' extension.",
"eo:bands": [
{
2 changes: 1 addition & 1 deletion examples/item_eo_bands_summarized.json
@@ -377,6 +377,7 @@
"mlm:model",
"mlm:weights"
],
"mlm:artifact_type": "torch.save",
"$comment": "Following 'eo:bands' is required to fulfil schema validation of 'eo' extension.",
"eo:bands": [
{
@@ -426,7 +427,6 @@
"description": "Source code to run the model.",
"type": "text/x-python",
"roles": [
"mlm:model",
"code",
"metadata"
]
1 change: 1 addition & 0 deletions examples/item_multi_io.json
@@ -227,6 +227,7 @@
"mlm:model",
"mlm:weights"
],
"mlm:artifact_type": "torch.save",
"raster:bands": [
{
"name": "B02 - blue",
1 change: 1 addition & 0 deletions examples/item_raster_bands.json
@@ -216,6 +216,7 @@
"mlm:model",
"mlm:weights"
],
"mlm:artifact_type": "torch.save",
"raster:bands": [
{
"name": "B01",
180 changes: 135 additions & 45 deletions json-schema/schema.json
@@ -1,5 +1,5 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$schema": "https://json-schema.org/draft/2019-09/schema#",
"$id": "https://stac-extensions.github.io/mlm/v1.3.0/schema.json",
"title": "Machine Learning Model STAC Extension Schema",
"description": "This object represents the metadata for a Machine Learning Model (MLM) used in STAC documents.",
@@ -20,16 +20,19 @@
"then": {
"allOf": [
{
"$comment": "Schema to validate the MLM fields under Item properties or Assets properties.",
"description": "Schema to validate the MLM fields permitted under Item properties or Assets properties.",
"type": "object",
"required": [
"properties",
"assets"
],
"properties": {
"properties": {
"description": "Schema to validate the MLM fields permitted under Item properties.",
"$comment": "Allow properties not defined by MLM prefix to work with other extensions and attributes, but disallow undefined MLM fields.",
"allOf": [
{
"type": "object",
"required": [
"mlm:name",
"mlm:architecture",
@@ -39,18 +42,26 @@
]
},
{
"$ref": "#/$defs/fields"
"$ref": "#/$defs/mlmItemFields"
},
{
"patternProperties": {
"^(?!mlm:)": {}
}
}
]
],
"unevaluatedProperties": false
},
"assets": {
"type": "object",
"additionalProperties": {
"allOf": [
{
"$ref": "#/$defs/fields"
}
]
"description": "Schema to validate the MLM fields permitted only under Assets properties.",
"$comment": "Allow properties not defined by MLM prefix to work with other extensions and attributes, but disallow undefined MLM fields.",
"$ref": "#/$defs/mlmAssetFields",
"patternProperties": {
"^(?!mlm:)": {}
},
"unevaluatedProperties": false
}
}
}
@@ -63,8 +74,12 @@
"$ref": "#/$defs/AnyBandsRef"
},
{
"$comment": "Schema to validate model role requirement.",
"$comment": "Schema to validate that at least one Asset defines a model role.",
"$ref": "#/$defs/AssetModelRoleMinimumOneDefinition"
},
{
"$comment": "Schema to validate that the Asset model properties are required if and only if the Asset has the model role.",
"$ref": "#/$defs/AssetModelRequiredProperties"
}
]
}
@@ -89,19 +104,19 @@
"summaries": {
"type": "object",
"additionalProperties": {
"$ref": "#/$defs/fields"
"$ref": "#/$defs/mlmCollectionFields"
}
},
"assets": {
"type": "object",
"additionalProperties": {
"$ref": "#/$defs/fields"
"$ref": "#/$defs/mlmAssetFields"
}
},
"item_assets": {
"type": "object",
"additionalProperties": {
"$ref": "#/$defs/fields"
"$ref": "#/$defs/mlmAssetFields"
}
}
}
@@ -257,12 +272,10 @@
}
}
},
"fields": {
"mlmSharedFields": {
"description": "MLM fields that apply at any level (Collection, Item, Asset, Link).",
"type": "object",
"properties": {
"mlm:name": {
"$ref": "#/$defs/mlm:name"
},
"mlm:architecture": {
"$ref": "#/$defs/mlm:architecture"
},
Expand Down Expand Up @@ -301,6 +314,30 @@
},
"mlm:accelerator_count": {
"$ref": "#/$defs/mlm:accelerator_count"
}
}
},
"mlmCollectionFields": {
"$ref": "#/$defs/mlmSharedFields"
},
"mlmItemFields": {
"description": "MLM fields that apply at the Item level.",
"type": "object",
"allOf": [
{
"$ref": "#/$defs/mlmItemOnlyFields"
},
{
"$ref": "#/$defs/mlmSharedFields"
}
]
},
"mlmItemOnlyFields": {
"description": "MLM fields that apply only within Item properties.",
"type": "object",
"properties": {
"mlm:name": {
"$ref": "#/$defs/mlm:name"
},
"mlm:input": {
"$ref": "#/$defs/mlm:input"
@@ -311,12 +348,26 @@
"mlm:hyperparameters": {
"$ref": "#/$defs/mlm:hyperparameters"
}
},
"$comment": "Allow properties not defined by MLM prefix to allow combination with other extensions.",
"patternProperties": {
"^(?!mlm:)": {}
},
"additionalProperties": false
}
},
"mlmAssetFields": {
"allOf": [
{
"$ref": "#/$defs/mlmSharedFields"
},
{
"$ref": "#/$defs/mlmAssetOnlyFields"
}
]
},
"mlmAssetOnlyFields": {
"description": "MLM fields that apply only within an Asset.",
"type": "object",
"properties": {
"mlm:artifact_type": {
"$ref": "#/$defs/mlm:artifact_type"
}
}
},
"mlm:name": {
"type": "string",
@@ -369,6 +420,15 @@
"type": "string",
"pattern": "^(0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(?:-((?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+([0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?$"
},
"mlm:artifact_type": {
"type": "string",
"minLength": 1,
"examples": [
"torch.save",
"torch.jit.save",
"torch.export.save"
]
},
"mlm:tasks": {
"type": "array",
"uniqueItems": true,
@@ -845,6 +905,57 @@
"DataType": {
"$ref": "https://stac-extensions.github.io/raster/v1.1.0/schema.json#/definitions/bands/items/properties/data_type"
},
"HasArtifactType": {
"$comment": "Used to check the artifact type property that is required by a Model Asset annotated by 'mlm:model' role.",
"type": "object",
"required": [
"mlm:artifact_type"
],
"properties": {
"mlm:artifact_type": {
"$ref": "#/$defs/mlm:artifact_type"
}
}
},
"AssetModelRole": {
"$comment": "Used to check the presence of 'mlm:model' role required by a Model Asset.",
"type": "object",
"required": [
"roles"
],
"properties": {
"roles": {
"type": "array",
"contains": {
"const": "mlm:model"
},
"minItems": 1
}
}
},
"AssetModelRequiredProperties": {
"$comment": "Asset containing the model definition must indicate both the 'mlm:model' role and an artifact type.",
"required": [
"assets"
],
"properties": {
"assets": {
"additionalProperties": {
"if": {
"$ref": "#/$defs/AssetModelRole"
},
"then": {
"$ref": "#/$defs/HasArtifactType"
},
"else": {
"not": {
"$ref": "#/$defs/HasArtifactType"
}
}
}
}
}
},
"AssetModelRoleMinimumOneDefinition": {
"$comment": "At least one Asset must provide the model definition indicated by the 'mlm:model' role.",
"required": [
@@ -855,15 +966,7 @@
"properties": {
"assets": {
"additionalProperties": {
"properties": {
"roles": {
"type": "array",
"items": {
"const": "mlm:model"
},
"minItems": 1
}
}
"$ref": "#/$defs/AssetModelRole"
}
}
}
@@ -891,19 +994,6 @@
}
]
},
"AssetModelRole": {
"required": [
"roles"
],
"properties": {
"roles": {
"contains": {
"type": "string",
"const": "mlm:model"
}
}
}
},
"ModelBands": {
"description": "List of bands (if any) that compose the input. Band order represents the index position of the bands.",
"$comment": "No 'minItems' here to support model inputs not using any band (other data source).",
Expand Down
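
One detail of the schema.json change is easy to miss: the inline subschema removed from `AssetModelRoleMinimumOneDefinition` used `items` with `const`, while the shared `AssetModelRole` definition it now references uses `contains`. In JSON Schema, `items` constrains every array element, whereas `contains` requires at least one matching element. The sketch below mimics both keywords in plain Python for a `roles` array that is present. The function names are hypothetical and the surrounding schema logic is not reproduced.

```python
def all_items_equal(roles, const="mlm:model", min_items=1):
    """Mimic `items: {const: ...}` combined with `minItems` on an array."""
    return (
        isinstance(roles, list)
        and len(roles) >= min_items
        and all(role == const for role in roles)
    )


def contains_value(roles, const="mlm:model", min_items=1):
    """Mimic `contains: {const: ...}` combined with `minItems` on an array."""
    return (
        isinstance(roles, list)
        and len(roles) >= min_items
        and any(role == const for role in roles)
    )


roles = ["mlm:model", "mlm:weights"]  # roles used by the example Items in this PR
print(all_items_equal(roles))  # False
print(contains_value(roles))   # True
```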
1 change: 1 addition & 0 deletions stac_model/examples.py
@@ -130,6 +130,7 @@ def eurosat_resnet() -> ItemMLModelExtension:
"mlm:weights",
"data",
],
extra_fields={"mlm:artifact_type": "torch.save"}
),
"source_code": pystac.Asset(
title="Model implementation.",