Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: decouple 'curie' uri from 'uri' uri #202

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Silvanoc
Copy link

The URIs assigned to the different types identify the types themselves, as a consequence two different types with the same URI would be expected to be synonyms (different names with the same meaning).

This is the case for the types CURIE and URIorCURIE:

  1. CURIE had the URI 'xsd:anyURI' (the same as for URIs).
  2. URIorCURIE had the URI 'xsd:string' (the same as for strings).

This patch assign both CURIE and URIorCURIE their own unique URIs to differentiate the different types from each other.

The URIs assigned to the different types identify the types themselves,
as a consequence two different types with the same URI would be expected
to be synonyms (different names with the same meaning).

This is the case for the types CURIE and URIorCURIE:
1. CURIE had the URI 'xsd:anyURI' (the same as for URIs).
2. URIorCURIE had the URI 'xsd:string' (the same as for strings).

This patch assign both CURIE and URIorCURIE their own unique URIs to
differentiate the different types from each other.

Signed-off-by: Silvano Cirujano Cuesta <silvano.cirujano-cuesta@siemens.com>
@Silvanoc Silvanoc marked this pull request as draft July 22, 2024 21:22
@matentzn
Copy link

As a concerned citizen I would like to suggest this PR to be reviewed by at least 5 LinkML experts before it is merged. :P Changes to the base types whatever they are make me a bit nervous (thinking of effects on validation, linkml-dumper classes etc, not saying there are consequences, just... putting my worry out there).

@Silvanoc
Copy link
Author

As a concerned citizen I would like to suggest this PR to be reviewed by at least 5 LinkML experts before it is merged. :P Changes to the base types whatever they are make me a bit nervous (thinking of effects on validation, linkml-dumper classes etc, not saying there are consequences, just... putting my worry out there).

@matentzn you are fully right and I share your concern, therefore this PR is in Draft status 😉

IMO this is a meaningful thing to do, because the current status is wrong. The URIs of the LinkML elements are their identifiers and having the same URI on multiple different elements means that they are the same. But URI ≠ CURIE ≠ URIorCURIE, right?

It is clear that when tools have "integrated" the error into their implementation fixing them might be the hardest task and we should not underestimate it.

I am more than happy to discuss:

  1. If it is erroneous as I see it.
  2. What is the impact in the ecosystem.
  3. How to address this issue:
    • Fixing it
    • Ignoring it and expecting following consumers to work-around this issue

@matentzn
Copy link

I totally agree! I think fixing this is a good idea..

@Silvanoc
Copy link
Author

@matentzn I have just realized that this comment from @turbomam is probably signaling one place where this change might break something else.
So, yes, I will not move this PR from Draft until we have the clarity that we are not breaking anything inadvertently. 😄

@mahdanoura
Copy link

mahdanoura commented Jul 28, 2024

@Silvanoc , another type of place which might be affected by your change is in cases where variables of type CURIE are not assigned to that type, rather directly to a string. As an example, in the LinkML SchemaDefinition class,

class_class_curie variable is defined as str. Therefore, changing the CURIE type as you suggest would not have any effect on these str assignments. Something like this would have been expected:
class_name: ClassVar[CURIE]

However, I'm not deep into LinkML's code base and there could have been another reason for this assignment.

@Silvanoc
Copy link
Author

@Silvanoc , another type of place which might be affected by your change is in cases where variables of type CURIE are not assigned to that type, rather directly to a string. As an example, in the LinkML SchemaDefinition class,

class_class_curie variable is defined as str. Therefore, changing the CURIE type as you suggest would not have any effect on these str assignments. Something like this would have been expected: class_name: ClassVar[CURIE]

However, I'm not deep into LinkML's code base and there could have been another reason for this assignment.

Nice catch! I will have to take a deeper look at it.

In general IMO the introduction of CURIEs, as it was done, has lots of issues.

  • Partially due to the ambiguity between CURIEs and URIs under certain conditions.
  • Partially due to the unconventional type "UriOrCurie". JSON-Schema can easily handle it with anyOf, but Pydantic has a problem with it. And being lax by making URI, CURIE and URIorCURIE a string, makes Pydantic happy.
  • In general handling these types is really complex and problematic due to the two above mentioned issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants