implement data model for metabolite identifiers #8

Open
hredestig opened this issue Sep 5, 2017 · 0 comments

hredestig commented Sep 5, 2017

  • design a data model for metabolite identifiers. Suggestion: create an internal metabolite identifier and attach all new identifiers to it, so that every mapping becomes a 1-hop query (query speed decreases rapidly with the number of hops required)
    • add indices on chebi
    • add unique constraints on 'strong' identifiers, e.g. inchi / chebi / smiles / cas..?
  • write a proper data-import procedure that adds all identifiers from
    • metanetx (draft implementation exists)
    • chebi, used to add the following to all identifiers
      - annotation: trivial names, mass
      - strong identifiers: inchikey, inchi, with and without hydrogen and charge layer, cas, smiles
      - sync with iloop to make it clear which chebi identifiers are known to iloop
    • our own manually curated list of identifier pairs
    • merge sets of identifiers into a single set if
      - any strong identifier is shared between them.
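
The merge rule in the last bullet could be sketched roughly as below, using a union-find over the identifier sets. This is only an illustration: the `STRONG` namespace list, the `(namespace, value)` tuple representation and the function name are assumptions, and the example values are made up.

```python
from collections import defaultdict

# Assumed list of 'strong' namespaces; the final list is still an open question.
STRONG = {"inchi", "inchikey", "chebi", "smiles", "cas"}


def merge_identifier_sets(id_sets):
    """Merge sets of (namespace, value) pairs that share any strong identifier."""
    parent = list(range(len(id_sets)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # strong identifier -> index of the first set that contained it
    for i, ids in enumerate(id_sets):
        for ident in ids:
            if ident[0] not in STRONG:
                continue  # weak identifiers (e.g. names) never trigger a merge
            if ident in owner:
                parent[find(i)] = find(owner[ident])
            else:
                owner[ident] = i

    merged = defaultdict(set)
    for i, ids in enumerate(id_sets):
        merged[find(i)] |= set(ids)
    return list(merged.values())


# Made-up example values: the first two sets share a chebi identifier and are
# merged; the third only shares a (weak) name and stays separate.
example = [
    {("chebi", "CHEBI:1"), ("name", "a")},
    {("chebi", "CHEBI:1"), ("metanetx", "MNX:x")},
    {("name", "a"), ("metanetx", "MNX:y")},
]
result = merge_identifier_sets(example)
```

Merging is transitive here: if set A shares a strong identifier with B, and B with C, all three end up in one set even when A and C share nothing directly.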

Would solve #7 and #6
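
For the data model in the first bullet, a minimal sketch of one possible shape, assuming a relational store (SQLite via Python's `sqlite3`, chosen only so the example is self-contained). Table, column and index names are hypothetical, as is the list of strong namespaces, and the identifier values are made up. Every external identifier points at one internal metabolite id, so any mapping goes through that id with a single self-join instead of chaining through intermediate identifiers.

```python
import sqlite3

# Hypothetical schema; names are illustrative only.
SCHEMA = """
CREATE TABLE metabolite (
    id INTEGER PRIMARY KEY               -- internal metabolite identifier
);
CREATE TABLE identifier (
    metabolite_id INTEGER NOT NULL REFERENCES metabolite(id),
    namespace     TEXT NOT NULL,         -- e.g. 'chebi', 'inchi', 'metanetx'
    value         TEXT NOT NULL
);
-- lookup index for (namespace, value), which covers lookups by chebi
CREATE INDEX identifier_lookup ON identifier (namespace, value);
CREATE INDEX identifier_by_metabolite ON identifier (metabolite_id);
-- unique constraint restricted to the assumed 'strong' namespaces
CREATE UNIQUE INDEX strong_identifier_unique ON identifier (namespace, value)
    WHERE namespace IN ('inchi', 'inchikey', 'chebi', 'smiles', 'cas');
"""


def map_identifier(conn, namespace, value, target_namespace):
    """Return target_namespace values of the metabolite holding (namespace, value)."""
    rows = conn.execute(
        """
        SELECT dst.value
        FROM identifier AS src
        JOIN identifier AS dst ON dst.metabolite_id = src.metabolite_id
        WHERE src.namespace = ? AND src.value = ? AND dst.namespace = ?
        ORDER BY dst.value
        """,
        (namespace, value, target_namespace),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO metabolite (id) VALUES (1)")
conn.executemany(
    "INSERT INTO identifier (metabolite_id, namespace, value) VALUES (?, ?, ?)",
    [(1, "chebi", "CHEBI:1"), (1, "metanetx", "MNX:x"), (1, "name", "a")],
)
mapped = map_identifier(conn, "metanetx", "MNX:x", "chebi")
```

With the partial unique index, attaching an already known strong identifier to a second metabolite raises `sqlite3.IntegrityError`, which an import procedure could treat as the signal to merge the two metabolites.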
