implement data model for metabolite identifiers #8

hredestig · 2017-09-05T11:52:26Z

Design a data model for metabolite identifiers. The suggestion is to create an internal metabolite identifier and attach every external identifier to it, so that any mapping between two identifiers becomes a 1-hop query (query speed decreases rapidly with the number of hops that need to be made).
- add indices on ChEBI
- add unique constraints on 'strong' identifiers, e.g. InChI / ChEBI / SMILES / CAS (others?)
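
To make the proposal concrete, here is a minimal sketch of such a model. It uses SQLite via Python's `sqlite3` purely for illustration; the project's actual database is not stated in this issue, and all table, column, and function names (`metabolite`, `identifier`, `strong`, `map_identifier`, ...) are hypothetical. Strong identifiers get a unique constraint through a partial unique index, while weak ones such as trivial names may be shared.

```python
import sqlite3

# Hypothetical schema: one row per internal metabolite, and every
# external identifier attached directly to that internal id.
SCHEMA = """
CREATE TABLE metabolite (
    id INTEGER PRIMARY KEY              -- internal metabolite identifier
);
CREATE TABLE identifier (
    metabolite_id INTEGER NOT NULL REFERENCES metabolite(id),
    namespace     TEXT NOT NULL,        -- e.g. 'chebi', 'inchi', 'cas', 'name'
    value         TEXT NOT NULL,
    strong        INTEGER NOT NULL DEFAULT 0
);
-- Lookup index (covers queries by ChEBI id or any other namespace).
CREATE INDEX identifier_lookup ON identifier (namespace, value);
CREATE INDEX identifier_owner ON identifier (metabolite_id, namespace);
-- A strong identifier may belong to at most one metabolite.
CREATE UNIQUE INDEX strong_identifier_unique
    ON identifier (namespace, value) WHERE strong = 1;
"""


def new_metabolite(conn):
    """Create an internal metabolite and return its id."""
    return conn.execute("INSERT INTO metabolite DEFAULT VALUES").lastrowid


def add_identifier(conn, metabolite_id, namespace, value, strong=False):
    """Attach one external identifier to an internal metabolite."""
    conn.execute(
        "INSERT INTO identifier (metabolite_id, namespace, value, strong) "
        "VALUES (?, ?, ?, ?)",
        (metabolite_id, namespace, value, int(strong)),
    )


def map_identifier(conn, namespace, value, target_namespace):
    """Map an identifier to another namespace via the internal id (one hop)."""
    rows = conn.execute(
        "SELECT DISTINCT t.value FROM identifier AS s "
        "JOIN identifier AS t ON t.metabolite_id = s.metabolite_id "
        "WHERE s.namespace = ? AND s.value = ? AND t.namespace = ?",
        (namespace, value, target_namespace),
    )
    return sorted(row[0] for row in rows)
```

Every mapping goes source identifier, internal metabolite, target identifier, regardless of which pair of namespaces is involved, which is the 1-hop property the suggestion is after.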
Write a proper data-import procedure that adds all identifiers from:
- MetaNetX (a draft implementation exists)
- ChEBI, used to enrich every entry with
  - annotation: trivial names, mass
  - strong identifiers: InChIKey, InChI (with and without the hydrogen and charge layers), CAS, SMILES
  - a sync with iloop, to make it clear which ChEBI identifiers are known to iloop
- our own manually curated list of identifier pairs
- merge sets of identifiers into a single set if any strong identifier is shared between them
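
The merge rule in the last bullet can be sketched with a small union-find. This is only an illustration under assumptions: identifiers are modelled as `(namespace, value)` tuples, the `STRONG_NAMESPACES` set and the function name `merge_identifier_sets` are hypothetical, and merging is treated as transitive (if A shares a strong identifier with B, and B with C, all three end up in one set).

```python
from collections import defaultdict

# Assumption: which namespaces count as 'strong' is decided elsewhere;
# this set is only a placeholder for the sketch.
STRONG_NAMESPACES = {"inchi", "inchikey", "chebi", "smiles", "cas"}


def merge_identifier_sets(sets):
    """Merge sets of (namespace, value) tuples that share a strong identifier."""
    parent = list(range(len(sets)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner = {}  # strong identifier -> index of the first set holding it
    for index, identifiers in enumerate(sets):
        for identifier in identifiers:
            namespace, _value = identifier
            if namespace not in STRONG_NAMESPACES:
                continue  # weak identifiers (e.g. names) never trigger a merge
            owner = first_owner.setdefault(identifier, index)
            parent[find(index)] = find(owner)

    merged = defaultdict(set)
    for index, identifiers in enumerate(sets):
        merged[find(index)] |= identifiers
    return list(merged.values())
```

Two sets that share only a weak identifier, such as a trivial name, stay separate; that is the reason for distinguishing strong from weak identifiers in the first place.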

This would solve #7 and #6.

hredestig added epic: compound-name-flexibility prioritised labels Sep 5, 2017

kvikshaug removed the epic: compound-name-flexibility label Sep 7, 2018
