consider reconstructing fractional coordinates from distance matrix (via MDS or neural network) #83

Open
sgbaird opened this issue Jun 11, 2022 · 6 comments
Labels
hyperparameter Hyperparameters to consider optimizing

Comments

@sgbaird
Member

sgbaird commented Jun 11, 2022

That is, reconstruct the coordinates using multidimensional scaling (MDS) or a trained network. If a trained network is used, an interesting approach might be to map the representation (including its redundant information) to the non-redundant inputs that are used directly: Structure(lattice, elements, coords).

If using MDS, the hyperparameter would probably be the weight in a weighted average of the direct fractional coordinates and the reconstructed fractional coordinates, i.e. a scalar between 0 and 1.
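A minimal sketch of what the MDS route could look like, assuming the distance matrix holds plain Euclidean distances between Cartesian site positions (periodic minimum-image distances would not embed exactly). The function names `classical_mds` and `blend_coords` and the parameter `weight` are hypothetical, not part of any existing package:

```python
import numpy as np


def classical_mds(dist, n_dims=3):
    """Recover coordinates (up to rotation, translation, reflection)
    from a Euclidean pairwise distance matrix via classical MDS."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double-center the squared distances to obtain the Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the n_dims largest eigenpairs; clip small negative noise.
    order = np.argsort(eigvals)[::-1][:n_dims]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def blend_coords(direct, reconstructed, weight):
    """Weighted average; weight=1 keeps only the direct coordinates.
    Assumes both arrays are already in the same frame and units."""
    direct = np.asarray(direct, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    return weight * direct + (1.0 - weight) * reconstructed
```

The blend only makes sense once the reconstructed coordinates have been aligned to the cell and converted to fractional coordinates, which this sketch does not attempt.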

See also:

Anand, N.; Eguchi, R.; Huang, P.-S. Fully Differentiable Full-Atom Protein Backbone Generation. 2019. https://openreview.net/forum?id=SJxnVL8YOV

which uses pairwise distance matrix reconstruction.

Good discussion and examples about reconstructing directly vs. using a neural network in:

Ovchinnikov, S.; Huang, P.-S. Structure-Based Protein Design with Deep Learning. Current Opinion in Chemical Biology 2021, 65, 136–144. https://doi.org/10.1016/j.cbpa.2021.08.004

@sgbaird sgbaird added the hyperparameter Hyperparameters to consider optimizing label Jun 11, 2022
@kjappelbaum
Contributor

Lilienfeld et al. also generated distance matrices using an ML model; they used dgsol to go from the distance matrix to coordinates. Here's the download: https://www.mcs.anl.gov/~more/dgsol/. I didn't see any Python bindings, though.

@sgbaird
Member Author

sgbaird commented Jun 11, 2022

Interesting that it supports sparse pairwise distance matrices. Looking at the citations to one of the early dgsol papers, I'm realizing how rich the literature on this topic is, but I'm a bit disappointed by how few implementations (esp. in Python) there are on GitHub [1][2]. The idea of constraining the reconstruction with lower and upper bounds and uncertainties came up in the context of molecular reconstruction.

I guess there's a CMake version of dgsol. I'm vaguely familiar with building external software (e.g. C++ code) as part of packaging for conda. dgsol or something similar might still be worth exploring.

EDIT: https://github.com/wjakob/nanobind_example could be helpful. Not sure.

@sgbaird
Member Author

sgbaird commented Jun 20, 2022

Implementing this feature would also require having an origin and some kind of alignment/orientation relative to the unit cell. Hmm..

#119

@sgbaird
Member Author

sgbaird commented Jun 23, 2022

Alternative would be to use site-to-site x,y,z vectors as RGB encodings rather than a pairwise distance matrix, which is what @michaeldalverson is doing right now.
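As a rough illustration of the site-to-site vector idea (an assumed encoding, not necessarily what is being implemented), the x, y, z components of each fractional displacement could be wrapped by the minimum-image convention and scaled into three 8-bit channels. The function name `displacement_rgb` is hypothetical:

```python
import numpy as np


def displacement_rgb(frac_coords):
    """Encode site-to-site fractional displacement vectors as an
    (n_sites, n_sites, 3) uint8 array, one channel per component."""
    frac = np.asarray(frac_coords, dtype=float)
    # diff[i, j] is the vector from site i to site j.
    diff = frac[None, :, :] - frac[:, None, :]
    # Wrap each component into [-0.5, 0.5] (minimum-image convention).
    diff -= np.round(diff)
    # Shift to [0, 1] and quantize to 0..255.
    return np.round((diff + 0.5) * 255).astype(np.uint8)
```

Unlike a distance matrix, any single row of this array gives every site's position relative to one reference site, so no MDS step is needed, although the absolute origin is still undetermined.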

@kjappelbaum
Contributor

> Implementing this feature would also require having an origin and some kind of alignment/orientation relative to the unit cell. Hmm..

Once you have both the fractional coordinates and the coordinates reconstructed from the pairwise distance matrix, you can compute the optimal rotation that superimposes one set onto the other, e.g. using the Kabsch algorithm.
However, I'm not sure what one should do if they do not match up: average them, take one (which one?), fail if the disagreement is too large ...
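A minimal NumPy sketch of the Kabsch step, assuming both coordinate sets are Cartesian, have the same number of sites, and list the sites in the same order. The function name `kabsch_align` and the `allow_reflection` flag are hypothetical; the flag is included because a distance matrix cannot distinguish a structure from its mirror image:

```python
import numpy as np


def kabsch_align(mobile, target, allow_reflection=False):
    """Superimpose `mobile` onto `target`.
    Returns (aligned_mobile, transform_matrix, rmsd)."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    # Remove translation by centering both point sets.
    target_center = target.mean(axis=0)
    p = mobile - mobile.mean(axis=0)
    q = target - target_center
    # SVD of the 3x3 cross-covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    d = 1.0
    if not allow_reflection:
        # Force det(rotation) = +1 so the result is a proper rotation.
        d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    aligned = p @ rotation.T + target_center
    rmsd = np.sqrt(np.mean(np.sum((aligned - target) ** 2, axis=1)))
    return aligned, rotation, rmsd
```

The returned RMSD is one possible quantity to threshold on when deciding whether the two sets agree well enough to average or whether to fail.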

@sgbaird
Member Author

sgbaird commented Jul 15, 2022

Came across another repo that might be of interest here: https://github.com/stevenygd/PointFlow
