training data setup #5

sacombs · 2022-10-27T17:56:16Z

I would like to provide my own dataset for retraining torsional-diffusion. There are some fields in the pickle file for which I do not know what values to use. For example, an entry in the conformers dictionary looks like this:

{'geom_id': 123368967, 'set': 1, 'degeneracy': 3, 'totalenergy': -23.59133734, 'relativeenergy': 0.0, 'boltzmannweight': 0.8585, 'conformerweights': [0.28617, 0.28617, 0.28616], 'rd_mol': <rdkit.Chem.rdchem.Mol at 0x7f7b42014bd0>}

What should I put for boltzmannweight and degeneracy? Is there a setup script to take molfiles and convert them into the dataset for training?


MatthewMasters · 2022-11-16T08:31:50Z

Degeneracy is not used in the code, so it's safe to exclude. The Boltzmann weight can be calculated if you know the energy and temperature: w = exp(-E / (kB * T)), where E is the energy, T is the temperature, and kB is the Boltzmann constant.
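The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the repository: energies are assumed to be in kcal/mol, the temperature defaults to 298.15 K, and the weights are normalized so they sum to 1 over the conformers of one molecule (the example entry, with relativeenergy 0.0 and boltzmannweight 0.8585, suggests normalized values, but that is an inference). The function name `boltzmann_weights` is hypothetical.

```python
import math

# Boltzmann constant in kcal/(mol*K); assumes energies are given in kcal/mol.
K_B_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(energies, temperature=298.15):
    """Return normalized Boltzmann weights for the conformers of one molecule.

    energies: list of conformer energies in kcal/mol (assumed unit).
    temperature: temperature in kelvin (assumed default).
    """
    if not energies:
        return []
    # Shift by the minimum energy so the exponentials stay well-behaved;
    # the shift cancels out after normalization.
    e_min = min(energies)
    kt = K_B_KCAL_PER_MOL_K * temperature
    raw = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    # Three hypothetical conformers with relative energies in kcal/mol.
    for energy, weight in zip([0.0, 1.0, 2.0], boltzmann_weights([0.0, 1.0, 2.0])):
        print(f"E = {energy:.1f} kcal/mol -> weight = {weight:.4f}")
```

If your energies are in another unit (for example Hartree), convert them first or swap in the matching value of kB.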

gcorso · 2022-11-16T19:24:27Z

Sorry for the delay, and thank you very much @MatthewMasters for the answer!
Everything Matthew said is correct. Moreover, if you don't use Boltzmann-weighted sampling (and not using it is how the ML community trains and evaluates these methods), you only need the rd_mol!
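For the molfile question, a minimal sketch of such a conversion might look like the following. It is not a script from the repository. The per-conformer `rd_mol` key comes from the example in the question; the top-level layout (a dict with `smiles` and a `conformers` list) and the helper names `build_entry`, `load_molfiles`, and `write_entry` are assumptions made for illustration, so check them against the dataset loader before training. RDKit is imported only inside `load_molfiles`.

```python
import pickle


def build_entry(mols, smiles=None):
    """Wrap molecule objects in one dataset entry.

    Only 'rd_mol' is set per conformer, following the advice above.
    The top-level keys are an assumed layout, not a confirmed schema.
    """
    return {
        "smiles": smiles,
        "conformers": [{"rd_mol": mol} for mol in mols],
    }


def load_molfiles(paths):
    """Read each molfile into an RDKit Mol, keeping explicit hydrogens."""
    from rdkit import Chem  # imported here so the rest works without RDKit

    mols = []
    for path in paths:
        mol = Chem.MolFromMolFile(str(path), removeHs=False)
        if mol is None:
            raise ValueError(f"RDKit could not parse {path}")
        mols.append(mol)
    return mols


def write_entry(entry, out_path):
    """Pickle one entry to disk."""
    with open(out_path, "wb") as handle:
        pickle.dump(entry, handle)
```

A typical call would be `write_entry(build_entry(load_molfiles(["conf1.mol", "conf2.mol"]), smiles="CCO"), "mol.pickle")`, where the file names and SMILES string are placeholders and every molfile is a conformer of the same molecule.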
