readme edit
AlexandraVolokhova committed Nov 13, 2023
1 parent 436890f commit 0d785d4
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions scripts/conformer/README.md
@@ -1,13 +1,13 @@
This folder contains scripts for working with the GEOM dataset and running RDKit-based baselines.

## Calculating statistics for GEOM dataset
- The script `geom_stats.py` extracts statistical information from molecular conformation data in the GEOM dataset using the RDKit library. The GEOM dataset is expected to be in the "rdkit_folder" format (tutorial and downloading links are here: https://github.com/learningmatter-mit/geom/tree/master). This script parses the dataset, calculates various statistics, and outputs the results to a CSV file.
+ The script `geom_stats.py` extracts statistical information from molecular conformation data in the GEOM dataset using the RDKit library. The GEOM dataset is expected to be in the "rdkit_folder" format (tutorial and downloading links are here: https://github.com/learningmatter-mit/geom/tree/master). This script parses the dataset, calculates relevant statistics, and outputs the results to a CSV file.

Statistics collected include:
* SMILES representation of the molecule.
* Whether the molecule is self-consistent, i.e. its conformations in the dataset correspond to the same SMILES.
- * Whether the the milecule is consistent with the RDKit, i.e. all conformations in the dataset correspond to the same SMILES and this SMILES is the same as stored in the dataset.
- * The number of rotatable torsion angles in the molecular conformation (both from GEOM and RDKit).
+ * Whether the molecule is consistent with RDKit, i.e. all conformations in the dataset correspond to the same SMILES and this SMILES is the same as stored in the dataset.
+ * The number of rotatable torsion angles in the molecular conformation (both from GEOM conformers and the RDKit-generated graph).
* Whether the molecule contains hydrogen torsion angles.
* The total number of unique conformations for the molecule.
* The number of heavy atoms in the molecule.
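
The two consistency flags in the list above can be illustrated with a small sketch. This is not the code of `geom_stats.py`. It assumes the per-conformer SMILES strings have already been computed (for example with RDKit) and canonicalised the same way as the SMILES stored in the dataset. The function names and CSV column names are hypothetical.

```python
import csv
import io


def consistency_flags(dataset_smiles, conformer_smiles):
    """Return the two consistency flags for one molecule.

    dataset_smiles: SMILES string stored in the dataset for the molecule.
    conformer_smiles: one SMILES string per conformer, assumed to be
        canonicalised the same way as dataset_smiles (hypothetical input).
    """
    unique = set(conformer_smiles)
    # Self-consistent: every conformer maps to the same SMILES.
    self_consistent = len(unique) == 1
    # Consistent with the dataset: that single SMILES also equals the stored one.
    rdkit_consistent = self_consistent and unique == {dataset_smiles}
    return {
        "smiles": dataset_smiles,
        "self_consistent": self_consistent,
        "rdkit_consistent": rdkit_consistent,
        "n_conformers": len(conformer_smiles),
    }


def write_stats_csv(rows, handle):
    """Write one CSV row per molecule (column names are illustrative)."""
    fieldnames = ["smiles", "self_consistent", "rdkit_consistent", "n_conformers"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


rows = [
    consistency_flags("CCO", ["CCO", "CCO"]),    # consistent on both counts
    consistency_flags("CC=O", ["CC=O", "C=CO"]),  # conformers disagree
    consistency_flags("C=CO", ["CC=O", "CC=O"]),  # agree, but differ from dataset
]

buffer = io.StringIO()
write_stats_csv(rows, buffer)
csv_text = buffer.getvalue()
```

In practice the output would go to a file opened with `newline=""` rather than an in-memory buffer; the buffer only keeps the sketch self-contained.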