This program is designed specifically for Open Enventory to fix the issue of molecules with missing structures (those that could not be extracted through "Read data from supplier").
This program does the following:
- Connects to the MySQL database, looks in the 'molecule' table of the specified database, and finds the molecules with a missing structure (smiles)
- The folder "missing_mol_files" must be created inside /var/lib/mysql with 'mysql' as owner (chown mysql:mysql)
- Tries to download mol files from various sources into the folder /var/lib/mysql/missing_mol_files
- Updates those SQL entries with the newly downloaded mol files
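The first step above (finding molecules with no structure) can be sketched as follows. This is only an illustration, not the program's actual query: it uses an in-memory SQLite database as a stand-in for MySQL so that it runs anywhere, and the column names `molecule_id` and `cas_nr` are assumptions. Only the 'molecule' table and the smiles field are named in this README.

```python
import sqlite3


def find_missing_structures(conn):
    """Return (molecule_id, cas_nr) for rows whose smiles is NULL or empty."""
    cur = conn.execute(
        "SELECT molecule_id, cas_nr FROM molecule "
        "WHERE smiles IS NULL OR smiles = '' "
        "ORDER BY molecule_id"
    )
    return cur.fetchall()


def demo():
    # Stand-in database: SQLite in memory instead of the real MySQL server.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecule ("
        "molecule_id INTEGER PRIMARY KEY, cas_nr TEXT, smiles TEXT)"
    )
    conn.executemany(
        "INSERT INTO molecule VALUES (?, ?, ?)",
        [
            (1, "64-17-5", "CCO"),  # has a structure, should be skipped
            (2, "67-64-1", None),   # NULL smiles, should be found
            (3, "71-43-2", ""),     # empty smiles, should be found
        ],
    )
    return find_missing_structures(conn)


if __name__ == "__main__":
    for molecule_id, cas_nr in demo():
        print(molecule_id, cas_nr)
```

Checking for both NULL and the empty string is a defensive choice here, since either could plausibly represent a missing structure.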
Requirements:
- Root access to the server hosting Open Enventory
- Python 3+
- conda (Optional)
  - conda is used to install rdkit and molvs, which clean mol files (e.g. convert explicit hydrogens to implicit hydrogens, etc.)
  - If you don't already have conda, you can install it from its official website. If you are not sure what to install, I suggest you install miniconda3 and NOT anaconda3, for a much smaller package footprint.
- This program is written for a Linux environment; you should be able to use it on other operating systems by changing the location of "download_path"
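One way to make the download location easy to change on another OS is to read it from an environment variable and fall back to the Linux default. This is a sketch only: the environment variable name `OE_DOWNLOAD_PATH` and the function `resolve_download_path` are hypothetical and are not part of the program, which defines "download_path" in its own way.

```python
import os
from pathlib import Path

# Default location named in this README (Linux).
DEFAULT_DOWNLOAD_PATH = Path("/var/lib/mysql/missing_mol_files")


def resolve_download_path(env=None):
    """Return the folder for downloaded mol files.

    OE_DOWNLOAD_PATH is a hypothetical override used only in this sketch.
    """
    env = os.environ if env is None else env
    override = env.get("OE_DOWNLOAD_PATH")
    return Path(override) if override else DEFAULT_DOWNLOAD_PATH


if __name__ == "__main__":
    print(resolve_download_path())
```

Whatever location is used, the MySQL server must still be able to read files from it.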
- Clone this repository:
git clone https://github.com/khoivan88/oe_find_structure.git    # if you have git
# if you don't have git, you can download the zip file and unzip it
- Change into the directory of the program:
cd oe_find_structure
If you have conda installed, skip ahead to the conda instructions below.
- (Optional) Create a Python virtual environment to install the dependencies into:
# you can change "oe_find_structure_venv" to another name too
python3 -m venv oe_find_structure_venv    # Create virtual environment
source oe_find_structure_venv/bin/activate    # Activate the virtual environment on Linux
# oe_find_structure_venv\Scripts\activate    # Activate the virtual environment on Windows
- Install Python dependencies:
pip install -r requirements.txt    # Install all dependencies (without rdkit and molvs)
Instead of the two steps above (creating a virtual environment and installing the dependencies with pip), if you have conda installed, you can do this:
conda env create --prefix oe_find_structure_conda-env --file ./environment.yml    # Create virtual environment with conda and install all dependencies
conda activate ./oe_find_structure_conda-env    # Activate the virtual environment
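Since rdkit and molvs are only installed through the conda route, it can be handy to check which of them the active environment actually provides. The snippet below is a small sketch using only the standard library; the helper name `available` is made up for this example and is not part of the program.

```python
from importlib.util import find_spec


def available(modules):
    """Map each top-level module name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # rdkit and molvs are the optional packages mentioned in this README.
    for name, ok in available(["rdkit", "molvs"]).items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

Run it inside the activated environment (venv or conda) to see what that environment provides.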
- Run the program:
python oe_find_structure/find_structure.py
- Answer the prompts for:
  - the MySQL root password (the password is not shown on screen as you type)
  - the name of the database you want to update (twice, to confirm)
  - the URL of your Open Enventory server (including 'http' or 'https', with no trailing '/')
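The expected shape of the server URL can be illustrated with a small check. This is a sketch of the rule stated above, not the program's own validation (the README does not say whether it validates the input at all); the function name `is_valid_server_url` and the example.org addresses are placeholders.

```python
from urllib.parse import urlparse


def is_valid_server_url(url):
    """True if url starts with http/https, has a host, and has no trailing '/'."""
    parsed = urlparse(url)
    return (
        parsed.scheme in ("http", "https")
        and bool(parsed.netloc)
        and not url.endswith("/")
    )


if __name__ == "__main__":
    for candidate in (
        "https://example.org/open_enventory",  # accepted
        "example.org/open_enventory",          # rejected: no scheme
        "http://example.org/",                 # rejected: trailing slash
    ):
        print(candidate, is_valid_server_url(candidate))
```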
You can enable debug mode (more error printing during the structure search) by adding '-d':
python oe_find_structure/find_structure.py -d    # Enable debug mode
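For readers curious how a flag like '-d' is typically wired up in Python, here is a minimal argparse sketch. How find_structure.py actually parses its arguments is not shown in this README; the long form `--debug` and the function name `build_parser` are assumptions made for this example.

```python
import argparse


def build_parser():
    """Build a parser with a boolean debug switch (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Find missing structures for Open Enventory molecules."
    )
    parser.add_argument(
        "-d",
        "--debug",  # long form is an assumption; the README only shows -d
        action="store_true",
        help="print more error details during the structure search",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("debug mode:", args.debug)
```

With `action="store_true"`, the flag takes no value: it is False by default and becomes True when '-d' is present.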