A Python program for Open Enventory that automatically updates SQL entries in the 'molecule' table with mol files

khoivan88/oe_find_structure

FIND MISSING STRUCTURE FOR CHEMICALS IN OPEN ENVENTORY

This program is designed specifically for Open Enventory. It fixes molecules that are missing structures, that is, structures that could not be extracted through "Read data from supplier".

DETAILS

This program does the following:

  1. Connects to the MySQL database and finds the molecules in the 'molecule' table of the specified database that are missing a structure (SMILES)
  2. Expects a folder "missing_mol_files" inside /var/lib/mysql with 'mysql' as owner (chown mysql:mysql); create it before running the program
  3. Tries to download mol files from various sources into the folder /var/lib/mysql/missing_mol_files
  4. Updates those SQL entries with the newly downloaded mol files
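The find and update steps above can be sketched as follows. This is a minimal illustration only, not the program's actual code: it uses an in-memory SQLite database as a stand-in for MySQL so that it runs anywhere, and the column names `molecule_id`, `cas_nr`, `smiles` and `molfile_blob` are assumptions that should be checked against your own Open Enventory schema.

```python
import sqlite3

# Assumed column names; verify them against your Open Enventory database.
# SQLite uses "?" placeholders; MySQL drivers typically use "%s" instead.
FIND_MISSING = (
    "SELECT molecule_id, cas_nr FROM molecule "
    "WHERE smiles IS NULL OR smiles = '' "
    "ORDER BY molecule_id"
)
UPDATE_STRUCTURE = "UPDATE molecule SET molfile_blob = ? WHERE molecule_id = ?"


def find_missing(conn):
    """Return (molecule_id, cas_nr) rows whose SMILES field is empty."""
    return conn.execute(FIND_MISSING).fetchall()


def store_mol_file(conn, molecule_id, mol_text):
    """Write downloaded mol file text into the matching row."""
    conn.execute(UPDATE_STRUCTURE, (mol_text, molecule_id))
    conn.commit()


def demo():
    """Build a tiny stand-in table, list missing rows, update one of them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecule ("
        "molecule_id INTEGER PRIMARY KEY, cas_nr TEXT, "
        "smiles TEXT, molfile_blob TEXT)"
    )
    conn.executemany(
        "INSERT INTO molecule (molecule_id, cas_nr, smiles) VALUES (?, ?, ?)",
        [(1, "64-17-5", "CCO"), (2, "67-64-1", ""), (3, "71-43-2", None)],
    )
    missing = find_missing(conn)
    store_mol_file(conn, 2, "placeholder mol text")
    return missing, conn
```

Treating both NULL and the empty string as "missing" is a deliberate choice in this sketch, since either can show up when a supplier lookup returns nothing.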

REQUIREMENTS

  • root access to the server hosting Open Enventory
  • Python 3+
  • conda (Optional)
    • conda is used to install rdkit and molvs to clean mol files (e.g. convert explicit hydrogens to implicit hydrogens, etc.)
    • If you don't already have conda, you can install it using the following link. If you are not sure what to install, I suggest miniconda3 rather than anaconda3, for a much smaller package footprint.
  • This program is written for a Linux environment; you should be able to use it on other operating systems by changing the location of the "download_path"
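Before running the program, it can help to confirm that the download folder exists and has the expected owner. The check below is a rough sketch, not part of the program itself: the helper names `folder_is_ready` and `mysql_uid` are hypothetical, and the ownership check relies on Unix-style user IDs, so it is meaningful on Linux only.

```python
from pathlib import Path

# Folder named in this README for Linux; adjust it on other systems.
download_path = Path("/var/lib/mysql/missing_mol_files")


def folder_is_ready(folder, owner_uid):
    """Return True if `folder` is an existing directory owned by `owner_uid`."""
    folder = Path(folder)
    return folder.is_dir() and folder.stat().st_uid == owner_uid


def mysql_uid():
    """Look up the numeric user ID of the 'mysql' account (Unix only)."""
    import pwd  # imported here because the module does not exist on Windows

    return pwd.getpwnam("mysql").pw_uid
```

On the server you would call `folder_is_ready(download_path, mysql_uid())` and create or `chown` the folder if it returns `False`.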

USAGE

  1. Clone this repository:

    git clone https://github.com/khoivan88/oe_find_structure.git    #if you have git
    # if you don't have git, you can download the zip file then unzip
  2. Change into the directory of the program:

    cd oe_find_structure

Without conda installed:

Skip ahead to "With conda installed" if you have conda installed.

  3. (Optional): create a virtual environment for Python to install the dependencies:

    # you can change "oe_find_structure_venv" to another name too
    python3 -m venv oe_find_structure_venv   # Create virtual environment
    source oe_find_structure_venv/bin/activate    # Activate the virtual environment on Linux
    # oe_find_structure_venv\Scripts\activate    # Activate the virtual environment on Windows
  4. Install Python dependencies:

    pip install -r requirements.txt   # Install all dependencies (without rdkit and molvs)

With conda installed:

If you have conda installed, you can do this instead of the virtual environment and pip steps (steps 3 and 4) above:

    conda env create --prefix oe_find_structure_conda-env --file ./environment.yml    # Create virtual environment with conda and install all dependencies
    conda activate ./oe_find_structure_conda-env    # Activate the virtual environment
  5. Run the program:

    python oe_find_structure/find_structure.py
    • Answer the prompts for:
      • the MySQL root password (the password is not shown on screen as you type)
      • the name of the database you want to update (entered twice to confirm)
      • the URL of your Open Enventory server (including 'http' or 'https', with no trailing '/')

    You can enable debug mode (more error printing during the structure search) by adding '-d':

    python oe_find_structure/find_structure.py -d    # Enable debug mode
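The prompts and the '-d' flag described above could be handled along the following lines. This is an illustrative sketch, not the program's actual code: the function names are hypothetical, and only the '-d' flag, the double entry of the database name and the URL rules (http/https, no trailing '/') come from this README.

```python
import argparse
import logging
from urllib.parse import urlparse


def parse_args(argv=None):
    """Parse the command line; '-d' switches on debug mode."""
    parser = argparse.ArgumentParser(description="Find missing structures")
    parser.add_argument("-d", dest="debug", action="store_true",
                        help="print more errors during the structure search")
    return parser.parse_args(argv)


def configure_logging(debug):
    """Pick a log level from the debug flag and return it."""
    level = logging.DEBUG if debug else logging.INFO
    logging.basicConfig(level=level)
    return level


def ask_database_name(ask=input):
    """Ask for the database name twice and require both answers to match."""
    first = ask("Database name: ")
    second = ask("Database name (again): ")
    if first != second:
        raise ValueError("database names do not match")
    return first


def check_server_url(url):
    """Return the URL without a trailing '/', or raise ValueError."""
    url = url.strip().rstrip("/")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("URL must start with http:// or https://")
    return url
```

Passing the `ask` callable as a parameter keeps the confirmation logic easy to exercise without a terminal. For the password prompt, the standard library's `getpass.getpass` reads input without echoing it.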
