Remarkable diversity of alkaloid scaffolds in Piper fimbriulatum


This repository contains all the scripts needed to reproduce the data analysis and results of the manuscript "Remarkable diversity of alkaloid scaffolds in Piper fimbriulatum" (https://doi.org/10.1101/2024.12.10.627739).

Requirements

Installation and setup

  1. Install mzmine and SIRIUS by following the instructions in their online documentation (see mzmine and SIRIUS docs).

  2. Clone this GitHub repository by running the following command in your terminal:

git clone https://github.com/pluskal-lab/PiperFIM.git

  3. Create a new conda environment and install the packages and dependencies listed in requirements.txt:

conda create -y --name piperfim
conda activate piperfim
conda install --file requirements.txt -y

Alternatively, you can run the activate.sh script:

source activate.sh

  4. Download the data and results folders from Zenodo into the main repository directory.

Note

Paths and names of all input and output files are listed in the config/config.yaml file and can be changed directly from there.

Usage

LC-MS data analysis

Feature detection with mzmine can be reproduced using the provided batch file (mzmine_featdetect.mzbatch in the scripts folder) as described in Heuckeroth et al. 2024. Feature-based molecular networking (FBMN) on the GNPS2 platform and in silico chemical structure and compound class predictions with the SIRIUS software can be reproduced as described in the original publication.

The 01_lcms_dataprep.py script integrates the output files from these software tools to facilitate downstream data analysis:

python scripts/01_lcms_dataprep.py

This produces two output files: ftable_clean.csv (an mzmine-like feature table) and ntable_clean.csv (a GNPS2-like node table). The first can be used for statistical analysis, while the second can be imported into Cytoscape for enhanced exploration of the FBMN results.
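To illustrate what integrating outputs from several tools can look like, the sketch below joins per-feature annotations onto a feature table through a shared feature ID. It is only an illustration and not the logic of 01_lcms_dataprep.py: the column names (id, mz, rt, compound_class) and the helper merge_by_feature_id are hypothetical, and the values are made up.

```python
import csv
import io


def merge_by_feature_id(feature_rows, annotation_rows, key="id"):
    """Left-join annotation columns onto feature rows using a shared key."""
    annotations = {row[key]: row for row in annotation_rows}
    merged = []
    for row in feature_rows:
        extra = annotations.get(row[key], {})
        # Feature-table values win if both tables share a column name.
        merged.append({**extra, **row})
    return merged


# Tiny in-memory stand-ins for real CSV exports (hypothetical columns and values).
features_csv = "id,mz,rt\n1,286.1438,5.21\n2,342.1700,7.80\n"
classes_csv = "id,compound_class\n1,alkaloid\n"

features = list(csv.DictReader(io.StringIO(features_csv)))
classes = list(csv.DictReader(io.StringIO(classes_csv)))
merged = merge_by_feature_id(features, classes)
```

Features without a matching annotation are kept and simply lack the extra columns, which mirrors a left join.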

SPARQL query

The 02_run_sparql_queries.py script runs the SPARQL queries stored in the scripts/sparql_queries folder, cleans the results (e.g., removes duplicates), and saves the output in the data/wikidata folder. Based on Wikidata, the queries retrieve all natural products that contain a specific substructure (defined by a SMILES string), together with the plant genera each compound was isolated from and the supporting literature references.

python scripts/02_run_sparql_queries.py
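The query-then-deduplicate pattern described above can be sketched with the Python standard library alone. This is a minimal illustration, not the code in 02_run_sparql_queries.py: the function names, the User-Agent string, and the example rows are assumptions, and the example works on an in-memory payload in the standard SPARQL JSON results layout so that no network call is made.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def flatten_bindings(payload):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def drop_duplicates(rows):
    """Remove exact duplicate rows while preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def run_query(query, endpoint=WIKIDATA_ENDPOINT):
    """Send one SPARQL query over HTTP and return deduplicated rows."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    # The User-Agent value is a placeholder; use one that identifies your tool.
    request = urllib.request.Request(url, headers={"User-Agent": "sparql-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return drop_duplicates(flatten_bindings(payload))


# Offline example with made-up identifiers (not real Wikidata entities).
example_payload = {"results": {"bindings": [
    {"compound": {"type": "uri", "value": "http://example.org/compound/A"},
     "genus": {"type": "literal", "value": "Piper"}},
    {"compound": {"type": "uri", "value": "http://example.org/compound/A"},
     "genus": {"type": "literal", "value": "Piper"}},
    {"compound": {"type": "uri", "value": "http://example.org/compound/A"},
     "genus": {"type": "literal", "value": "Papaver"}},
]}}
rows = drop_duplicates(flatten_bindings(example_payload))
```

Deduplicating on the full row keeps one record per compound and genus pair while still allowing the same compound to appear under several genera.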

The 03_clean_wikidata.py script cleans the raw SPARQL query outputs by filtering out "unwanted substructures" and erroneous reports in Wikidata, as defined in the config/config.yaml file. The cleaned results are saved in the results/phylo_tree/wikidata_clean folder.

python scripts/03_clean_wikidata.py

Map SPARQL results onto the Angiosperm tree of life

The 04_create_itol_annotation.py script creates an annotation file (iTOL_scaffolds.txt) for iTOL that maps the literature reports of each alkaloid scaffold (i.e., benzylisoquinoline, aporphine, piperolactam, piperidine, seco-benzylisoquinoline) onto each genus covered in the angiosperm tree of life published by Zuntini et al. 2024 (global_tree_brlen_pruned_renamed.tre file). The resulting tree can be accessed at the following link.
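For readers unfamiliar with iTOL annotation files, the sketch below writes presence flags per genus in iTOL's DATASET_BINARY text format. It assumes a binary dataset is a suitable representation; the actual iTOL_scaffolds.txt may use a different dataset type. The function name build_itol_binary, the colors, and the toy input are illustrative assumptions, not the output of 04_create_itol_annotation.py.

```python
SCAFFOLDS = [
    "benzylisoquinoline",
    "aporphine",
    "piperolactam",
    "piperidine",
    "seco-benzylisoquinoline",
]


def build_itol_binary(reports, scaffolds=SCAFFOLDS, label="Alkaloid scaffolds"):
    """Return iTOL DATASET_BINARY text from a {genus: set_of_scaffolds} mapping."""
    colors = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e"]  # arbitrary palette
    lines = [
        "DATASET_BINARY",
        "SEPARATOR COMMA",
        f"DATASET_LABEL,{label}",
        "COLOR,#000000",
        "FIELD_SHAPES," + ",".join("2" for _ in scaffolds),  # 2 = circle
        "FIELD_LABELS," + ",".join(scaffolds),
        "FIELD_COLORS," + ",".join(colors[i % len(colors)] for i in range(len(scaffolds))),
        "DATA",
    ]
    for genus in sorted(reports):
        # 1 draws a filled shape; -1 omits the shape for that field.
        flags = ["1" if name in reports[genus] else "-1" for name in scaffolds]
        lines.append(",".join([genus] + flags))
    return "\n".join(lines) + "\n"


# Toy input for illustration only; real reports come from the cleaned Wikidata results.
example_reports = {
    "Piper": {"piperolactam", "piperidine"},
    "Papaver": {"benzylisoquinoline", "aporphine"},
}
annotation = build_itol_binary(example_reports)
```

The genus names in the DATA section must match the leaf labels of the tree uploaded to iTOL, otherwise the annotation is silently skipped for those leaves.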

The 05_create_small_tree.py script creates a smaller version of the global_tree_brlen_pruned_renamed.tre file by keeping only the orders in which at least one alkaloid scaffold was reported. The resulting tree can be accessed at the following link.
