Remarkable diversity of alkaloid scaffolds in Piper fimbriulatum

This repository contains all the scripts needed to reproduce the data analysis and results of the manuscript "Remarkable diversity of alkaloid scaffolds in Piper fimbriulatum" (https://doi.org/10.1101/2024.12.10.627739).

Requirements

mzmine software (v4.2.0)
SIRIUS software (v5.8.5)
GNPS2 online platform
Miniconda/Anaconda
Python 3.11.0 or higher

Installation and setup

To install mzmine and SIRIUS, follow the instructions provided in the corresponding online documentation (see mzmine and SIRIUS docs).
Clone this GitHub repository by running the following command in your terminal:

git clone https://github.com/pluskal-lab/PiperFIM.git

Create a new conda environment and install packages and dependencies listed in requirements.txt:

conda create -y --name piperfim
conda activate piperfim
conda install --file requirements.txt -y

Alternatively, you can run the activate.sh script:

source activate.sh

Download the data and results folder from Zenodo inside the main repository directory.

Note

Paths and names of all input and output files are listed in the config/config.yaml file and can be changed directly from there.

Usage

LC-MS data analysis

Feature detection with mzmine can be reproduced using the provided batch file (mzmine_featdetect.mzbatch in the scripts folder) as described in Heuckeroth et al. 2024. Feature-based molecular networking (FBMN) on the GNPS2 platform and in silico chemical structure and compound class predictions with the SIRIUS software can be reproduced as described in the original publication.

The 01_lcms_dataprep.py script integrates the output files from these software tools to facilitate downstream data analysis:

python scripts/01_lcms_dataprep.py

This will produce two output files: ftable_clean.csv (mzmine-like feature table) and ntable_clean.csv (GNPS2-like node table). The first can be used for statistical analysis, while the second can be imported into Cytoscape for enhanced exploration of FBMN results.
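
As a rough illustration of how the two tables can be combined for inspection, the sketch below reads both CSV files with the standard library and joins them on a shared feature identifier. The key column name (`id`) and the helper names are assumptions made for this example, not part of the repository; check the headers of your own output files before using it.

```python
import csv
from pathlib import Path


def load_table(path):
    """Read a CSV file into a list of row dictionaries."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def join_tables(features, nodes, feature_key="id", node_key="id"):
    """Attach node-table columns to each feature row with the same ID.

    Both key names are assumptions; adapt them to the real column headers.
    """
    nodes_by_id = {row[node_key]: row for row in nodes}
    joined = []
    for row in features:
        match = nodes_by_id.get(row[feature_key], {})
        # Feature-table values win if a column name exists in both tables.
        joined.append({**match, **row})
    return joined
```

A call such as `join_tables(load_table(ftable_path), load_table(ntable_path))` would then return one dictionary per feature, where both paths are whatever locations are set in config/config.yaml.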

SPARQL query

The 02_run_sparql_queries.py script runs the SPARQL queries stored in the scripts/sparql_queries folder, cleans the results (e.g., removes duplicates), and saves the output in the data/wikidata folder. The queries are designed to retrieve, from Wikidata, all natural products that contain a specific substructure (defined by a SMILES) together with the plant genera each compound was isolated from. Literature references are also retrieved.

python scripts/02_run_sparql_queries.py
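
For readers unfamiliar with this kind of step, here is a minimal sketch of running a query and removing duplicate rows. It assumes the public Wikidata endpoint (https://query.wikidata.org/sparql) and the standard SPARQL JSON result layout. The function names and the User-Agent string are placeholders invented for this example, and the repository script may work differently.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def run_query(query, endpoint=WIKIDATA_ENDPOINT):
    """Send a SPARQL query and return the parsed JSON response."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Placeholder value; replace with a string identifying your client.
            "User-Agent": "piperfim-example/0.1",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def flatten_bindings(result):
    """Turn SPARQL JSON bindings into plain {variable: value} rows."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


def drop_duplicates(rows):
    """Remove exact duplicate rows while keeping the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Given the text of one query file from scripts/sparql_queries, `drop_duplicates(flatten_bindings(run_query(query_text)))` would yield a list of unique result rows ready to be written to disk.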

The 03_clean_wikidata.py script cleans the raw SPARQL query outputs by filtering out "unwanted substructures" and erroneous reports in Wikidata, as defined in the config.yaml file. Cleaned results are saved in the results/phylo_tree/wikidata_clean folder.

python scripts/03_clean_wikidata.py

Map SPARQL results onto the Angiosperm tree of life

The 04_create_itol_annotation.py script creates an annotation file (iTOL_scaffolds.txt) for use in iTOL. The file maps literature reports of each alkaloid scaffold (i.e., benzylisoquinoline, aporphine, piperolactam, piperidine, seco-benzylisoquinoline) onto each genus covered in the angiosperm tree of life published by Zuntini et al. 2024 (global_tree_brlen_pruned_renamed.tre file). The resulting tree can be accessed at the following link.
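
For orientation, an iTOL annotation file is a plain-text dataset made of a short header followed by one line per tree leaf. The sketch below builds a presence/absence dataset with one column per scaffold. The choice of the DATASET_BINARY template, the colors, the dataset label, and the function name are all assumptions for illustration and may differ from what 04_create_itol_annotation.py actually writes.

```python
SCAFFOLDS = [
    "benzylisoquinoline",
    "aporphine",
    "piperolactam",
    "piperidine",
    "seco-benzylisoquinoline",
]

# Placeholder palette; colors are reused if there are more scaffolds than entries.
PALETTE = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e"]


def build_itol_binary(reports, scaffolds=SCAFFOLDS, label="Alkaloid scaffolds"):
    """Return iTOL DATASET_BINARY text from a {genus: set of scaffolds} mapping."""
    colors = [PALETTE[i % len(PALETTE)] for i in range(len(scaffolds))]
    lines = [
        "DATASET_BINARY",
        "SEPARATOR COMMA",
        f"DATASET_LABEL,{label}",
        "COLOR,#000000",
        "FIELD_SHAPES," + ",".join("1" for _ in scaffolds),
        "FIELD_LABELS," + ",".join(scaffolds),
        "FIELD_COLORS," + ",".join(colors),
        "DATA",
    ]
    for genus in sorted(reports):
        # 1 draws a filled shape (reported), 0 an empty one (not reported).
        flags = ["1" if s in reports[genus] else "0" for s in scaffolds]
        lines.append(",".join([genus, *flags]))
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file and dragged onto a tree in iTOL. Leaf names in the DATA section must match the tip labels of the tree exactly, and with a comma separator they must not contain commas themselves.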

The 05_create_small_tree.py script creates a smaller version of the global_tree_brlen_pruned_renamed.tre file by keeping only the orders in which at least one alkaloid scaffold was reported. The resulting tree can be accessed at the following link.
