contrastiveVI reproducibility repository

This repository contains code for reproducing results in the contrastiveVI paper.

contrastiveVI is a generative model designed to isolate factors of variation specific to a group of "target" cells (e.g. from specimens with a given disease) from those shared with a group of "background" cells (e.g. from healthy specimens). contrastiveVI is implemented in scvi-tools.

A standalone contrastiveVI package, along with Google Colab notebooks demonstrating its use on various scRNA-seq datasets, can be found at https://github.com/suinleelab/contrastiveVI.

What you can do with contrastiveVI

If you have a dataset with cells in a background condition (e.g. from healthy controls) and a target condition (e.g. from diseased patients), you can train contrastiveVI to isolate latent factors of variation specific to the target cells from those shared with a background into separate latent spaces.
Run clustering algorithms on the target-specific latent space to discover sub-groups of target cells
Perform differential expression testing for discovered sub-groups of target cells using a procedure similar to that of scVI .

System requirements

This software was designed and tested on a machine running CentOS 7.8.2003, with Python 3.9, PyTorch 1.11, and CUDA 11.2. For a full list of all external Python package dependences used in this project, see the Conda environment file environment.yml and the package setup file setup.cfg.

When available, this software leverages graphical processing units (GPUs) to accelerate neural network training and evaluation. In our experiments we found that models trained with the aid of a GPU converged in less than 30 minutes (and usually much sooner), with the exact time depending on the size of a given dataset. Systems lacking suitable GPUs may take an extremely long time to train/evaluate models. Our experiments were conducted using an NVIDIA RTX 2080 TI GPU; other GPUs should also work as long as they have sufficient memory (~2GB).

Reproducibility guide

Installing the software

Git clone this repository.
cd contrastive-vi.

Create and activate the specified conda environment by running

conda env create -f environment.yml
conda activate contrastive-vi-env

Install the local constrative_vi package and necessary dependencies by running pip install -e ".[dev]".

This installation procedure should take no more than 30 minutes. Beyond downloading this repository, no addition time is required for its installation.

Modify data and result path

Modify the paths for storing data and model results in scripts/constants.py.

Modify DEFAULT_DATA_PATH for the data storage path.
Modify DEFAULT_RESULTS_PATH for the model result storage path.

Download datasets

Run scripts/preprocess_data.py to download and preprocess the single-cell datasets. For example, to download the data from Zheng et al. 2017

python scripts/preprocess_data.py zheng_2017

Run experiments

Run scripts/run_experiment.py to train models and store model outputs. For example, to train contrastiveVI models with the data from Zheng et al. 2017,

python scripts/run_experiment.py zheng_2017 contrastiveVI

Evaluate model performance

Once the experiments for the baselines (CPLVM, CGLVM, and scVI) and contrastiveVI have all been completed. Run

python scripts/evaluate_performance.py

to generate performance_summary.csv in the result directory.

Note: In scripts/evaluate_performance.py, the variable datasets can be modified for evaluating model performance with particular datasets of interest.

Plot results

Figures can be plotted using the corresponding notebooks in notebooks/figures.

References

If you find contrastiveVI useful for your work, please consider citing our preprent:

@article{contrastiveVI,
  title={Isolating salient variations of interest in single-cell transcriptomic data with contrastiveVI},
  author={Weinberger, Ethan and Lin, Chris and Lee, Su-In},
  journal={bioRxiv},
  year={2021},
  publisher={Cold Spring Harbor Laboratory}
}

Name		Name	Last commit message	Last commit date
Latest commit History 240 Commits
contrastive_vi		contrastive_vi
notebooks		notebooks
scripts		scripts
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
LICENSE		LICENSE
README.md		README.md
conftest.py		conftest.py
environment.yml		environment.yml
pyproject.toml		pyproject.toml
setup.cfg		setup.cfg
setup.py		setup.py
sketch.png		sketch.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

contrastiveVI reproducibility repository

What you can do with contrastiveVI

System requirements

Reproducibility guide

Installing the software

Modify data and result path

Download datasets

Run experiments

Evaluate model performance

Plot results

References

About

Releases

Packages

Contributors 2

Languages

License

suinleelab/contrastiveVI-reproducibility

Folders and files

Latest commit

History

Repository files navigation

contrastiveVI reproducibility repository

What you can do with contrastiveVI

System requirements

Reproducibility guide

Installing the software

Modify data and result path

Download datasets

Run experiments

Evaluate model performance

Plot results

References

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages