Set up vignette and wrote examples for DGE_analysis
Yunnnning committed Feb 6, 2025
1 parent 0941f88 commit 9888054
Showing 5 changed files with 57 additions and 80 deletions.
Binary file modified .DS_Store
Binary file not shown.
2 changes: 2 additions & 0 deletions .gitignore
@@ -52,3 +52,5 @@ inst/markdown/test.Rmd
codecov.yml
!.github/codecov.yml
docs/
/doc/
/Meta/
Binary file added vignettes/data/sce.qs
Binary file not shown.
68 changes: 55 additions & 13 deletions vignettes/getting_started.Rmd
@@ -1,5 +1,7 @@
---
title: "Getting Started"
title: "Get Started"
author: "<h4>Authors: <i>Salman Fawad, Yunning Yuan, Alan Murphy, Nathan Skene</i></h4>"
date: "<h4>Vignette updated: <i>`r format( Sys.Date(), '%b-%d-%Y')`</i></h4>"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{getting_started}
@@ -13,21 +15,61 @@ knitr::opts_chunk$set(
comment = "#>"
)
```
***
# Introduction

The *poweranalysis* R package is designed to run robust power analysis for differential gene expression in scRNA-seq studies, addressing the challenge of low statistical power due to small sample sizes. It provides tools to estimate the optimal number of samples and cells needed to achieve reliable power levels.
<br>
<br>
Using *poweranalysis* involves four steps: <br>
**1. Differential Expression (DE) Analysis:** Import your SCE object and perform DE analysis using a pseudobulking approach, allowing for the robust identification of differentially expressed genes (DEGs) across conditions or groups. <br>
**2. Correlation Analysis:** Assess the consistency of DEG effect sizes across random permutations, subsets of your data, and different studies by computing correlation matrices. <br>
**3. Downsampling and Power Analysis:** Test the robustness of your findings by downsampling both individuals and cells, then analyze the effect on DEG detection power. <br>
**4. Visualisation:** Generate comprehensive power plots and visual summaries of the power analysis results, including DEG detection rates, false discovery rates, and mean LogFC correlation between down-sampled subsets.
<br>
```{r, echo=FALSE, include=FALSE}
pkg <- read.dcf("../DESCRIPTION", fields = "Package")[1]
library(pkg, character.only = TRUE)
```

# Setup
The *poweranalysis* R package is designed to run robust power analysis for differential gene expression in scRNA-seq studies and provides tools to estimate the optimal number of samples and cells needed to achieve reliable power levels.

## Setup
```{r setup}
library(poweranalysis)
```

## Differential Gene Expression (DGE) Analysis
Import an SCE object and perform differential expression analysis using a pseudobulking approach, enabling the robust identification of differentially expressed genes (DEGs) across conditions or groups from single-cell data. <br>
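
The pseudobulking idea mentioned above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy count matrix, the `sample_id` vector, and the use of `rowsum()` are assumptions chosen only to show how per-cell counts collapse into one column per sample.

```r
# Toy count matrix: 3 genes x 6 cells (values are made up for illustration).
# matrix() fills column by column, so each line below is one cell.
counts <- matrix(
  c(1, 0, 2,
    0, 3, 1,
    4, 1, 0,
    2, 2, 2,
    0, 0, 5,
    1, 1, 1),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6))
)

# Hypothetical per-cell metadata: the sample each cell came from.
sample_id <- c("s1", "s1", "s2", "s2", "s3", "s3")

# rowsum() sums rows by group, so transpose to put cells in rows,
# aggregate by sample, then transpose back to genes x samples.
pseudobulk <- t(rowsum(t(counts), group = sample_id))

pseudobulk
```

In a real analysis the aggregation would typically be done separately within each cell type, so that every cell type gets its own genes-by-samples matrix.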
**Example Usage**: <br>
To run the DGE analysis, first load your own SingleCellExperiment (SCE) object.

```{r, message=FALSE, warning=FALSE}
# Load your SCE object (replace with actual file path)
library(qs)
library(SingleCellExperiment)
SCE <- qs::qread("./data/sce.qs")
```
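
Before running the analysis, it can help to confirm that the cell metadata contains the columns used in the examples in this vignette. The sketch below works on a plain data frame standing in for the cell metadata (which could be obtained with something like `as.data.frame(colData(SCE))`); the toy values, including the `"Control"` label, are assumptions for illustration only.

```r
# Stand-in for the per-cell metadata of an SCE object; all values are made up.
cell_meta <- data.frame(
  sample_id = c("s1", "s1", "s2", "s2"),
  cluster_celltype = c("Micro", "Astro", "Micro", "Astro"),
  sex = c("M", "M", "F", "F"),
  pathological_diagnosis = c("AD", "AD", "Control", "Control")
)

# Columns referenced by the examples in this vignette.
required <- c("sample_id", "cluster_celltype", "sex", "pathological_diagnosis")
missing_cols <- setdiff(required, colnames(cell_meta))
if (length(missing_cols) > 0) {
  stop("Missing metadata columns: ", paste(missing_cols, collapse = ", "))
}

# A sample-level variable such as sex should take a single value per sample;
# count the distinct values observed within each sample.
n_sex_per_sample <- tapply(
  cell_meta$sex,
  cell_meta$sample_id,
  function(x) length(unique(x))
)
stopifnot(all(n_sex_per_sample == 1))
```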

To use the `DGE_analysis` function, pass the SCE object together with the formula for the comparison, the pseudobulk ID (`pseudobulk_ID`), and the cell type ID (`celltype_ID`). The comparison itself is defined by two key arguments:

· **`design`**: A formula that defines the variables included in the model for differential expression analysis. This determines how gene expression is compared across different groups.

· **`coef`**: A character string that specifies which group in the `design` formula you want to investigate for differential expression. <br>

For example, to validate the differential expression (DE) analysis approach, you can run a comparison between sexes using `design = ~ sex`. This assesses how gene expression differs between the male and female groups.

```{r}
# Run the DGE_analysis function for a sex comparison
DGE_analysis.sex <- DGE_analysis(
  SCE,
  design = ~ sex,
  pseudobulk_ID = "sample_id",
  celltype_ID = "cluster_celltype",
  coef = "M"
)
```
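
To build intuition for what a formula such as `~ sex` encodes, the base R sketch below expands it into a design matrix with `model.matrix()`. How `DGE_analysis` maps `coef = "M"` onto a model coefficient internally is not shown here, so treat the correspondence between `"M"` and the `sexM` column as an assumption; the sample table is made up.

```r
# Hypothetical sample-level metadata (one row per pseudobulk sample).
sample_meta <- data.frame(
  sample_id = paste0("s", 1:4),
  sex = factor(c("F", "M", "F", "M"))
)

# Expand the formula into a design matrix. Factor levels are sorted
# alphabetically, so "F" is the reference level and "M" gets its own column.
design_matrix <- model.matrix(~ sex, data = sample_meta)

colnames(design_matrix)
#> [1] "(Intercept)" "sexM"
```

The `sexM` column is 1 for male samples and 0 for female samples, so its coefficient represents the male versus female difference.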

To compare disease and control conditions, specify the disease status variable in the `design` formula and the disease group of interest in `coef`.

```{r}
# Run the DGE_analysis function for a disease vs. control comparison
DGE_analysis.AD_sex <- DGE_analysis(
  SCE,
  design = ~ pathological_diagnosis,
  pseudobulk_ID = "sample_id",
  celltype_ID = "cluster_celltype",
  coef = "AD"
)
```
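
In a standard R model, which group a coefficient refers to depends on the reference level of the factor. The sketch below shows, in base R only, how an alphabetically sorted factor would make `"AD"` the reference, and how `relevel()` switches the reference to the control group. The `"Control"` label is hypothetical, and whether `DGE_analysis` handles reference levels itself is not stated in this vignette.

```r
# Hypothetical diagnosis labels for four samples.
diagnosis <- factor(c("AD", "Control", "AD", "Control"))

# Default levels are alphabetical, which makes "AD" the reference level.
levels(diagnosis)
#> [1] "AD"      "Control"

# Make the control group the reference so the coefficient describes AD.
diagnosis <- relevel(diagnosis, ref = "Control")

colnames(model.matrix(~ diagnosis))
#> [1] "(Intercept)" "diagnosisAD"
```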

67 changes: 0 additions & 67 deletions vignettes/poweranalysis.Rmd

This file was deleted.
