Enable subsampling to lower depth #26

Merged
merged 1 commit into from
Feb 12, 2024
28 changes: 24 additions & 4 deletions .test/config/config.yaml
@@ -24,10 +24,6 @@ exclude_ind: []

excl_pca-admix: []

#==================== Downsampling Configuration ======================#

downsample_cov:

#====================== Analysis Selection ============================#

populations: []
@@ -68,6 +64,29 @@ analyses:
  inbreeding_ngsf-hmm: true
  ibs_matrix: true

#==================== Downsampling Configuration ======================#

subsample_dp: 2

subsample_redo_filts: true

subsample_analyses:
  estimate_ld: true
  ld_decay: true
  pca_pcangsd: true
  admix_ngsadmix: true
  relatedness:
    ngsrelate: true
    ibsrelate_ibs: true
    ibsrelate_sfs: true
  thetas_angsd: true
  heterozygosity_angsd: true
  fst_angsd:
    populations: true
    individuals: true
  inbreeding_ngsf-hmm: true
  ibs_matrix: true

#=========================== Filter Sets ==============================#

filter_beds:
@@ -109,6 +128,7 @@ params:
extra_beagle: ""
snp_pval: "1e-6"
min_maf: 0.05
mindepthind_heterozygosity: 3
ngsld:
max_kb_dist_est-ld: 200
max_kb_dist_decay: 100
6 changes: 6 additions & 0 deletions README.md
@@ -69,6 +69,12 @@ Additionally, several data filtering options are available:
- Removal of regions with low mappability for fragments of a specified size
- Removal of regions with extreme high or low depth
- Removal of regions with a certain amount of missing data
- Multiple filter sets from user-provided BED files that can be intersected
  with other enabled filters (for instance, to analyze neutral sites and
  genic regions separately)

All of the above analyses can also be performed with each sample subsampled
to a uniform depth, to account for differences in depth between samples.

## Getting Started

24 changes: 24 additions & 0 deletions config/README.md
@@ -282,6 +282,30 @@ settings for each analysis are set in the next section.
- `ibs_matrix:` Estimate pairwise identity by state distance between all
samples using ANGSD. (`true`/`false`)

#### Downsampling Section

As this workflow is aimed at low-coverage samples, it's likely that sample
depth will vary considerably. For this reason, it may be useful to subsample
all your samples to a similar depth and check whether variation in depth is
influencing results. To do this, set `subsample_dp` to an integer depth to
subsample all samples down to, then enable the specific analyses to run on
the subsampled data.

- `subsample_dp:` A mean depth to subsample your reads to. Subsampling is done
  per sample, drawing from all of that sample's reads. If a sample's depth is
  already at or below this number, it is used as is in the analysis. (INT)

- `subsample_redo_filts:` Make a separate filtered sites file, using the
  subsampled BAMs to calculate depth-based filters. If left disabled, the
  depth filters are determined from the full-coverage files.
  (`true`/`false`)

- `subsample_analyses:` Individually enable analyses to be performed with the
  subsampled data. These are the same analyses as in the `analyses` section
  above. Enabling one here runs it only for the subsampled data; to run it
  for the full data too, also enable it in the `analyses` section.
  (`true`/`false`)
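
The behaviour described by these options can be sketched in Python. This is
only an illustration, not the workflow's own code: the helper names
`subsample_fraction` and `datasets_to_run`, the example depths, and the
assumption that mean depth scales linearly with the number of reads kept are
all hypothetical. The sketch also assumes that subsampled analyses only run
when `subsample_dp` is set, and it handles only top-level `true`/`false`
keys, not nested ones such as `relatedness` or `fst_angsd`.

```python
from typing import Optional


def subsample_fraction(mean_depth: float, target_depth: int) -> Optional[float]:
    """Return the fraction of reads to keep, or None to use the sample as is.

    Hypothetical helper: assumes mean depth scales linearly with read count.
    """
    if target_depth <= 0:
        raise ValueError("target_depth must be a positive integer")
    if mean_depth <= target_depth:
        # Already at or below the target depth: no subsampling needed.
        return None
    return target_depth / mean_depth


def datasets_to_run(config: dict, analysis: str) -> list:
    """List the datasets ("full", "subsampled") a top-level analysis runs on."""
    datasets = []
    # `is True` skips nested sections (dicts) and unset (None) values.
    if (config.get("analyses") or {}).get(analysis) is True:
        datasets.append("full")
    # Assumption: subsampled analyses require a target depth to be set.
    subsampled = (config.get("subsample_analyses") or {}).get(analysis) is True
    if config.get("subsample_dp") and subsampled:
        datasets.append("subsampled")
    return datasets


# Illustrative depths, not real data.
example_depths = {"sampleA": 8.0, "sampleB": 2.0, "sampleC": 1.2}
fractions = {s: subsample_fraction(d, 2) for s, d in example_depths.items()}

# Illustrative config mirroring the keys documented above.
example_config = {
    "analyses": {"pca_pcangsd": True, "thetas_angsd": False},
    "subsample_dp": 2,
    "subsample_analyses": {"pca_pcangsd": True, "thetas_angsd": True},
}
```

With a target depth of 2, the sample at depth 8 would keep a quarter of its
reads, while the samples at depth 2 and 1.2 would be used as is. In the
example config, `pca_pcangsd` would run on both the full and the subsampled
data, whereas `thetas_angsd` would run on the subsampled data only.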

#### Filter Sets

By default, this workflow will perform all analyses requested in the above
30 changes: 24 additions & 6 deletions config/config.yaml
@@ -24,12 +24,6 @@ exclude_ind: []

excl_pca-admix: []

#==================== Downsampling Configuration ======================#

# untested, not recommended for now

downsample_cov:

#====================== Analysis Selection ============================#

populations: []
@@ -70,6 +64,29 @@ analyses:
  inbreeding_ngsf-hmm: false
  ibs_matrix: false

#==================== Downsampling Configuration ======================#

subsample_dp:

subsample_redo_filts:

subsample_analyses:
  estimate_ld: false
  ld_decay: false
  pca_pcangsd: false
  admix_ngsadmix: false
  relatedness:
    ngsrelate: false
    ibsrelate_ibs: false
    ibsrelate_sfs: false
  thetas_angsd: false
  heterozygosity_angsd: false
  fst_angsd:
    populations: false
    individuals: false
  inbreeding_ngsf-hmm: false
  ibs_matrix: false

#=========================== Filter Sets ==============================#

filter_beds:
@@ -109,6 +126,7 @@ params:
extra_beagle: ""
snp_pval: "1e-6"
min_maf: 0.05
mindepthind_heterozygosity: 3
ngsld:
max_kb_dist_est-ld: 4000
max_kb_dist_decay: 100