Prepare version 0.3.0 #22

zjnolen · 2024-01-12T15:50:08Z

No description provided.

Merge overlapping reads in fastp, and map merged reads for both modern and historical. Unmerged reads are mapped for modern as well, and optionally for historical. Downstream overlap clipping switched from fgbio to bamutil (as this accounts for the quality when deciding which read to clip).

This will allow easily getting a subsampled bam for a different depth than listed in the config (mainly for extensions to the main workflow, say having some VCF analyses you want to use a higher subsampled depth with than in ANGSD)

zjnolen and others added 30 commits November 14, 2023 17:01

Change overlap clipping method

33b2522

Move from fgbio to bamutil to clip overlap, as this tool clips the lower quality bases. Also merge overlapping reads even for modern samples.

Snakefmt

65f14a1

Allow custom options to be added separately to saf and beagle file creation

b35864d

Update depth filtering options

2372564

Allow for filtering on depth within depth classes, whole dataset depth, or both. Add ability to choose between a median or percentile based depth cutoff.
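To illustrate the two cutoff styles described above, here is a minimal sketch, not the workflow's actual implementation. The function names, the `method` values, and the linear-interpolation percentile are all assumptions made for the example; only the idea of choosing between median multiples and percentiles comes from the commit message.

```python
import math
import statistics


def percentile(values, q):
    """Linear-interpolation percentile (q in 0-100) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    low, high = math.floor(pos), math.ceil(pos)
    frac = pos - low
    return ordered[low] * (1 - frac) + ordered[high] * frac


def depth_bounds(depths, method, lower, upper):
    """Return (min_depth, max_depth) cutoffs for per-site depths.

    method="median":     lower/upper are multiples of the median depth.
    method="percentile": lower/upper are percentiles (0-100).
    Both method names are hypothetical, chosen for this sketch.
    """
    if method == "median":
        med = statistics.median(depths)
        return (lower * med, upper * med)
    if method == "percentile":
        return (percentile(depths, lower), percentile(depths, upper))
    raise ValueError(f"unknown method: {method!r}")


# Toy per-site depths, for illustration only.
site_depths = [2, 4, 6, 8, 10]
median_bounds = depth_bounds(site_depths, "median", 0.5, 1.5)
percentile_bounds = depth_bounds(site_depths, "percentile", 25, 100)
```

The same function could be applied once per depth class and once to the whole dataset, keeping only sites that pass every enabled filter.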

Test trying to get a better automatic bound for the depth histograms

b07230b

Fix missing comma

c6fc730

Fix how xlim on depth hist is calculated

feca5fb

Improve how bins appear in histogram

1031be4

Merge pull request #18 from zjnolen/update-depth-filt

a02544b

Update depth filters. Depth filters now produce histograms with limits. Additionally, the depth filter can now be set using multiples of the median (as in 0.2) or alternatively with percentiles.

Convert mappability filter to pileup mappability

a39ee62

Merge pull request #19 from zjnolen/pileup-mappability

296990b

Convert mappability filter to pileup mappability

Switch default aligner for historical samples from bwa mem to aln

4895ebe

This still allows for using mem, either can be selected.

Fix missing flag in bwa mem

9b71530

Try to improve resource allocations for bwa aln

6abb8b1

Runtime seems to scale linearly with threads for bwa aln, so max it out on threads. bwa samse runs quickly, doesn't use much memory, and is single-threaded.

Set up so workflow can start with bam files and only perform popgen analyses

fd297e2

Snakefmt

06ffabf

Update readme to reflect starting at BAM is possible

612bc89

Also mention some of the reasons why

Update environment, see if that helps github actions failing

69e7b11

Update configuration README

378189b

Set dependency of snakemake on pulp <2.8 manually.

676c63d

Make sure all the needed tools are in environment for testing

0773fc6

Add test bam file for bam input

b6f5ce7

Fixing path in user provided qualimap bam output

8fd9edd

Merge pull request #21 from zjnolen/bam-start

5969361

Set up so workflow can start with bam files and only perform popgen analyses

Merge branch 'develop' into bwa-aln

8e5da59

Merge pull request #20 from zjnolen/bwa-aln

5437232

Switch default aligner for historical samples from bwa mem to aln. bwa mem is still default for modern samples and can be switched to for historical if desired.

Fix typo in template config

ebda1df

Allow user input repeat bed files

508e44a

Avoids having to run repeatmodeler/masker if it has already been run elsewhere for a genome and a bed/gff file can be provided.

Fix: Prevent damageprofiler from running when user bam is provided

d20d3ee

If desired, damageprofiler on user provided bams can be added sometime, but for now it's only used when starting from fastq. This assumes that user provided bams have already been processed and had damage assessed.

zjnolen and others added 29 commits March 29, 2024 18:06

Merge pull request #38 from zjnolen/angsd-minind

b2e0a94

Allow filtering with -minInd in ANGSD

Merge branch 'develop' into rmtrans

ac76eeb

Merge pull request #39 from zjnolen/rmtrans

b136745

Add in ability to remove transitions easily from config

Fix table separator in kinship calcs

4b31c1f

Allow depth to be subsampled on different levels of filtering

4e8e29a

Allow removal of individuals from dataset when subsampling

910fc30

Fix so subsampling can happen to any depth

3f26c02

Fix so subsample proportion isn't locked to config file depth

8dce26a

Now infers target depth from dp wildcard, allowing mixing of multiple depths in an extended workflow
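As a rough sketch of what inferring the proportion from the wildcard could look like: the fraction of reads to keep is the target depth divided by the sample's current mean depth. The function name, the string form of the `dp` value, and the choice to return `None` when the sample is already below the target are assumptions for illustration, not the workflow's actual code.

```python
def subsample_fraction(mean_depth, dp):
    """Fraction of reads to keep so a BAM at mean_depth lands near depth dp.

    dp is the wildcard value as a string (e.g. "4" or "2.5"); this format
    is a hypothetical choice for the sketch. Returns None when the sample
    does not have enough depth to be subsampled down to the target.
    """
    target_depth = float(dp)
    if mean_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")
    if target_depth >= mean_depth:
        return None
    return target_depth / mean_depth


# Because the target comes from the wildcard rather than the config,
# several targets can be computed for the same sample.
fractions = {dp: subsample_fraction(20.0, dp) for dp in ("2", "5", "10")}
```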

Test making the seed for subsampling always randomly generated

4e7d58f

Make it so subsampling seed is customizable

43a0cee

By default $RANDOM is suggested to choose a random seed each time, but '0' is also commonly used.
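A hedged sketch of how a configurable seed might be combined with a subsampling fraction. It assumes subsampling is done with `samtools view -s`, which takes the seed as the integer part and the fraction as the decimal part of one number; whether this workflow uses that form is an assumption, as are the function names. The random fallback mirrors the 0 to 32767 range of bash's `$RANDOM`.

```python
import random


def resolve_seed(seed=None):
    """Use the given seed, or draw one in bash's $RANDOM range (0-32767)."""
    if seed is None:
        return random.randint(0, 32767)
    return int(seed)


def samtools_subsample_arg(seed, fraction, digits=4):
    """Build the SEED.FRACTION value for `samtools view -s` (assumed tool)."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    frac_str = f"{fraction:.{digits}f}"
    # Guard against fractions that round to 0 or 1 at this precision.
    if not frac_str.startswith("0.") or float(frac_str) == 0:
        raise ValueError("fraction cannot be represented at this precision")
    # Drop the leading "0" and attach the decimals to the integer seed.
    return f"{seed}{frac_str[1:]}"


fixed = samtools_subsample_arg(resolve_seed(0), 0.25)
```

A fixed seed such as `0` makes the subsample reproducible across runs, while a freshly drawn seed gives a different subsample each time.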

Try to fix github actions errors

e45bd43

Add mamba back to requirements

13802a9

Undo actions change

83efc14

Fix snakemake 7 and python 3.12 incompat by forcing 3.11

c8396d2

Merge pull request #40 from zjnolen/subsample-drop-samples

d94f102

Allow removal of individuals from dataset when subsampling

Update documentation

1d9d31d

Fix some formatting

6fa8102

Enable super and subscripts in docs

0edb4d7

Add ref bias calculation

15d3faf

Merge pull request #44 from zjnolen/ibs-ref-bias

921b5fd

Add ref bias calculation

Allow multiple target depths for depth subsampling

70b372f

Merge pull request #46 from zjnolen/multiple_subsample_dp

7cbd583

Allow multiple target depths for depth subsampling

Fix IBS using -noTrans instead of -rmTrans which is correct for doMajorMinor

81a61cb
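One way to avoid mixing up the two flags is to look the flag up by analysis type instead of hard-coding it in each rule. This is only a sketch: the helper and table names are hypothetical, the `-rmTrans` pairing with doMajorMinor comes from the commit message, and the `-noTrans` pairing with doSaf is an assumption about ANGSD that should be checked against its documentation.

```python
# Flag used to drop transitions, keyed by the ANGSD analysis it belongs to.
# The doSaf entry is an assumption; the doMajorMinor entry is from the commit.
TRANSITION_FLAGS = {
    "doMajorMinor": "-rmTrans",
    "doSaf": "-noTrans",
}


def transition_args(analysis, remove_transitions):
    """Return the command-line arguments controlling transition removal."""
    flag = TRANSITION_FLAGS[analysis]
    return [flag, "1" if remove_transitions else "0"]


ibs_args = transition_args("doMajorMinor", True)
```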

Make sure IBS matrix analysis actually removes transitions if requested as well

d94bfe1

Fix another transition option to be correct for using domajorminor

ff368d0

Fix: Missing pandas dependency for generating popfile

4e93aba

Seems pandas was a dependency for numpy in old versions, but no longer. Needs to be explicitly included now.

I'm dumb and wrote python not pandas in the last commit

dacae83

Make pruning for NGSrelate optional

3c94531

zjnolen merged commit 52a27be into master Sep 18, 2024
5 checks passed
