Prepare version 0.3.0 #22
Merged
Move from fgbio to bamutil to clip overlap, as bamutil clips the lower quality bases. Also merge overlapping reads even for modern samples.
Merge overlapping reads in fastp, and map merged reads for both modern and historical samples. Unmerged reads are mapped for modern samples as well, and optionally for historical samples. Downstream overlap clipping switched from fgbio to bamutil (as bamutil accounts for quality when deciding which read to clip).
Allow for filtering on depth within depth classes, whole dataset depth, or both. Add ability to choose between a median or percentile based depth cutoff.
Update depth filters
Depth filters now produce histograms with limits. Additionally, the depth filter can now be set using multiples of the median (as in 0.2) or alternatively with percentiles.
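A rough sketch of how the two cutoff styles could be computed from a list of per-site depths. This is illustrative only and is not taken from the workflow; the function names `depth_limits` and `percentile`, the `method` argument, and the default bounds are all hypothetical.

```python
from statistics import median


def percentile(sorted_values, q):
    """Linearly interpolated percentile; q is a fraction between 0 and 1."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def depth_limits(depths, method="median", lower=0.5, upper=1.5):
    """Return (min_depth, max_depth) cutoffs (hypothetical helper).

    method="median":     lower/upper are multiples of the median depth.
    method="percentile": lower/upper are percentile fractions (0 to 1).
    """
    values = sorted(depths)
    if not values:
        raise ValueError("no depth values supplied")
    if method == "median":
        med = median(values)
        return lower * med, upper * med
    if method == "percentile":
        return percentile(values, lower), percentile(values, upper)
    raise ValueError(f"unknown method: {method}")
```

For example, `depth_limits(depths, "median", 0.5, 1.5)` keeps sites between half and one and a half times the median, while `depth_limits(depths, "percentile", 0.025, 0.975)` trims the extreme tails regardless of how skewed the distribution is.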
Convert mappability filter to pileup mappability
This still allows for using mem, either can be selected.
Runtime seems to scale linearly with threads for bwa aln, so give it the maximum number of threads. bwa samse runs quickly, doesn't use much memory, and is single threaded.
Also mention some of the reasons why
Set up so workflow can start with bam files and only perform popgen analyses
Switch default aligner for historical samples from bwa mem to aln. bwa mem is still default for modern samples and can be switched to for historical if desired.
Avoids having to run repeatmodeler/masker if it has already been run elsewhere for a genome and a bed/gff file can be provided.
If desired, damageprofiler on user-provided bams can be added sometime, but for now it is only used when starting from fastq. This assumes that user-provided bams have already been processed and had damage assessed.
Allow filtering with -minInd in ANGSD
Add in ability to remove transitions easily from config
This makes it easy to get a subsampled bam at a depth other than the one listed in the config (mainly for extensions to the main workflow, say some VCF analyses where you want a higher subsampled depth than the one used in ANGSD).
Now infers target depth from dp wildcard, allowing mixing of multiple depths in an extended workflow
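As an illustration of the idea, the sketch below parses a target depth out of a wildcard value and turns it into a subsampling fraction. The wildcard format (`dp2`, `dp0.5`) and both function names are assumptions made for this example, not the workflow's actual code.

```python
import re


def target_depth_from_wildcard(dp):
    """Parse a target depth from a wildcard such as 'dp2' (assumed format)."""
    match = re.fullmatch(r"dp(\d+(?:\.\d+)?)", dp)
    if match is None:
        raise ValueError(f"cannot parse depth from wildcard: {dp!r}")
    return float(match.group(1))


def subsample_fraction(mean_depth, dp):
    """Fraction of reads to keep so a bam at mean_depth reaches the target depth.

    Capped at 1.0 because a sample cannot be subsampled above its own depth.
    """
    if mean_depth <= 0:
        raise ValueError("mean_depth must be positive")
    return min(1.0, target_depth_from_wildcard(dp) / mean_depth)
```

Because the target comes from the wildcard rather than a single config value, the same rule can serve requests for several depths in one run, for instance `subsample_fraction(8.0, "dp2")` and `subsample_fraction(8.0, "dp4")`.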
By default $RANDOM is suggested to choose a random seed each time, but '0' is also commonly used.
Allow removal of individuals from dataset when subsampling
Add ref bias calculation
Allow multiple target depths for depth subsampling
Seems pandas was a dependency for numpy in old versions, but no longer. Needs to be explicitly included now.
No description provided.