v0.5.0 - XA to XBC
Please check out the v0.5.0 Testing Summary Package for a comprehensive report.
Notes
This is a minor release that includes the following changes:
- Detection of all recombinants in Nextclade dataset 2022-09-27: `XA` to `XBC`.
- Create any number of custom `sc2rf` modes with CLI arguments.
Resources
- Issue #96: Create a Newick phylogeny of pango lineage parent-child relationships, to get accurate sublineages including aliases.
- Issue #118: Fix missing pango-designation issues for XAY and XBA.
Datasets
- Issue #25: Reduce positive controls to one sequence per clade. Add new positive controls `XAL`, `XAP`, `XAS`, `XAU`, and `XAZ`.
- Issue #92: Reduce negative controls to one sequence per clade. Add negative control for `22D (Omicron) / BA.2.75`.
- Issue #155: Add new profile and dataset `controls-gisaid`. Only a list of strains is provided, as GISAID policy prohibits public sharing of sequences and metadata.
Profile Creation
- Issue #77: Report slurm command for `--hpc` profiles in `scripts/create_profiles.sh`.
- Issue #153: Fix bug where build parameters `metadata` and `sequences` were not implemented.
Nextclade
sc2rf
- Issue #78: Add new parameter `max_breakpoint_len` to `sc2rf_recombinants` to mark samples with too much uncertainty in the breakpoint interval as false positives.
- Issue #79: Add new parameter `min_consec_allele` to `sc2rf_recombinants` to ignore recombinant regions with fewer than this number of consecutive alleles (both diagnostic SNPs and diagnostic reference alleles).
- Issue #80: Migrate `sc2rf` from a submodule to a subdirectory (including LICENSE!). This simplifies the updating process and avoids errors where submodules became out of sync with the main pipeline.
- Issue #83: Improve error handling in `sc2rf_recombinants` when the input stats files are empty.
- Issue #89: Reduce the default value of the parameter `min_len` in `sc2rf_recombinants` from `1000` to `500`. This is to handle `XAP` and `XAJ`.
- Issue #90: Auto-pass select nextclade lineages through `sc2rf`: `XN`, `XP`, `XAR`, `XAS`, and `XAZ`. This requires differentiating the nextclade inputs as separate parameters `--nextclade` and `--nextclade-no-recom`.
  - `XN`, `XP`, and `XAR` have extremely small recombinant regions at the terminal ends of the genome. Depending on sequencing coverage, `sc2rf` may not reliably detect these lineages.
  - The newly designated `XAS` and `XAZ` pose a challenge for recombinant detection using diagnostic alleles. The first region of `XAS` could be either `BA.5` or `BA.4` based on substitutions, but is most likely `BA.5` based on deletions. Since the region contains no diagnostic alleles to discriminate `BA.5` vs. `BA.4`, breakpoints cannot be detected by `sc2rf`.
  - Similarly for `XAZ`, the `BA.2` segments do not contain any `BA.2` diagnostic alleles, but instead are all reversions from `BA.5` alleles. The `BA.2` parent was discovered by deep, manual investigation in the corresponding pango-designation issue. Since the `BA.2` regions contain no diagnostic alleles for `BA.2`, breakpoints cannot be detected by `sc2rf`.
- Issue #95: Generalize `sc2rf_recombinants` to take any number of ansi and csv input files. This allows greater flexibility in command-line arguments to `sc2rf`, which are no longer locked into the hardcoded `primary` and `secondary` parameter sets.
- Issue #96: Include sub-lineage proportions in the `parents_lineage_confidence`. This reduces underestimation of the confidence of a parental lineage.
- Issue #150: Fix bug where `sc2rf` would write empty output csv files if no recombinants were found.
- Issue #151: Fix bug where samples that failed to align were missing from the linelists.
- Issue #158: Reduce `sc2rf` param `--max-intermission-length` from `3` to `2` to be consistent with Issue #79.
- Issue #161: Implement selection method to pick best results from various `sc2rf` modes.
- Issue #162: Upgrade `sc2rf/virus_properties.json`.
- Issue #163: Use LAPIS `nextcladePangoLineage` instead of `pangoLineage`. Also disable default filter `max_breakpoint_len` for `XAN`.
- Issue #164: Fix bug where false positives would appear in the filtered `sc2rf` ansi output (`recombinants.ansi.txt`).
- The optional `lapis` parameter for `sc2rf_recombinants` has been removed. Querying LAPIS for parental lineages is no longer experimental and is now an essential component (cannot be disabled).
- The mandatory `mutation_threshold` parameter for `sc2rf` has been removed. Instead, `--mutation-threshold` can be set independently in each of the `sc2rf` modes.
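
To illustrate how the three region filters from Issues #78, #79, and #89 might interact, here is a minimal Python sketch. It is not the code in `sc2rf_recombinants`: the `Region` record, the function names, and the default of `3` for `min_consec_allele` are assumptions made for illustration. Only the `min_len` default of `500` comes from these notes. The sketch also does not model what happens to neighbouring regions once one is dropped.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """Hypothetical record for one parental region of a candidate recombinant."""
    parent: str          # parental clade or lineage label
    start: int           # first genomic coordinate (inclusive)
    end: int             # last genomic coordinate (inclusive)
    consec_alleles: int  # consecutive diagnostic alleles supporting the region


def keep_regions(regions: List[Region], min_len: int = 500,
                 min_consec_allele: int = 3) -> List[Region]:
    """Drop regions that are too short or have too few consecutive alleles.

    min_len=500 mirrors the new default in Issue #89; min_consec_allele=3
    is an illustrative value, not a documented default.
    """
    kept = []
    for region in regions:
        length = region.end - region.start + 1
        if length >= min_len and region.consec_alleles >= min_consec_allele:
            kept.append(region)
    return kept


def is_false_positive(breakpoints: List[Tuple[int, int]],
                      max_breakpoint_len: int) -> bool:
    """Flag a sample when any breakpoint interval is wider than the limit."""
    return any((end - start + 1) > max_breakpoint_len
               for start, end in breakpoints)


# Placeholder coordinates, not taken from any real sample.
regions = [
    Region("BA.2", 1, 400, 5),      # too short (400 < 500)
    Region("BA.5", 401, 1200, 5),   # passes both filters
    Region("BA.2", 1201, 2000, 2),  # too few consecutive alleles
]
print([r.parent for r in keep_regions(regions)])
print(is_false_positive([(100, 200)], max_breakpoint_len=50))
```

Keeping the length and allele checks separate from the breakpoint check reflects the wording above: the first two ignore individual regions, while `max_breakpoint_len` marks the whole sample as a false positive.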
Linelist
- Issue #157: Create new parameters `min_lineage_size` and `min_private_muts` to control lineage splitting into `X*-like`.
Plot
- Issue #17: Create script to plot lineage assignment changes between versions using a Sankey diagram.
- Issue #82: Change epiweek start from Monday to Sunday.
- Issue #111: Fix breakpoint distribution axis that was empty for clade.
- Issue #152: Fix file saving bug when largest lineage has `/` characters.
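
Two of the plotting fixes above (Issues #82 and #152) can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline's plotting code: the function names `epiweek_start` and `safe_filename`, and the choice of `_` as the replacement character, are assumptions.

```python
from datetime import date, timedelta


def epiweek_start(d: date) -> date:
    """Return the Sunday that begins the epiweek containing `d`."""
    # date.weekday() numbers Monday as 0 and Sunday as 6, so shifting by
    # one gives the number of days elapsed since the most recent Sunday.
    days_since_sunday = (d.weekday() + 1) % 7
    return d - timedelta(days=days_since_sunday)


def safe_filename(label: str, replacement: str = "_") -> str:
    """Replace path separators so a lineage label can be used in a file name."""
    return label.replace("/", replacement)


print(epiweek_start(date(2022, 9, 27)))  # a Tuesday
print(safe_filename("22D (Omicron) / BA.2.75"))
```

With a Sunday start, a sample dated Saturday falls in the week that began six days earlier, whereas a Monday-based week would have started five days earlier. Samples dated Sunday are the ones that move to a different week.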
Report
- Issue #88: Add pipeline and nextclade versions to powerpoint slides as footer. This required adding `--summary` as param to `report`.
Validate
- Issue #56: Change rule `validate` from simply counting the number of positives to validating the fields `lineage`, `breakpoints`, `parents_clade`. This involves adding a new default parameter `expected` for rule `validate` in `defaults/parameters.yaml`.
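
The difference between counting positives and validating fields can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the `validate` rule itself: the function name `validate_sample`, the dictionary layout, and the sample values are hypothetical placeholders. Only the three field names come from the notes above.

```python
from typing import Dict, Tuple

FIELDS = ("lineage", "breakpoints", "parents_clade")


def validate_sample(observed: Dict[str, str],
                    expected: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return {field: (expected, observed)} for every field that differs."""
    mismatches = {}
    for field in FIELDS:
        exp, obs = expected.get(field), observed.get(field)
        if exp != obs:
            mismatches[field] = (exp, obs)
    return mismatches


# Placeholder values for illustration only, not real validation data.
expected = {"lineage": "XA", "breakpoints": "1000:2000", "parents_clade": "A,B"}
observed = {"lineage": "XA", "breakpoints": "1000:2500", "parents_clade": "A,B"}

result = validate_sample(observed, expected)
print("pass" if not result else f"fail: {result}")
```

A sample that is detected as positive but with the wrong breakpoints would pass a simple count, yet fails here, which is the kind of regression the field-level check is meant to surface.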
Designated Lineages
- Issue #149: `XA`
- Issue #148: `XB`
- Issue #147: `XC`
- Issue #146: `XD`
- Issue #145: `XE`
- Issue #144: `XF`
- Issue #143: `XG`
- Issue #141: `XH`
- Issue #142: `XJ`
- Issue #140: `XK`
- Issue #139: `XL`
- Issue #138: `XM`
- Issue #137: `XN`
- Issue #136: `XP`
- Issue #135: `XQ`
- Issue #134: `XR`
- Issue #133: `XS`
- Issue #132: `XT`
- Issue #131: `XU`
- Issue #130: `XV`
- Issue #129: `XW`
- Issue #128: `XY`
- Issue #127: `XZ`
- Issue #126: `XAA`
- Issue #125: `XAB`
- Issue #124: `XAC`
- Issue #123: `XAD`
- Issue #122: `XAE`
- Issue #120: `XAF`
- Issue #121: `XAG`
- Issue #119: `XAH`
- Issue #117: `XAJ`
- Issue #116: `XAK`
- Issue #115: `XAL`
- Issue #110: `XAM`
- Issue #109: `XAN`
- Issue #108: `XAP`
- Issue #107: `XAQ`
- Issue #87: `XAS`
- Issue #105: `XAT`
- Issue #103: `XAU`
- Issue #104: `XAV`
- Issue #105: `XAW`
- Issue #85: `XAY`
- Issue #87: `XAZ`
- Issue #94: `XBA`
- Issue #114: `XBB`
- Issue #160: `XBC`
Proposed Lineages
- Issue #99: `proposed808`
Commits
- `b48ad6d7` docs: fix CHANGELOG pr
- `04b17918` docs: update readme and changelog
- `72dd5a8f` docs: add testing summary package for v0.4.2 to v0.5.0
- `558f7d79` resources: fix breakpoints for XAE #122
- `91e5843b` script: bugfix sc2rf ansi output for #164
- `9bc13639` docs: update issues and validation table order
- `b63520e5` script: implement lineage check in dups for #117 #161
- `901898da` sc2rf updates for #158 #161 #162 #163
- `96fa6af1` dataset: update controls-gisaid strain list and validation
- `84466a10` workflow: new param dup_method for #161
- `9ca0c71e` script: implement duplicate reconciliation for #161
- `112ea684` param: upgrade nextclade dataset for #159
- `859b92c8` script: add more detail to validate table for failing samples
- `5e285912` script: add param --min-link-size to compare_positives
- `bd01a5e4` workflow: added failed validate output to rule log
- `8e5b90fb` workflow: don't use metadata for sc2rf_recombinants when exclude_negatives is true
- `cdf45407` param: add new params min-lineage-size and min-private-muts for #157
- `bc04fddf` workflow: update validation strains for #155
- `6aa95221` param: fix typo of missing --mutation-threshold
- `25df848c` param: remove param mutation_threshold as universal param for sc2rf
- See CHANGELOG.md for additional commits.