Fix: Only use first white-space separated entry in contig name
Fixes issues for genomes where extra info may be added in the contig name after the name itself (separated by a space or a tab).
zjnolen committed Jan 20, 2025
1 parent e1a2267 commit b0ac754
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .test/data/ref/testref.fa

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion workflow/rules/common.smk
@@ -54,7 +54,7 @@ def chunkify(reference_fasta, chunk_size):
     with open(reference_fasta, "r") as fasta_in:
         for header, seq in itertools.groupby(fasta_in, lambda x: x.startswith(">")):
             if header:
-                contig = next(seq).strip(">").strip()
+                contig = next(seq).strip(">").strip().split(maxsplit=1)[0]
             seq_length = len("".join(seq).replace("\n", ""))
             if seq_length > 0:
                 contigs = contigs + [[contig, seq_length]]