With -so and BAM output (file or stdout), every read gets a QS tag, including duplicates. With SAM output (file or stdout), reads marked as duplicates appear to be missing the QS tag; an empty field is left in its place. This can crash downstream tools when they append additional tags, because the empty field then looks like a "tag" with an improper format.
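For anyone who wants to check their own SAM output for this symptom, here is a minimal sketch of a detector. It assumes only the general SAM layout (eleven mandatory tab-separated columns followed by optional TAG:TYPE:VALUE fields). The function name and the two sample records are made up for illustration; they are not real output from the tool.

```python
import re

# Optional SAM fields have the form TAG:TYPE:VALUE.
TAG_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]:[AifZHB]:")

def malformed_optional_fields(line):
    """Return (column_index, text) for each optional field that is not TAG:TYPE:VALUE."""
    fields = line.rstrip("\n").split("\t")
    # Columns 0-10 are the eleven mandatory fields; anything after is optional.
    return [(i, f) for i, f in enumerate(fields[11:], start=11) if not TAG_RE.match(f)]

# Hypothetical records, for illustration only.
good = "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF\tNM:i:0\tQS:i:40"
bad = "r2\t1123\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF\tNM:i:0\t"  # trailing tab leaves an empty field

print(malformed_optional_fields(good))  # []
print(malformed_optional_fields(bad))   # [(12, '')]
```

Running this over each non-header line of a SAM file should flag any record where an empty field sits where a tag is expected.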
I put a partial fix for this in the dev branch. It at least trims the trailing tab from the line so that it's syntactically correct. However, it still removes the QS tag.
Unfortunately, fixing it properly will be more work. The SAM code writes SAM lines in string form into a buffer, and then can't extend the string when it comes time to mark duplicates later. The code clips out the QS tag to make extra space for the (possibly) expanded flags field. We'll need to figure out how not to do this ugliness, or else make it more ugly but functional.
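To illustrate why the flags field can need more room, here is a small sketch. It is not the project's code, just an illustration of the size problem: it sets the SAM duplicate bit (0x400) on a made-up record by rebuilding the line rather than editing it in place, and prints how many characters the line grew by. The helper name and the sample record are hypothetical.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: PCR or optical duplicate

def mark_duplicate(line):
    """Return a copy of a SAM line with the duplicate bit set, keeping all tags."""
    fields = line.rstrip("\n").split("\t")
    fields[1] = str(int(fields[1]) | DUPLICATE_FLAG)  # column 2 is FLAG, written in decimal
    return "\t".join(fields)

# Hypothetical record, for illustration only.
line = "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF\tQS:i:40"
marked = mark_duplicate(line)

print(marked.split("\t")[1])     # 1123
print(len(marked) - len(line))   # 2: "99" became "1123"
```

Because the decimal flag can gain characters, a line already packed into a fixed-size slot has nowhere to grow unless something else is removed or the line is rewritten elsewhere.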
Check it out and see if it helps before I put it in master. And let's keep this issue open until it's not clipping out tags.
This seems to have done the trick, thanks. The missing QS tags aren't great but not the end of the world either; the empty fields are gone and it no longer breaks downstream tools when they append additional tags, and that's what I really care about.