Can we "artificially extend UMIs" and correct for primers mis-binding #1881
-
Hello, I have some TCR-seq data generated with the SEQTR protocol (https://www.sciencedirect.com/science/article/pii/S2667237523000784) and I'd like to use MiXCR to analyse it. I was able to define a custom preset for it, and it works quite well. But there are a few subtleties related to this protocol, and I'd like to know whether there is any way to deal with them:
Thank you very much for your help! Best regards, Julien
Replies: 1 comment 3 replies
-
Good thinking! You can do this to get what you need:
^(UMI1:N{9})(R1:(UMI2:N{10})*)\^(R2:*)
This way, you will have two UMIs: the first one is excluded from the alignment, while the second one remains part of the aligned read.
As for the primer misannealing, it's not really something you can fully prevent. Some primers do anneal to the wrong genes, so such reads really are present in your data. The clone itself is perfectly valid, though. Usually you would exclude the primer sequence from the alignment, but you mentioned that you want to keep it. The question then becomes: what do you want to do with reads where the first 20 nt carry 2 mismatches from primer misannealing, but the rest of the sequence is fine? If the region after the primer is sufficiently long, the V gene should still be identified correctly. If you simply don't want that part of the sequence to be used in assembly or to appear in the output, you can also try excluding this segment from the assembling feature (though only by location relative to the anchor points, not by sequence).
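To make the intent of the tag pattern above a bit more concrete, here is a small Python sketch of how I read it for a single R1 sequence. It is only an illustration, not MiXCR's parser: the function name `split_read` and the example sequence are made up, and the lengths 9 and 10 are simply taken from the pattern.

```python
def split_read(read1: str, umi1_len: int = 9, umi2_len: int = 10) -> dict:
    """Illustrative only: mimic the intent of the tag pattern on one R1 sequence.

    `split_read` is a hypothetical helper, not part of MiXCR.
    """
    if len(read1) < umi1_len + umi2_len:
        raise ValueError("read is shorter than UMI1 + UMI2")
    umi1 = read1[:umi1_len]   # captured as UMI1 and cut out of the read
    r1 = read1[umi1_len:]     # the R1 payload starts right after UMI1
    umi2 = r1[:umi2_len]      # UMI2 is nested inside R1, so it stays in the payload
    return {"UMI1": umi1, "UMI2": umi2, "R1": r1}


# Made-up read: 9 nt UMI1, 10 nt UMI2, then the rest of the insert.
example = split_read("AAAAAAAAA" + "CCCCCCCCCC" + "GATTACAGATTACA")
print(example)
```

The point of nesting is that the 10 nt serve both as a second UMI and as the start of the sequence that goes into alignment, whereas the first 9 nt are used as a UMI only.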