Problem with cigar in getNiceAlignment #1

Open
francoisfauteux opened this issue Jul 22, 2023 · 1 comment
Labels
help wanted

Comments

francoisfauteux commented Jul 22, 2023

Hello, the expression under "check CIGAR correct" in getNiceAlignment, "^[XDI1-9=]*$", does not account for all possible CIGAR operations listed in the SAM specification (https://samtools.github.io/hts-specs/SAMv1.pdf, section 1.4.6, "CIGAR: CIGAR string"). It could be "^[MIDNSHPX0-9=]+$" instead. The first test below works; the second does not. This may also help: https://rdrr.io/bioc/GenomicAlignments/man/cigar-utils.html. Thank you.
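To make the difference between the patterns concrete, here is a small base-R sketch that runs a few CIGAR strings through the quoted pattern, the proposed pattern, and a stricter count-then-operation pattern. The names `current_pattern`, `proposed_pattern`, `strict_pattern` and `checks` are illustrative only and are not part of edlibR; the strict pattern is an assumption about what a well-formed check could look like, not the package's code.

```r
# Pattern quoted from the "check CIGAR correct" step in getNiceAlignment
current_pattern <- "^[XDI1-9=]*$"
# Pattern proposed in this issue
proposed_pattern <- "^[MIDNSHPX0-9=]+$"
# Assumed stricter alternative: one or more <count><operation> pairs
strict_pattern <- "^([0-9]+[MIDNSHPX=])+$"

# Two strings from the examples below, plus three illustrative ones
cigars <- c("1=5D8=5I", "1=5D1493=5I", "10=", "3M2S", "==")

checks <- data.frame(
  cigar    = cigars,
  current  = grepl(current_pattern, cigars),
  proposed = grepl(proposed_pattern, cigars),
  strict   = grepl(strict_pattern, cigars)
)
print(checks)
```

The class `1-9` leaves out the digit 0, so a count such as "10=" is rejected by the quoted pattern, and a pure character-class check accepts strings like "==" that have no counts at all. Both CIGAR strings from the examples below pass the quoted pattern on their own, so the error in the second example may come from a later parsing step rather than from this check; that has not been verified against the package source.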

library(edlibR)

qry<-"this is a test"
trg<-"test this is a"
res=align(qry,trg,task="path")
res$cigar
#[1] "1=5D8=5I"
getNiceAlignment(res,qry,trg)
#$query_aligned
#[1] "t-----his is a test"
#$matched_aligned
#[1] "|-----||||||||-----"
#$target_aligned
#[1] "test this is a-----"

qry=paste(rep("this is a test",100),collapse=" ")
trg=paste(rep("test this is a",100),collapse=" ")
res=align(qry,trg,task="path")
res$cigar
#[1] "1=5D1493=5I"
aln=getNiceAlignment(res,qry,trg)
#Error in getNiceAlignment(res, qry, trg) :
#The CIGAR string is in an invalid format. Operation detected is not a single character '=' or 'X' or 'D'.
#The expected format should be 'number of occurrences' + 'CIGAR operation', e.g. '4=' or '1D5=1X1=1X'.Please fix.
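The failing CIGAR string differs from the working one in having a multi-digit run length (1493). As a rough sketch of how a CIGAR string could be split into run lengths and operations regardless of how many digits each count has, using base R only. `parse_cigar` is a hypothetical helper written for this illustration; it is not an edlibR function and is not how getNiceAlignment is implemented.

```r
# Hypothetical helper (not part of edlibR): split a CIGAR string into
# run lengths and single-character operations.
parse_cigar <- function(cigar) {
  # Require one or more <count><operation> pairs and nothing else
  if (!grepl("^([0-9]+[MIDNSHPX=])+$", cigar)) {
    stop("not a well-formed CIGAR string: ", cigar)
  }
  # Extract each <count><operation> token, e.g. "1493="
  tokens <- regmatches(cigar, gregexpr("[0-9]+[MIDNSHPX=]", cigar))[[1]]
  data.frame(
    count = as.integer(sub("[MIDNSHPX=]$", "", tokens)),  # digits before the op
    op    = sub("^[0-9]+", "", tokens),                   # trailing op character
    stringsAsFactors = FALSE
  )
}

runs <- parse_cigar("1=5D1493=5I")
print(runs)
```

Because each token is matched as a whole, a count of any length stays attached to its operation, which is the property the second example needs.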

evanbiederstedt (Owner) commented Jul 22, 2023

Hi @francoisfauteux

Thanks for the issue.

I was simply using the same CIGAR parsing operations as the Python library: https://github.com/Martinsos/edlib

Please create a pull request to incorporate other CIGAR parsing operations if you wish; it could be incorporated in both libraries.

@Martinsos may have more thoughts here.

Best, Evan

evanbiederstedt added the "help wanted" label Jul 22, 2023