Detect FLT3 internal tandem duplications (FLT3-ITD) using Pindel
perl /path/FLT3_ITD_Pindel/ -bam <bam file> -n <name> -flen <fragment length> -fa <fasta> -pindel <path to binary pindel> -od <outdir>
-bam : BAM file
-n : sample name
-flen : DNA fragment length (default: 500)
-fa : reference FASTA file
-pindel : path to the pindel binary
-od : output directory
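When many samples are processed, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of this repository: `build_command` and its `script_path` argument are hypothetical names, and the path to the Perl script must be supplied by the caller. Only the options listed in the usage line above are used.

```python
import shlex


def build_command(script_path, bam, name, fasta, pindel_bin, outdir, flen=500):
    """Return the argument list for one sample.

    script_path is a placeholder for the Perl script in FLT3_ITD_Pindel;
    flen defaults to 500, matching the documented default.
    """
    return [
        "perl", script_path,
        "-bam", bam,
        "-n", name,
        "-flen", str(flen),
        "-fa", fasta,
        "-pindel", pindel_bin,
        "-od", outdir,
    ]


def format_command(cmd):
    """Render the argument list as a shell-quoted string for logging."""
    return shlex.join(cmd)
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; keeping the arguments as a list avoids shell quoting problems with paths that contain spaces.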
cd /path/FLT3_ITD_Pindel/test/res
sh >log 2>&1 &
-rw-r--r-- 1 lffu staff 24K 9 18 14:55 log
-rw-r--r-- 1 lffu staff 85B 9 18 14:55 pinde.cfg
-rw-r--r-- 1 lffu staff 66B 9 18 14:55
-rw-r--r-- 1 lffu staff 1.1K 9 18 14:55 test.ins.vcf
-rw-r--r-- 1 lffu staff 1.0K 9 18 14:55
-rw-r--r-- 1 lffu staff 1.7K 9 18 14:55
-rw-r--r-- 1 lffu staff 1.8K 9 18 14:55
-rw-r--r-- 1 lffu staff 0B 9 18 14:55 test_BP
-rw-r--r-- 1 lffu staff 0B 9 18 14:55 test_CloseEndMapped
-rw-r--r-- 1 lffu staff 0B 9 18 14:55 test_D
-rw-r--r-- 1 lffu staff 0B 9 18 14:55 test_INT_final
-rw-r--r-- 1 lffu staff 3.8K 9 18 14:55 test_INV
-rw-r--r-- 1 lffu staff 0B 9 18 14:55 test_LI
-rw-r--r-- 1 lffu staff 0B 9 18 14:55 test_RP
-rw-r--r-- 1 lffu staff 0B 9 18 14:55 test_SI
-rw-r--r-- 1 lffu staff 76K 9 18 14:55 test_TD
Short tandem duplications are reported by Pindel as short insertions and are stored in the _SI file.
Long tandem duplications are stored in the _TD file.
: short tandem dup vcf file
: long tandem dup vcf file
: final tandem dup vcf file
- HOMLEN > 0
- the POS column must be included in the reference positions matched by the ALT (inserted) sequence
- alt num >= 2
- SVLEN >= 2
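The numeric thresholds in this list can be sketched as a small predicate over one VCF record. This is an illustration only, not the code used by this tool. It assumes that HOMLEN and SVLEN are read from the VCF INFO column, and that "alt num" means the number of reads supporting the ALT allele, which the caller passes in as `alt_count`. The function names are hypothetical, and the POS check from the list is not covered here.

```python
def parse_info(info):
    """Parse a VCF INFO string such as 'END=100;HOMLEN=3;SVLEN=30;SVTYPE=INS'.

    Flags without a value are stored with an empty string.
    """
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def passes_basic_filters(info, alt_count):
    """Apply HOMLEN > 0, SVLEN >= 2 and alt num >= 2 to one record.

    alt_count is assumed to be the number of ALT-supporting reads.
    Records with missing or non-numeric HOMLEN/SVLEN are rejected.
    """
    fields = parse_info(info)
    try:
        homlen = int(fields["HOMLEN"])
        svlen = int(fields["SVLEN"])
    except (KeyError, ValueError):
        return False
    return homlen > 0 and svlen >= 2 and alt_count >= 2
```

For example, `passes_basic_filters("END=100;HOMLEN=3;SVLEN=30;SVTYPE=INS", 5)` is accepted, while the same record with `HOMLEN=0` is rejected.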
step2 details
#1. get the inserted sequence from the ALT column in *.ins.vcf
#2. find all positions of the inserted sequence in the reference (allowing one mismatch)
#3. check whether the POS column in *.ins.vcf is included in the positions found in #2
#4. if POS is included, the variant is written to the final result file *
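The matching steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Perl implementation in this repository. It assumes that the inserted sequence is the ALT allele with the shared leading REF allele removed, that "allow one mismatch" means at most one substitution (no indels), and that POS is "included" when it lies inside the reference span covered by a match. All function names and the `region_start` parameter are hypothetical.

```python
def inserted_sequence(ref_allele, alt_allele):
    """Step 1: strip the leading REF allele from ALT to get the inserted bases."""
    if alt_allele.startswith(ref_allele):
        return alt_allele[len(ref_allele):]
    return alt_allele


def find_approx_positions(ref, query, max_mismatches=1, region_start=1):
    """Step 2: return start coordinates where query matches ref
    with at most max_mismatches substitutions.

    region_start is the coordinate of ref[0] (1 for a 1-based sequence).
    """
    positions = []
    m = len(query)
    if m == 0 or m > len(ref):
        return positions
    for start in range(len(ref) - m + 1):
        mismatches = 0
        for a, b in zip(ref[start:start + m], query):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # Inner loop finished without exceeding the mismatch limit.
            positions.append(start + region_start)
    return positions


def pos_supported(ref, ins_seq, pos, region_start=1):
    """Steps 3 and 4: True if pos falls inside any matched span."""
    m = len(ins_seq)
    for start in find_approx_positions(ref, ins_seq, region_start=region_start):
        if start <= pos <= start + m - 1:
            return True
    return False
```

A variant would be kept when `pos_supported(...)` returns `True`. The scan is quadratic in the worst case, which is acceptable for a short target region such as a single gene but would need an indexed search for genome-wide use.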
- no filter
conda install -c bioconda pindel
Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads, Bioinformatics, 2009