SEACR input and spike-in normalization #77

jordan841220 · 2023-09-08T01:12:10Z

Thank you for your contribution in this amazing tool. My question might be naive but I couldn't find any explanation in the paper or github (or maybe I simply ignored it).

From the peak calling section in the pipeline as described below, I found that the input bedgraph file of SEACR was inherently the intermediate ouput of macs2 callpeak. I was curious why this CUT&RUN pipeline was designed like this, instead of converting the original BAM/BED file to bedgraph file using bedtools genomecov, and use it as the input of SEACR (I have tried and the results were very different), which was wrote in the SEACR tutorial.

# macs2 narrow peak calling
>&2 echo "[info] macs2 narrow peak calling"
$macs2bin/macs2 callpeak -t $bam_file -g $macs2_genome -f BAMPE -n $base_file --outdir $outdir -q 0.01 -B --SPMR --keep-dup all 2> $logdir/"$base_file".macs2.narrow

# macs2 broad peak calling
>&2 echo "[info] macs2 broad peak calling"
$macs2bin/macs2 callpeak -t $bam_file -g $macs2_genome -f BAMPE -n $base_file --outdir $outdirbroad --broad --broad-cutoff 0.1 -B --SPMR --keep-dup all 2> $logdir/"$base_file".macs2.broad

# macs2 broad peak calling
>&2 echo "[info] Getting broad peak summits"
$pythonbin/python $extratoolsbin/get_summits_broadPeak.py $outdirbroad/"$base_file"_peaks.broadPeak | $bedopsbin/sort-bed - > $outdirbroad/"$base_file"_summits.bed

# SEACR peak calls
>&2 echo "[info] SEACR stringent peak calling"
$macs2bin/macs2 callpeak -t $bam_file -g $macs2_genome -f BAMPE -n $base_file --outdir $outdirseac -q 0.01 -B --SPMR --keep-dup all 2> $logdir/"$base_file".seacr
$pythonbin/python $extratoolsbin/change.bdg.py $outdirseac/"$base_file"_treat_pileup.bdg > $outdirseac/"$base_file"_treat_integer.bdg
chmod +x $extratoolsbin/SEACR_1.1.sh
$extratoolsbin/SEACR_1.1.sh $outdirseac/"$base_file"_treat_integer.bdg 0.01 non stringent $outdirseac/"$base_file"_treat $Rscriptbin
$bedopsbin/sort-bed $outdirseac/"$base_file"_treat.stringent.bed > $outdirseac/"$base_file"_treat.stringent.sort.bed
$pythonbin/python $extratoolsbin/get_summits_seacr.py $outdirseac/"$base_file"_treat.stringent.bed|$bedopsbin/sort-bed - > $outdirseac/"$base_file"_treat.stringent.sort.summits.bed
for i in _summits.bed _peaks.xls _peaks.narrowPeak _control_lambda.bdg _treat_pileup.bdg; do 
    rm -rf $outdirseac/"$base_file"$i
done

The pipeline conducted the spike_in normalization using bamCoverage after peak calling. Isn't it make sense to use the spike_in-normalized bedgraph file to perform peak calling?

Thanks!!

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SEACR input and spike-in normalization #77

SEACR input and spike-in normalization #77

jordan841220 commented Sep 8, 2023 •

edited

Loading

SEACR input and spike-in normalization #77

SEACR input and spike-in normalization #77

Comments

jordan841220 commented Sep 8, 2023 • edited Loading

jordan841220 commented Sep 8, 2023 •

edited

Loading