#2 compatible with different assemblies
Madalina Giurgiu committed Jan 20, 2023
1 parent a516104 commit 15b0bef
Showing 2 changed files with 13 additions and 6 deletions.
11 changes: 9 additions & 2 deletions README.md
@@ -9,12 +9,14 @@ Developed by Richard Koche, adapted to BIH cluster by @haasek and maintained by
- [Installation](#installation)
- [Usage](#usage)
- [Run circle-enrich-filter example](#run-circle-enrich-filter)
+ - [Run circle-enrich-filter with PDX data](#run-pdx)
- [Citation](#citation)
- [License](#license)

## Installation <a name="installation"></a>

- Prequisites are conda and pgltools v2.2.0. Compatible with `python 2.7` and `python 3.7`.
+ Prequisites are conda and pgltools v2.2.0. Compatible with `python 2.7` and `python 3.7`.<br/>
+ Compatible with human (`GRCh38.p13`, `GRCh37.p13`, `hs37d5`, `hg38`, `hg19`) and mouse (`GRCm38`) genome assemblies.

### 1. Create conda environment

@@ -90,7 +92,7 @@ Options:

## Run circle-enrich-filter example <a name="run-circle-enrich-filter"></a>

- Run test example using the `2:14508020-18508849` region from a CHP212 cellline sequenced using the Circle-seq Illumina sequencing.
+ Run test example using the `2:14508020-18508849` region from CHP212 human cellline sequenced using the Circle-seq Illumina sequencing.

```
bash run_CircleEnrichFilter.sh -i example/output/chp212_2_14508020_18508849.bam -o example/2_14508020_18508849.fa
@@ -107,6 +109,11 @@ The pipeline generates the all files under `example/ouput`. The final enriched r
| 5 | all counts (coverage all reads spanning the junction) |


+ ## Run PDX samples with circle-enrich-filter <a name="run-pdx"></a>
+
+ In case you run PDX samples, please consider annotating all chromosomes from the mouse using `m.*`, .e.g `m.1` or `m.chr1`.
+
+
## Citation <a name="citation"></a>

Koche, R.P., Rodriguez-Fos, E., Helmsauer, K. et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma. Nat Genet 52, 29-34 (2020).
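
Note on the PDX section added in the hunk above: the README asks for mouse chromosomes to carry an `m.` prefix but does not say how to apply it. One possible way, sketched here under the assumption that the mouse reference is a FASTA file whose headers can be rewritten before alignment, is a one-line `sed` substitution. The two-record FASTA and the variable names are hypothetical and exist only for illustration.

```shell
# Hypothetical two-record mouse FASTA, used only for illustration.
mouse_fa=$(printf '>chr1 some description\nACGT\n>chrM\nTTGA\n')

# Prepend "m." to every sequence name; sequence lines are left untouched.
prefixed=$(printf '%s\n' "$mouse_fa" | sed 's/^>/>m./')

printf '%s\n' "$prefixed"
```

With a real reference, the same `sed` expression would be applied to the mouse FASTA file itself. Whether the prefix is added there or at another step is a choice the README leaves open.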
8 changes: 4 additions & 4 deletions run_CircleEnrichFilter.sh
@@ -126,14 +126,14 @@ grep -v "^#" $peakfile | cut -f2-4,6 > $peakbed
# Output merged regions of enrichment
# Edges may not be perfect due to where enrichment block falls, but works in most cases and can be corrected below
mergedbed=${peakfile/%.txt/.enriched.merged.orig.bed}
- grep -v "^#" $peakfile | cut -f2-4 | grep -v "random\|chrUn\|GL\|NC_\|hs37d5\|EGFP\|KI" | sort -k1,1 -k2,2n | bedtools merge -d $mergedist -i stdin > $mergedbed
+ grep -v "^#" $peakfile | cut -f2-4 | grep -v "alt\|random\|chrU\|GL\|NC\|hs37d5\|EGFP\|K\|ML\|JH" | sort -k1,1 -k2,2n | bedtools merge -d $mergedist -i stdin > $mergedbed


# Output merged reads
# (sometimes more accurate delineation of circle junctions, but can fall into trap of cascading reads outside of enriched region)
# First get bam to bed
bam2bed=${inbamnodir/%.bam/.bam2bed.bed}
- samtools view --threads $nthreads -bq $qfilt $inbam | bedtools bamtobed -i stdin -splitD | cut -f1-3 | grep -v "random\|chrUn\|GL\|NC_\|hs37d5\|EGFP\|KI" | sort -k1,1 -k2,2n | bedtools merge -d 250 -i stdin > $bam2bed
+ samtools view --threads $nthreads -bq $qfilt $inbam | bedtools bamtobed -i stdin -splitD | cut -f1-3 | grep -v "alt\|random\|chrU\|GL\|NC\|hs37d5\|EGFP\|K\|ML\|JH" | sort -k1,1 -k2,2n | bedtools merge -d 250 -i stdin > $bam2bed

# Then get bam2bed overlap with enriched blocks, but only keep segments with >5 reads
# (this is for edge fine-tuning, not circle calling)
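
Note on the widened exclusion pattern in the hunk above: a quick way to see which contigs survive the new `grep -v` is to run it over a list of names. The pattern is copied from the commit. The contig names are hypothetical examples of primary, alternate, unplaced, and decoy sequences, and in the script the filter sees whole BED lines rather than bare names.

```shell
# Exclusion pattern copied from the updated grep -v call in this commit.
exclude='alt\|random\|chrU\|GL\|NC\|hs37d5\|EGFP\|K\|ML\|JH'

# Hypothetical contig names, chosen only for illustration.
contigs='chr1
2
m.chr1
chr1_KI270706v1_random
chrUn_GL000195v1
GL000207.1
KI270728.1
JH584299.1
chr22_KI270879v1_alt
hs37d5
MT'

# Keep only the names that match none of the alternatives.
kept=$(printf '%s\n' "$contigs" | grep -v "$exclude")
printf '%s\n' "$kept"
```

Because the alternatives are unanchored substrings, short ones such as `K` or `NC` drop any line that contains them anywhere, which is worth keeping in mind if custom contig names are in use.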
@@ -322,8 +322,8 @@ bamCoverage --extendReads 0 --minMappingQuality $qfilt --ignoreDuplicates --binS
## First, chrM plots:
# get chrM coords for this genome assembly, based on bam header (differs depending on used genome build):
chrMcoord="chrM.bed"
- #samtools view -H $inbam | grep "SN:chrM" | awk -v OFS='\t' '{ split($3,a,":"); print "chrM",1,a[2]}' > $chrMcoord
- samtools view -H $inbam | grep -P "SN:(chr)*MT" | awk -v OFS='\t' '{split($3,a,":"); print "MT",1,a[2]}' > $chrMcoord
+ samtools view -H $inbam | grep "SN:chrM" | awk -v OFS='\t' '{ split($3,a,":"); print "chrM",1,a[2]}' > $chrMcoord
+ samtools view -H $inbam | grep "SN:MT" | awk -v OFS='\t' '{split($3,a,":"); print "MT",1,a[2]}' >> $chrMcoord

# meta plots
# create data matrix (one site, so here a simple vector)
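
Note on the chrM lookup in the hunk above: the two added pipelines can be tried without a BAM file by feeding them text shaped like `samtools view -H` output. This is a minimal sketch. The helper name `chrm_bed` and the header lines, including the sequence lengths, are hypothetical. The `awk` step reads the length from the third tab-separated field, so it assumes `LN:` directly follows `SN:` on the `@SQ` line.

```shell
# Hypothetical helper wrapping the two grep/awk pipelines from this commit.
# $1: header text in the shape printed by `samtools view -H`.
chrm_bed() {
  printf '%s\n' "$1" | grep "SN:chrM" | awk -v OFS='\t' '{ split($3,a,":"); print "chrM",1,a[2]}'
  printf '%s\n' "$1" | grep "SN:MT" | awk -v OFS='\t' '{split($3,a,":"); print "MT",1,a[2]}'
}

# Illustrative headers: one with "chr"-prefixed names, one without.
ucsc=$(chrm_bed "$(printf '@SQ\tSN:chr1\tLN:248956422\n@SQ\tSN:chrM\tLN:16569')")
ensembl=$(chrm_bed "$(printf '@SQ\tSN:1\tLN:249250621\n@SQ\tSN:MT\tLN:16569')")

printf '%s\n' "$ucsc" "$ensembl"
```

Only one of the two pipelines matches for a given naming scheme, which is why the second command in the commit appends with `>>` rather than overwriting the file.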