Prepare working directory
Gian M. Franceschini edited this page Jan 8, 2024 · 7 revisions
Put all Hi-C fastq files under the working directory. If the purpose is only to phase SNPs, all Hi-C datasets sharing the same haplotype can be pooled together.
wk_dir/
|-- HiC_data1_1.fastq.gz
|-- HiC_data1_2.fastq.gz
|-- HiC_data2_1.fastq.gz
|-- HiC_data2_2.fastq.gz
|-- HiC_data3_1.fastq.gz
|-- HiC_data3_2.fastq.gz
...
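Before starting a run, it can help to confirm that every dataset in the working directory has both mate files. The following is a minimal sketch, not part of HaploC-tools. It assumes the `<name>_1.fastq.gz` / `<name>_2.fastq.gz` naming shown in the listing above, and `find_unpaired` is a hypothetical helper name.

```python
import re

# Assumed naming convention, taken from the listing above:
# <name>_1.fastq.gz and <name>_2.fastq.gz form one paired-end dataset.
MATE_PATTERN = re.compile(r"^(?P<name>.+)_(?P<mate>[12])\.fastq\.gz$")


def find_unpaired(filenames):
    """Return the sorted dataset names that are missing one of their two mate files."""
    mates = {}
    for filename in filenames:
        match = MATE_PATTERN.match(filename)
        if match:  # files that do not follow the convention are ignored
            mates.setdefault(match["name"], set()).add(match["mate"])
    return sorted(name for name, seen in mates.items() if seen != {"1", "2"})
```

Calling `find_unpaired(os.listdir("wk_dir"))` (after `import os`) returns an empty list when every dataset is complete.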
repo_dir=HaploC-tools
maxIS=1,2,5,10,20,50,100
genome_version=hg19
thread4bwa=30
enzyme=MboI
sizeGb=5
x=30
Make sure that the config file ends with an empty line.
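The trailing-line requirement is easy to miss, so a quick check before launching can save a failed run. The sketch below is an illustration only and not part of HaploC-tools. It interprets "ends with an empty line" as "the last line is terminated by a newline character", which is an assumption, and the helper names are hypothetical.

```python
def ends_with_newline(text):
    """True if the config text is terminated by a newline (assumed meaning of 'ends with an empty line')."""
    return text.endswith("\n")


def with_trailing_newline(text):
    """Return the text unchanged if it already ends with a newline, else append one."""
    return text if ends_with_newline(text) else text + "\n"


def parse_config(text):
    """Parse key=value lines into a dict of strings, skipping blank lines."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        config[key.strip()] = value.strip()
    return config
```

Reading the config file into a string and passing it to `ends_with_newline` shows whether it needs fixing. `parse_config` returns the settings as strings, for example `{"x": "30"}` for a file containing only `x=30`.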
Name | Description |
---|---|
repo_dir | The full path to the HaploC-tools repository |
maxIS | Comma-separated list of maximum insert sizes to be used by HapCUT2 |
genome_version | Version of the reference genome. Currently supported: `hg19` and `mm10`. When using `mm10`, no population phasing is conducted in HapCUT2. Other genomes will be added soon |
thread4bwa | Number of threads for Hi-C read alignment using bwa |
enzyme | Cutting enzyme used in the Hi-C experiment |
sizeGb | Maximum chunk size in Gb. Hi-C fastq files are split into chunks of at most this size (5 Gb in the example above), and each chunk is then processed in parallel for several of the HaploC steps |
x | When calling SNPs, a subset of Hi-C reads is used to reach a coverage of x |
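The parameter descriptions above can be turned into a small sanity check. The sketch below is not part of HaploC-tools. It takes the settings as a dict of strings and reports problems. The numeric constraints (positive integers for `maxIS` and `thread4bwa`, positive numbers for `sizeGb` and `x`) are assumptions inferred from the example values, and `validate_config` is a hypothetical helper name.

```python
SUPPORTED_GENOMES = {"hg19", "mm10"}  # as listed in the table above
REQUIRED_KEYS = ["repo_dir", "maxIS", "genome_version", "thread4bwa", "enzyme", "sizeGb", "x"]


def _is_positive(value, cast):
    """True if value converts with cast (int or float) to a number greater than zero."""
    try:
        return cast(value) > 0
    except ValueError:
        return False


def validate_config(config):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    if problems:
        return problems
    if config["genome_version"] not in SUPPORTED_GENOMES:
        problems.append(f"unsupported genome_version: {config['genome_version']}")
    # Assumption: maxIS is a comma-separated list of positive integers.
    if not all(_is_positive(v, int) for v in config["maxIS"].split(",")):
        problems.append("maxIS must be a comma-separated list of positive integers")
    # Assumption: the thread count is a positive integer.
    if not _is_positive(config["thread4bwa"], int):
        problems.append("thread4bwa must be a positive integer")
    # Assumption: chunk size and target coverage are positive numbers.
    for key in ("sizeGb", "x"):
        if not _is_positive(config[key], float):
            problems.append(f"{key} must be a positive number")
    return problems
```

With the example values shown earlier on this page, `validate_config` returns an empty list. The check does not verify that `repo_dir` exists or that `enzyme` is a recognized name, since the set of supported enzymes is not listed here.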
A demo working directory (containing Hi-C reads of chr14 and chr18 from the WSU cell line) can be downloaded from Zenodo as outlined in the previous section. Follow this guideline (section V) to enable command-line download.