Prepare working directory

Prepare Hi-C fastq files

Put all Hi-C fastq files under the working directory. If the only purpose is to phase SNPs, all Hi-C datasets that share the same haplotypes can be pooled together.

wk_dir/
|-- HiC_data1_1.fastq.gz
|-- HiC_data1_2.fastq.gz
|-- HiC_data2_1.fastq.gz
|-- HiC_data2_2.fastq.gz
|-- HiC_data3_1.fastq.gz
|-- HiC_data3_2.fastq.gz
...
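Before starting, it can help to confirm that every read file has its mate. The following is a minimal sketch, not part of `HaploC-tools`; it assumes the `_1.fastq.gz` / `_2.fastq.gz` naming shown in the listing above, and the helper name `find_unpaired_fastq` is hypothetical.

```python
from pathlib import Path


def find_unpaired_fastq(wk_dir):
    """Return fastq.gz files in wk_dir whose mate (_1 / _2) is missing.

    Assumes the <name>_1.fastq.gz / <name>_2.fastq.gz naming convention
    shown in the directory listing; files that do not follow it are
    reported as well.
    """
    names = {p.name for p in Path(wk_dir).glob("*.fastq.gz")}
    unpaired = []
    for name in sorted(names):
        if name.endswith("_1.fastq.gz"):
            mate = name[: -len("_1.fastq.gz")] + "_2.fastq.gz"
        elif name.endswith("_2.fastq.gz"):
            mate = name[: -len("_2.fastq.gz")] + "_1.fastq.gz"
        else:
            # File does not follow the _1/_2 naming convention.
            unpaired.append(name)
            continue
        if mate not in names:
            unpaired.append(name)
    return unpaired


if __name__ == "__main__":
    # "wk_dir" is the working directory from the listing above.
    for name in find_unpaired_fastq("wk_dir"):
        print("no mate found for:", name)
```

An empty result means every file in the directory has a matching mate.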

Prepare configuration file

repo_dir=HaploC-tools
maxIS=1,2,5,10,20,50,100
genome_version=hg19
thread4bwa=30
enzyme=MboI
sizeGb=5
x=30

Make sure that the config file ends with an empty line.
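If you generate the configuration file from a script, the trailing empty line is easy to get right. The sketch below is only an illustration: the file name `config.txt`, the placeholder path for `repo_dir`, and the helper `write_config` are assumptions, and "ends with an empty line" is interpreted here as a newline after the last parameter followed by one blank line.

```python
from pathlib import Path


def write_config(path, params):
    """Write key=value lines and finish the file with one empty line."""
    lines = [f"{key}={value}" for key, value in params.items()]
    # The first "\n" terminates the last parameter line;
    # the second "\n" adds the trailing empty line.
    Path(path).write_text("\n".join(lines) + "\n\n")


if __name__ == "__main__":
    # Values copied from the example above; repo_dir is a placeholder path.
    params = {
        "repo_dir": "/path/to/HaploC-tools",
        "maxIS": "1,2,5,10,20,50,100",
        "genome_version": "hg19",
        "thread4bwa": 30,
        "enzyme": "MboI",
        "sizeGb": 5,
        "x": 30,
    }
    write_config("config.txt", params)  # hypothetical file name
```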

Parameters in configuration file:

Name	Description
repo_dir	The full path to the `HaploC-tools` repository
maxIS	Comma-separated list of maximum insert sizes to be used by HapCUT2
genome_version	Version of the reference genome. Currently supported: `hg19` and `mm10`. When using `mm10`, no population phasing is conducted in HapCUT2. Other genomes will be added soon
thread4bwa	Number of threads for Hi-C read alignment using bwa
enzyme	Cutting enzyme used in the Hi-C experiment
sizeGb	Maximum chunk size in Gb. Hi-C fastq files are split into chunks of at most this size (5 Gb in the example above), and the chunks are then processed in parallel in several of the `HaploC` steps
x	Target coverage for SNP calling: a subset of Hi-C reads is used to reach a coverage of `x`
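A quick sanity check of the configuration file can catch typos before a long run. This is a sketch under stated assumptions and not part of `HaploC-tools`: the functions `parse_config` and `validate_config` are hypothetical, and the integer checks are inferred from the example values above rather than from documented requirements.

```python
def parse_config(text):
    """Parse key=value lines into a dict, skipping empty lines."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def validate_config(config):
    """Return a list of problems found; an empty list means none."""
    problems = []
    required = ["repo_dir", "maxIS", "genome_version",
                "thread4bwa", "enzyme", "sizeGb", "x"]
    for key in required:
        if key not in config:
            problems.append(f"missing parameter: {key}")

    # Only hg19 and mm10 are listed as supported.
    genome = config.get("genome_version")
    if genome is not None and genome not in ("hg19", "mm10"):
        problems.append(f"unsupported genome_version: {genome}")

    # Assumption: these are whole numbers, as in the example values.
    for key in ("thread4bwa", "sizeGb", "x"):
        value = config.get(key)
        if value is not None and not value.isdigit():
            problems.append(f"{key} should be a non-negative integer: {value}")

    # Assumption: maxIS is a comma-separated list of whole numbers.
    max_is = config.get("maxIS")
    if max_is is not None:
        if not all(item.strip().isdigit() for item in max_is.split(",")):
            problems.append(f"maxIS should be comma-separated integers: {max_is}")
    return problems


if __name__ == "__main__":
    with open("config.txt") as handle:  # hypothetical file name
        for problem in validate_config(parse_config(handle.read())):
            print(problem)
```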

A demo working directory (containing Hi-C reads of chr14 and chr18 from the WSU cell line) can be downloaded from Zenodo, as outlined in the previous section. To enable command-line download, follow this guideline (section V).
