Installation
Gian M. Franceschini edited this page Jan 8, 2024 · 9 revisions
- Clone the repository:
git clone https://github.com/CSOgroup/HaploC-tools.git
- Download the data files required by HaploC-tools (1000 Genomes reference panel haplotypes, genomic reference data) and put (or link) them under HaploC-tools. The data files can be downloaded from Zenodo.
# Within the HaploC-tools folder run:
wget --content-disposition "https://zenodo.org/records/10446020/files/genomicData.tar.gz?download=1"
tar -xvzf genomicData.tar.gz
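The step above allows the data to be linked rather than copied into the repository. A minimal sketch of the linking variant follows. It assumes the archive extracts to a folder named `genomicData` (not confirmed here, adjust if it differs); `link_data_dir` and the example paths are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Link an existing data directory into the HaploC-tools folder instead of copying it.
# Assumption: the extracted folder is called "genomicData"; check the archive contents first.
link_data_dir() {
    local source_dir="$1"   # where the extracted data actually lives
    local tools_dir="$2"    # path to the cloned HaploC-tools repository
    if [ ! -d "$source_dir" ]; then
        echo "Data directory not found: $source_dir" >&2
        return 1
    fi
    if [ ! -d "$tools_dir" ]; then
        echo "HaploC-tools directory not found: $tools_dir" >&2
        return 1
    fi
    # Resolve to an absolute path so the link stays valid from any working directory
    local abs_source
    abs_source="$(cd "$source_dir" && pwd)"
    # -s: symbolic link, -f: replace an existing link, -n: do not follow an existing link
    ln -sfn "$abs_source" "$tools_dir/$(basename "$abs_source")"
}

# Example (hypothetical paths):
# link_data_dir /data/shared/genomicData ./HaploC-tools
```

Linking keeps a single copy of the reference data on disk, which can help when several clones of the repository share it.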
You can also get a demo data folder to test HaploC-tools with:
wget --content-disposition "https://zenodo.org/records/10446020/files/demo_data.tar.gz?download=1"
tar -xvzf demo_data.tar.gz
The demo data can be saved outside the HaploC-tools folder and used in the following steps to run an end-to-end analysis.
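Since the demo data may live outside the repository, it can be convenient to extract it into a directory of your choice and to catch a truncated download before extraction. The following is a sketch under those assumptions; `extract_archive` is a hypothetical helper name and the destination path in the example is illustrative.

```shell
#!/usr/bin/env bash
# Verify a .tar.gz archive and extract it into a chosen destination directory.
extract_archive() {
    local archive="$1"   # path to the downloaded .tar.gz file
    local dest="$2"      # directory to extract into (created if missing)
    # tar -tzf lists the contents without extracting; a non-zero exit status
    # indicates a missing, truncated, or corrupt archive
    if ! tar -tzf "$archive" > /dev/null 2>&1; then
        echo "Archive is missing or corrupt: $archive" >&2
        return 1
    fi
    mkdir -p "$dest"
    # -C changes to the destination directory before extracting
    tar -xzf "$archive" -C "$dest"
}

# Example (hypothetical destination):
# extract_archive demo_data.tar.gz "$HOME/haploc_demo"
```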
- Create the conda environments with the specified dependencies:
conda env create -f HaploC-tools/environments/env_HapCUT2.yml # environment required for HapCUT2-based operations
conda env create -f HaploC-tools/environments/env_nHapCUT2.yml # environment required for all other operations
To speed up the procedure, consider using mamba, a drop-in replacement for conda.
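One way to act on the mamba suggestion is to use it when it is available and fall back to conda otherwise. This is a sketch only: `pick_env_tool` and `create_envs` are hypothetical helper names, and it assumes the commands are run from the directory that contains HaploC-tools.

```shell
#!/usr/bin/env bash
# Print "mamba" if it is on PATH, otherwise "conda".
pick_env_tool() {
    if command -v mamba > /dev/null 2>&1; then
        echo mamba
    else
        echo conda
    fi
}

# Create both environments with whichever tool was selected.
create_envs() {
    local tool yml
    tool="$(pick_env_tool)"
    for yml in HaploC-tools/environments/env_HapCUT2.yml \
               HaploC-tools/environments/env_nHapCUT2.yml; do
        # Stop at the first failure so a broken environment is not silently skipped
        "$tool" env create -f "$yml" || return 1
    done
}

# Example:
# create_envs
```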
- Install the required R packages under the nHapCUT2 environment:
- R.utils (>= 2.9.0),
- doParallel (>= 1.0.15),
- ape (>= 5.3),
- dendextend (>= 1.12.0),
- fitdistrplus (>= 1.0.14),
- igraph (>= 1.2.4.1),
- Matrix (>= 1.2.17),
- rARPACK (>= 0.11.0),
- factoextra (>= 1.0.5),
- data.table (>= 1.12.2),
- fields (>= 9.8.3),
- GenomicRanges (>= 1.36.0),
- ggplot2 (>= 3.3.5),
- strawr (>= 0.0.9)
Also install the CALDER R package under the nHapCUT2 environment:
install.packages("HaploC-tools/CALDER2/", repos = NULL, type = "source")
Packages can be installed and checked with the following steps:
First, activate the nHapCUT2 environment:
conda activate nHapCUT2
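Because the R packages must be installed inside nHapCUT2, it can help to confirm that the activation took effect before going further. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; `require_conda_env` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed only if the currently active conda environment matches the expected name.
require_conda_env() {
    local expected="$1"
    # conda sets CONDA_DEFAULT_ENV to the name of the active environment
    if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
        echo "Expected conda environment '$expected', found '${CONDA_DEFAULT_ENV:-none}'." >&2
        return 1
    fi
}

# Example:
# require_conda_env nHapCUT2 && echo "Environment is active, continue with the R setup."
```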
Next, execute the following in R:
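install_if_needed <- function(package, version = NULL) {
  # Install the package if it is missing or older than the requested minimum version.
  # `version` is treated as a minimum (matching the ">=" requirements listed above).
  installed <- requireNamespace(package, quietly = TRUE)
  outdated <- installed && !is.null(version) && packageVersion(package) < version
  if (!installed || outdated) {
    install.packages(package, dependencies = TRUE)
  }
}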
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
cran_packages <- c(
  "R.utils",
  "doParallel",
  "ape",
  "dendextend",
  "fitdistrplus",
  "Matrix",
  "rARPACK",
  "factoextra",
  "data.table",
  "fields",
  "ggplot2",
  "strawr"
)
## Install igraph with conda as it requires specific libraries
## (-y answers the confirmation prompt, which cannot be answered from within system())
system("conda install -y conda-forge::r-igraph")
bioconductor_packages <- c("GenomicRanges")
for (pkg in cran_packages) {
  install_if_needed(pkg)
}
for (pkg in bioconductor_packages) {
  BiocManager::install(pkg)
}
# Install calder
install.packages("./HaploC-tools/CALDER2/", repos = NULL, type = "source")
## Check
pkgs <- c(cran_packages, "igraph", bioconductor_packages, "CALDER")
check_package <- function(package) {
  # A package counts as installed if its directory is found in the library paths
  if (!nzchar(system.file(package = package))) {
    cat(sprintf("Package '%s' is not installed.\n", package))
    return(FALSE)
  }
  cat(sprintf("Package '%s' is installed. ", package))
  # Try to load it once; an installed package can still fail to load
  success <- suppressPackageStartupMessages(
    require(package, character.only = TRUE, quietly = TRUE)
  )
  if (success) {
    cat("Loading successful.\n")
  } else {
    cat("Loading failed.\n")
  }
  success
}
# Check each package
results <- sapply(pkgs, check_package)
# Print summary
cat("\nSummary of package checks:\n")
print(results)
if (all(results)) {
  print("All packages correctly installed!")
} else {
  print("Some packages can't be loaded or installed, please check.")
}
Please ensure you are in the directory containing HaploC-tools, so that CALDER can be installed from its relative path.
If all packages are installed and working properly, you should get the "All packages correctly installed!" message, and you can proceed.
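The working-directory requirement for the CALDER installation can be checked from the shell before starting R. A minimal sketch, assuming the repository layout used in the commands above (`HaploC-tools/CALDER2`); `check_calder_path` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Confirm that HaploC-tools/CALDER2 is reachable from the given base directory
# (defaults to the current directory).
check_calder_path() {
    local base_dir="${1:-.}"
    if [ -d "$base_dir/HaploC-tools/CALDER2" ]; then
        echo "Found CALDER2 sources under $base_dir/HaploC-tools"
    else
        echo "HaploC-tools/CALDER2 not found under $base_dir; change to the parent directory of HaploC-tools first." >&2
        return 1
    fi
}

# Example:
# check_calder_path && echo "Safe to run the R installation script from here."
```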