From ee7f15cffe98346df583618a29027792da2b7e02 Mon Sep 17 00:00:00 2001
From: danilotat
Date: Sun, 24 Nov 2024 18:29:26 +0100
Subject: [PATCH] enh: snakemake >8.0.0 execution on hpc

---
 docs/hpc.md | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/docs/hpc.md b/docs/hpc.md
index ef5580a..ed338b7 100644
--- a/docs/hpc.md
+++ b/docs/hpc.md
@@ -1,6 +1,32 @@
 # Run on HPC
 
-ENEO was developed and tested in High Performances Computing (HPC) clusters with the SLURM workload manager. Even if Snakemake introduced plugins in version `>8.0`, still the preferred way to launch the workload in SLURM is using a defined profile.
+ENEO was developed and tested on High Performance Computing (HPC) clusters with the SLURM workload manager. Snakemake substantially changed job submission and handling with the major update introduced in version 8.0.0. There is currently more than one way to submit jobs with Snakemake, but the most effective seems to be the `cluster-generic` plugin.
+
+If you're using Snakemake > 8.0.0, install the cluster-generic plugin using pip:
+
+```
+pip install snakemake-executor-plugin-cluster-generic
+```
+
+Then, inside the folder `workflow/profile`, you'll find two configuration files for each supported scheduler (SLURM/SGE): one with the string `v8` in its name, used by Snakemake >8.0.0, and a legacy `config.yaml` for older versions.
+
+!!! tip
+
+    The following notes show examples using the legacy config file. However, the relevant edits are the same!
+
+## Singularity args
+
+Two rules of the workflow (variant annotation and pMHC binding affinity estimation) depend on Singularity containers. It's key to ensure that all the relevant folders are readable and writable within each container. For this reason, multiple folders must be mounted, as Snakemake is *lazy* in assigning mountpoints.
+
+Populate the last entry of the config file, `singularity-args`, adding the absolute path for:
+
+ - the resources directory
+ - the temporary directory
+ - the output directory
+ - the workflow directory
+
+Additionally, you have to set the TMPDIR environment variable to the temporary directory, to avoid write permission errors in the last step.
+
 
 ## SLURM