Change Parallel lesson (#118)
* Removed unneeded second R script

* Renamed test script to 'sum_matrix' to avoid confusion with job arrays. Also made the script work with MPI.

* Made naming consistent.
CallumWalley authored Oct 6, 2024
1 parent 63a2fe8 commit cba1c13
Showing 20 changed files with 137 additions and 103 deletions.
2 changes: 1 addition & 1 deletion SETUP_FOR_INSTRUCTORS.md
@@ -10,7 +10,7 @@ Get files.

```
wget https://raw.githubusercontent.com/nesi/hpc-intro/gh-pages-nesi/_includes/example_scripts/example-job.sh
wget https://raw.githubusercontent.com/nesi/hpc-intro/gh-pages-nesi/_includes/example_scripts/array_sum2.r -O {{ site.example.script }}
wget https://raw.githubusercontent.com/nesi/hpc-intro/gh-pages-nesi/_includes/example_scripts/sum_matrix.r -O {{ site.example.script }}
wget https://raw.githubusercontent.com/nesi/hpc-intro/gh-pages-nesi/_includes/example_scripts/whothis.sh
wget
wget
3 changes: 2 additions & 1 deletion _config.yml
@@ -67,7 +67,7 @@ sched:
example:
lang: "R"
shell: "Rscript "
script: "array_sum.r"
script: "sum_matrix.r"
module: "R/4.3.1-gimkl-2022a"

# For 'python'
@@ -89,6 +89,7 @@ episode_order:
- 09-scaling



#------------------------------------------------------------
# Values for this lesson
#------------------------------------------------------------
43 changes: 33 additions & 10 deletions _episodes/064-parallel.md
@@ -53,7 +53,7 @@ This means that all CPUs must be on the same node, most Mahuika nodes have 72 CP
Shared memory parallelism is what is used in our example script `{{ site.example.script }}`.

The number of threads to use is specified by the Slurm option `--cpus-per-task`.

<!--
> ## Shared Memory Example
>
> Create a new script called `smp-job.sl`
@@ -98,7 +98,7 @@ Number of threads to use is specified by the Slurm option `--cpus-per-task`.
> >
> > {: .output}
> {: .solution}
{: .challenge}
{: .challenge} -->
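
For reference, a minimal shared-memory submission script (a sketch mirroring `_includes/example_scripts/example_smp.sl`) might look like this:

```
#!/bin/bash -e

#SBATCH --job-name      smp
#SBATCH --account       nesi99991
#SBATCH --output        %x.out
#SBATCH --mem-per-cpu   500
#SBATCH --cpus-per-task 8

module purge
module load R/4.3.1-gimkl-2022a
Rscript sum_matrix.r
echo "Done!"
```
{: .language-bash}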

### Distributed-Memory (MPI)

@@ -114,7 +114,7 @@ Number of tasks to use is specified by the Slurm option `--ntasks`, because the

Tasks cannot share cores; this means that in most circumstances leaving `--cpus-per-task` unspecified will get you `2`.

> ## Distributed Memory Example
<!-- > ## Distributed Memory Example
>
> Create a new script called `mpi-job.sl`
>
@@ -153,10 +153,10 @@ Tasks cannot share cores, this means in most circumstances leaving `--cpus-per-t
> > ```
> > {: .output}
> {: .solution}
{: .challenge}
{: .challenge} -->
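
For reference, a minimal distributed-memory submission script (a sketch mirroring `_includes/example_scripts/example_dmp.sl`) might look like this; note the `srun` prefix, which launches one copy of the command per task:

```
#!/bin/bash -e

#SBATCH --job-name    dmp_job
#SBATCH --output      %x.out
#SBATCH --mem-per-cpu 500
#SBATCH --ntasks      4

module purge
module load R/4.3.1-gimkl-2022a
srun Rscript sum_matrix.r
echo "Done!"
```
{: .language-bash}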

Using a combination of Shared and Distributed memory is called _Hybrid Parallel_.

<!--
> ## Hybrid Example
>
> Create a new script called `hybrid-job.sl`
@@ -193,7 +193,7 @@ Using a combination of Shared and Distributed memory is called _Hybrid Parallel_
> > ```
> > {: .output}
> {: .solution}
{: .challenge}
{: .challenge} -->
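
For reference, a hybrid submission script might look like the sketch below, based on `_includes/example_scripts/example_hybrid.sl` (the job name, output and memory directives are illustrative, since only the tail of that file is shown here):

```
#!/bin/bash -e

#SBATCH --job-name      hybrid_job   # illustrative name
#SBATCH --output        %x.out       # illustrative
#SBATCH --mem-per-cpu   500          # illustrative
#SBATCH --ntasks        2
#SBATCH --cpus-per-task 4

module purge
module load R/4.3.1-gimkl-2022a
srun Rscript sum_matrix.r
echo "Done!"
```
{: .language-bash}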

### GPGPUs

@@ -209,7 +209,7 @@ GPUs can be requested using `--gpus-per-node=<gpu_type>:<gpu_number>`

Depending on the GPU type, we *may* also need to specify a partition using `--partition`.

> ## GPU Job Example
<!-- > ## GPU Job Example
>
> Create a new script called `gpu-job.sl`
>
@@ -266,7 +266,7 @@ Depending on the GPU type, we *may* also need to specify a partition using `--pa
> > ```
> > {: .output}
> {: .solution}
{: .challenge}
{: .challenge} -->
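
As an illustration only (no GPU example script is included here), a GPU job script could look something like the sketch below; the GPU type, partition name and CUDA module are assumptions and will differ between clusters:

```
#!/bin/bash -e

#SBATCH --job-name      gpu_job
#SBATCH --account       nesi99991
#SBATCH --output        %x.out
#SBATCH --mem-per-cpu   2G
#SBATCH --gpus-per-node P100:1   # GPU type and count are assumptions
#SBATCH --partition     gpu      # partition name is an assumption; only some GPU types need it

module purge
module load CUDA                 # module name is an assumption
nvidia-smi                       # report the GPU(s) allocated to the job
echo "Done!"
```
{: .language-bash}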

### Job Array

@@ -282,7 +282,7 @@ A job array can be specified using `--array`

If you are writing your own code, then this is something you will probably have to specify yourself.

> ## Job Array Example
<!-- > ## Job Array Example
>
> Create a new script called `array-job.sl`
>
@@ -331,7 +331,30 @@ If you are writing your own code, then this is something you will probably have
> > ```
> > {: .output}
> {: .solution}
{: .challenge}
{: .challenge} -->
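
For reference, a job-array submission script might look like the sketch below, based on `_includes/example_scripts/example_jobarray.sl` (the job name and output pattern are illustrative, since only the tail of that file is shown here). Each array task receives its own `SLURM_ARRAY_TASK_ID`, which `sum_matrix.r` uses as its random seed:

```
#!/bin/bash -e

#SBATCH --job-name    array_job    # illustrative name
#SBATCH --output      %x_%a.out    # illustrative; %a is the array task index
#SBATCH --mem-per-cpu 500
#SBATCH --array       0-3

module purge
module load R/4.3.1-gimkl-2022a
Rscript sum_matrix.r
echo "Done!"
```
{: .language-bash}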

## Summary

| Name | Other Names | Slurm Option | Pros/cons |
| - | - | - | - |
| Shared Memory Parallelism | Multithreading, Multiprocessing | `--cpus-per-task` | |
| Distributed Memory Parallelism | MPI, OpenMPI | `--ntasks` and add `srun` before the command | |
| Hybrid | | `--ntasks` and `--cpus-per-task` and add `srun` before the command | |
| Job Array | | `--array` | |
| General Purpose GPU | | `--gpus-per-node` | |

> ## Running a Parallel Job
>
> Pick one of the methods of parallelism mentioned above and modify your `example.sl` script to use it.
>
>
>
> > ## Solution
> >
> > What does the printout at the start of your job say about the number and location of the nodes used?
> > {: .output}
> {: .solution}
{: .challenge}
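
One way to check afterwards how many nodes and CPUs a finished job was allocated (assuming `sacct` is available, as it usually is on Slurm clusters):

```
# Replace <jobid> with the ID reported by sbatch
sacct -j <jobid> --format=JobID,JobName,NNodes,NodeList,AllocCPUS,State
```
{: .language-bash}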

## How to Utilise Multiple CPUs

2 changes: 1 addition & 1 deletion _episodes/095-writing-good-code.md
@@ -210,7 +210,7 @@ set.seed(seed)
Now your script should look something like this:

```
{% include example_scripts/array_sum2.r %}
{% include example_scripts/sum_matrix.r %}
```
{: .language-r}

19 changes: 0 additions & 19 deletions _includes/example_scripts/array_sum.r

This file was deleted.

28 changes: 0 additions & 28 deletions _includes/example_scripts/array_sum2.r

This file was deleted.

6 changes: 0 additions & 6 deletions _includes/example_scripts/example-job.sh

This file was deleted.

11 changes: 0 additions & 11 deletions _includes/example_scripts/example-job.sl.1

This file was deleted.

11 changes: 11 additions & 0 deletions _includes/example_scripts/example_dmp.sl
@@ -0,0 +1,11 @@
#!/bin/bash -e

#SBATCH --job-name dmp_job
#SBATCH --output %x.out
#SBATCH --mem-per-cpu 500
#SBATCH --ntasks 4

module purge
module load R/4.3.1-gimkl-2022a
srun Rscript sum_matrix.r
echo "Done!"
5 changes: 4 additions & 1 deletion _includes/example_scripts/example_hybrid.sl
@@ -7,4 +7,7 @@
#SBATCH --ntasks 2
#SBATCH --cpus-per-task 4

srun echo "I am task #${SLURM_PROCID} running on node '$(hostname)' with $(nproc) CPUs"
module purge
module load R/4.3.1-gimkl-2022a
srun Rscript sum_matrix.r
echo "Done!"
6 changes: 6 additions & 0 deletions _includes/example_scripts/example_job.sh
@@ -0,0 +1,6 @@
#!/bin/bash -e

module purge
module load R/4.3.1-gimkl-2022a
Rscript sum_matrix.r
echo "Done!"
11 changes: 11 additions & 0 deletions _includes/example_scripts/example_job.sl.1
@@ -0,0 +1,11 @@
#!/bin/bash -e

#SBATCH --job-name example_job
#SBATCH --account nesi99991
#SBATCH --mem 300M
#SBATCH --time 00:15:00

module purge
module load R/4.3.1-gimkl-2022a
Rscript sum_matrix.r
echo "Done!"
@@ -1,12 +1,12 @@
#!/bin/bash -e

#SBATCH --job-name my_job
#SBATCH --job-name example_job
#SBATCH --account nesi99991
#SBATCH --mem 300M
#SBATCH --time 00:15:00
#SBATCH --cpus-per-task 4

module purge
module load R/4.3.1-gimkl-2022a
{{ site.example.shell }} {{ site.example.script }}
Rscript sum_matrix.r
echo "Done!"
@@ -1,12 +1,12 @@
#!/bin/bash -e

#SBATCH --job-name my_job
#SBATCH --job-name example_job
#SBATCH --account nesi99991
#SBATCH --mem 600M
#SBATCH --time 00:10:00
#SBATCH --cpus-per-task 4

module purge
module load R/4.3.1-gimkl-2022a
{{ site.example.shell }} {{ site.example.script }}
Rscript sum_matrix.r
echo "Done!"
5 changes: 4 additions & 1 deletion _includes/example_scripts/example_jobarray.sl
@@ -6,4 +6,7 @@
#SBATCH --mem-per-cpu 500
#SBATCH --array 0-3

srun echo "I am task #${SLURM_PROCID} running on node '$(hostname)' with $(nproc) CPUs"
module purge
module load R/4.3.1-gimkl-2022a
Rscript sum_matrix.r
echo "Done!"
8 changes: 0 additions & 8 deletions _includes/example_scripts/example_mpi.sl

This file was deleted.

7 changes: 5 additions & 2 deletions _includes/example_scripts/example_smp.sl
@@ -1,9 +1,12 @@
#!/bin/bash -e

#SBATCH --job-name smp_job
#SBATCH --job-name smp
#SBATCH --account nesi99991
#SBATCH --output %x.out
#SBATCH --mem-per-cpu 500
#SBATCH --cpus-per-task 8

echo "I am task #${SLURM_PROCID} running on node '$(hostname)' with $(nproc) CPUs"
module purge
module load R/4.3.1-gimkl-2022a
Rscript sum_matrix.r
echo "Done!"
10 changes: 0 additions & 10 deletions _includes/example_scripts/shared-mem-job.sl

This file was deleted.

File renamed without changes.
55 changes: 55 additions & 0 deletions _includes/example_scripts/sum_matrix.r
@@ -0,0 +1,55 @@
#!/usr/bin/env Rscript


# Function for shared-memory execution of a single task
doTask <- function(size_x, size_y, seed, print_progress){
  suppressPackageStartupMessages(library(doParallel))

  message(sprintf("Summing [ %e x %e ] matrix, seed = '%i'", size_x, size_y, seed))
  message(sprintf("Running on '%s' with %i CPU(s).", Sys.info()["nodename"], num_cpus))

  set.seed(seed)

  # Use one worker per two allocated logical CPUs (and at least one worker)
  registerDoParallel(max(1, floor(num_cpus / 2)))

  results_all <- foreach(z = 0:size_x) %dopar% {
    percent_complete <- z * 100 / size_x
    if (print_progress && percent_complete %% 1 == 0){
      message(sprintf(" %i%% done...\r", percent_complete))
    }
    sum(rnorm(size_y))
  }
  Reduce("+", results_all)
}

# Read the task geometry from the Slurm environment (defaults allow running outside Slurm)
ntasks <- strtoi(Sys.getenv('SLURM_NTASKS', unset = "1"))
seed <- strtoi(Sys.getenv('SLURM_ARRAY_TASK_ID', unset = "0"))
num_cpus <- strtoi(Sys.getenv('SLURM_CPUS_PER_TASK', unset = "1"))

size_x <- 60000 # Increasing this increases memory use
size_y <- 40000 # Increasing this increases run time

# Time = (size_x/n) * size_y + c
# Mem = (size_x * n) * c1 + size_y * c2

print_progress <- TRUE
# print_progress <- interactive() # Whether to print progress or not.

# If more than one task, distribute the work over an MPI cluster (snow/doSNOW)
if (ntasks > 1){
  suppressPackageStartupMessages(library(doSNOW))
  cl <- makeMPIcluster(outfile = "")
  registerDoSNOW(cl)  # register the cluster so %dopar% runs across the MPI workers

  results_all <- foreach(z = 1:ntasks) %dopar% {
    doTask(size_x, ceiling(size_y / ntasks), z + seed, print_progress)
  }

  results <- Reduce("+", results_all)
  stopCluster(cl)
  message(sprintf("Sums to %f", results))
} else {
  results <- doTask(size_x, size_y, seed, print_progress)
  message(sprintf("Sums to %f", results))
}
