

Ilkay Yildiz edited this page Feb 4, 2020 · 24 revisions

Batch Mode

Overview

To submit multiple non-interactive jobs simultaneously, it is best to use sbatch. To do so, first create an sbatch script, and then execute it with the sbatch command.

Creating a script

An sbatch script is a file that has the following form:

#!/bin/bash

#SBATCH (... sbatch options) 

module load (... load modules that you wish to use when running the script)

srun (...run main job in the script)

Lines in this script preceded by #SBATCH set options that will always apply when this script is run. Examples include:

#SBATCH --job-name=my_nice_job                    # sets the job name
#SBATCH --exclusive                               # reserves a machine for exclusive use
#SBATCH --nodes=5                                 # reserves 5 machines
#SBATCH --ntasks-per-node=2                       # sets 2 tasks for each machine
#SBATCH --cpus-per-task=1                         # sets 1 core for each task
#SBATCH --mem=100Gb                               # reserves 100 GB memory
#SBATCH --partition=my_partition                  # requests that the job is executed in partition my_partition
#SBATCH --time=4:00:00                            # reserves machines/cores for 4 hours.
#SBATCH --output=my_nice_job.%j.out               # stores the standard output in file my_nice_job.%j.out, where %j is the job id
#SBATCH --error=my_nice_job.%j.err                # stores the standard error in file my_nice_job.%j.err, where %j is the job id
#SBATCH --exclude=c0100                           # excluding node c0100
#SBATCH --gres=gpu:1                              # reserves 1 gpu per machine
#SBATCH --constraint="E5-2690v3@2.60GHz"          # only consider machines that have the Intel E5-2690v3 chip
#SBATCH --nodelist=c0[100-200]                    # request the specific machines c0100-c0200

Caution: ALWAYS set --mem! Even if you reserve a node exclusively, if your job consumes more than the --mem value (which is only a few megabytes by default), SLURM kills your job. More options can be found by typing man sbatch.
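Putting these pieces together, a complete script might look like the following sketch. The partition name, module, and program are placeholders; replace them with whatever applies on your cluster.

```shell
#!/bin/bash
#SBATCH --job-name=my_nice_job                # job name shown by squeue
#SBATCH --nodes=1                             # one machine
#SBATCH --ntasks-per-node=1                   # one task on that machine
#SBATCH --cpus-per-task=1                     # one core for the task
#SBATCH --mem=10Gb                            # always set --mem (see the caution above)
#SBATCH --time=1:00:00                        # one hour
#SBATCH --partition=my_partition              # placeholder partition name
#SBATCH --output=my_nice_job.%j.out           # standard output, %j = job id
#SBATCH --error=my_nice_job.%j.err            # standard error, %j = job id

module load python/2.7.15

srun python my_program.py
```

This is a configuration sketch rather than something you can run off the cluster; submit it with sbatch as described below.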

Finally, the job (or jobs, if there are several) to be executed can be listed at the end of the script, each preceded by srun. For example, to run a python program called my_program.py you need to add:

srun python my_program.py

Running your script

Once you have created your script, you can run it by calling

sbatch myscript

This submits your script to the default partition (or the partition specified in the script through an #SBATCH --partition setting). Any parameter that can be set inside the script can also be set by passing it directly to sbatch. For example:

sbatch --partition=short --nodelist=c0172 --cpus-per-task=1 myscript

executes the script on c0172, reserving only one core, while

sbatch --partition=short --nodelist=c0172 --exclusive myscript

reserves the node for exclusive use (no other job can run on c0172 at the same time).

Tip: The examples above illustrate that options specified in an sbatch script can also be passed outside the script when calling sbatch. If you expect that all executions of the script will use the same parameter (e.g., partition, name, number of cores etc), write these directly in the script. If you expect these to vary from one execution to the next, leave these to be determined outside the script, once you run sbatch. In any case, options passed to sbatch from the command line override options inside the sbatch script.

Monitoring and terminating jobs

To monitor whether your script was submitted successfully, you can run:

squeue -u $USER 

where $USER expands to your username. This will show information about your job, including its job id, whether it is pending (PD) or running (R), as well as the machine it is running on. You can terminate your job by typing:

scancel <jobid>

where <jobid> is the job id shown by squeue.

To release a job that has been re-queued in a held state:

scontrol release <jobid>

To monitor the computational efficiency of your job, you can run:

seff <job id>

Tip: On most partitions, jobs submitted have a time limit of 24 hours.

Examples

Example 1

Suppose that you have written a python program called my_program.py that reads a text file (say myfile) provided on the command line, removes all spaces, and prints it to the standard output. Normally (e.g., in interactive mode) you would execute this as:

srun --partition=my_partition --mem=10Gb python my_program.py myfile

and the result would be printed right below.

You have 100 files in a directory called input/, and would like all of them to be processed by this program in parallel. To do so, you can create the following script, called, e.g., my_script:

#!/bin/bash
#SBATCH --job-name=my_script
#SBATCH --cpus-per-task=1
#SBATCH --mem=10Gb
#SBATCH --output=my_script.%j.out
#SBATCH --error=my_script.%j.err

module load python/2.7.15

srun python my_program.py $1

The $1 above refers to the first command line argument passed to the script. Then, calling:

sbatch --partition=my_partition my_script my_file

will execute my_script over my_file on a machine in partition my_partition. This will occupy exactly one core (due to the --cpus-per-task=1 option), and anything printed to the standard output or standard error will be directed to the files specified by the #SBATCH directives. In particular, if the job id is 27773, the text without spaces will be stored in my_script.27773.out.

Processing all files in directory input in parallel can be done by using a bash for loop as follows:

for file in `ls input`; do sbatch --partition=my_partition my_script $file; done
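The one-liner above can break if file names contain spaces. A more robust sketch wraps the loop in a small function and, for illustration, uses echo as a dry-run stand-in for sbatch (pass sbatch as the second argument on the cluster; my_partition and my_script are the same placeholders as above):

```shell
# Sketch: submit one job per file in a directory, handling names with spaces.
submit_all() {
    local dir=$1
    local submit_cmd=${2:-echo}   # echo = dry run; pass sbatch on the cluster
    local file
    for file in "$dir"/*; do
        "$submit_cmd" --partition=my_partition my_script "$file"
    done
}

# Dry run: prints one submission command per file in input/.
submit_all input
```

Quoting "$file" and using a glob instead of parsing ls output is what makes this safe for unusual file names.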

Example 2

Suppose you want to calculate the function value f(alpha, beta, gamma) for alpha ranging from 0 to 10, beta taking one of the values 0.001, 0.004, or 0.007, and gamma ranging from 0 to 10. You can create a bash script that performs all these computations in parallel. The main bash script, called main.bash, looks like this:

#!/bin/bash
for alpha in `seq 0 1 10`
do 
    for beta in "0.001" "0.004" "0.007" 
    do 
        for gamma in `seq 0 1 10`
        do
            work=/scratch/username/file/
            cd $work
            sbatch execute.bash $alpha $beta $gamma
        done
    done
done 

main.bash calls execute.bash, which is the following sbatch script:

#!/bin/bash
#set a job name 
#SBATCH --job-name=run1
#a file for job output, you can check job progress
#SBATCH --output=run1_%j.out
# a file for errors from the job
#SBATCH --error=run1_%j.err
#time you think you need; default is one day
#in hh:mm:ss format
#SBATCH --time=24:00:00
#number of cores you are requesting 
#SBATCH --cpus-per-task=1
#memory you are requesting
#SBATCH --mem=10Gb
#partition to use 
#SBATCH --partition=short

module load python-2.7.5

srun python main.py $1 $2 $3

According to the Discovery cluster usage policy, the total number of jobs per user at any one time on a public partition (such as short) is 100; you should not exceed this limit. There is no job limit on private faculty partitions. On public nodes, the execution time limit is 24 hours; on private faculty nodes there is no time limit.
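When sweeping many parameter combinations, SLURM job arrays are an alternative to submitting jobs in a loop: a single sbatch call creates many related tasks, which also makes it easier to keep track of how many jobs you have pending. The sketch below is hypothetical (the file name sweep.bash and the mapping of the array index to alpha are illustrative, not part of the example above):

```shell
#!/bin/bash
#SBATCH --job-name=sweep
#SBATCH --partition=short
#SBATCH --cpus-per-task=1
#SBATCH --mem=10Gb
#SBATCH --array=0-10          # creates 11 tasks, SLURM_ARRAY_TASK_ID = 0..10

# Each array task sees its own index in SLURM_ARRAY_TASK_ID; here it is
# used directly as alpha, with beta and gamma fixed for illustration.
srun python main.py "$SLURM_ARRAY_TASK_ID" 0.001 0
```

Submitting this once with `sbatch sweep.bash` launches all 11 tasks; this is a configuration sketch and only runs under SLURM.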

An example python file main.py is:

import sys


def f(a,b,c):
    return a+b+c

if __name__=="__main__":
    alpha= float(sys.argv[1])
    beta=float(sys.argv[2])
    gamma=float(sys.argv[3])

    print alpha,'+',beta,'+',gamma,'=',f(alpha,beta,gamma)
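main.py above uses the Python 2 print statement, matching the python-2.7.5 module loaded in execute.bash. If your cluster's module is Python 3, a hypothetical equivalent would be (the argument-count check is only so the file can also be imported or run without arguments):

```python
import sys


def f(a, b, c):
    # Toy function: replace with your real computation.
    return a + b + c


if __name__ == "__main__" and len(sys.argv) >= 4:
    alpha = float(sys.argv[1])
    beta = float(sys.argv[2])
    gamma = float(sys.argv[3])
    print(alpha, '+', beta, '+', gamma, '=', f(alpha, beta, gamma))
```

Remember to load a Python 3 module in execute.bash if you use this version.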

You can submit these batch jobs by running:

./main.bash

Additional Resources

More examples can be found here. To learn more about general bash scripts you can have a look at this tutorial. More on sbatch can be found here.