Please create a training account by signing up at the following link: https://iris.nersc.gov/train
Make sure to record the username and password generated at the end of the sign-up process.
Confirm that the login works by opening a terminal and logging in to Perlmutter with:
ssh <username>@perlmutter.nersc.gov
Use this image: ghcr.io/1tnguyen/cuda-quantum:mpich-231710
Pull the image using:
shifterimg -v pull <image>
Note that this image has been configured to use the CUDA-aware Cray MPICH on Perlmutter.
To confirm that the image is available:
shifterimg images | grep -i "cuda-quantum"
To request an interactive allocation from the command line (1 node with 4 GPUs, 1 task per GPU, with all GPUs visible to each task):
salloc -N 1 --gpus-per-task=1 --ntasks-per-node=4 --gpu-bind=none -t 120 --qos=interactive -A <project_name> -C gpu --module=cuda-mpich --image=<image>
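The salloc line above has several placeholders, so it can help to assemble it from variables and inspect it before running. The following is a minimal sketch only: the variable names and the project value `my_project` are hypothetical, and the command is printed rather than executed.

```shell
#!/bin/bash
# Sketch: build the interactive allocation command from variables.
# PROJECT is a hypothetical placeholder; replace it with your own project name.
IMAGE="ghcr.io/1tnguyen/cuda-quantum:mpich-231710"  # image named earlier in this guide
PROJECT="my_project"                                # assumption, not a real project
NODES=1
TASKS_PER_NODE=4                                    # one task per GPU, 4 GPUs per node

SALLOC_CMD="salloc -N ${NODES} --gpus-per-task=1 --ntasks-per-node=${TASKS_PER_NODE} --gpu-bind=none -t 120 --qos=interactive -A ${PROJECT} -C gpu --module=cuda-mpich --image=${IMAGE}"

# Print the command so it can be checked, then copied into the terminal.
echo "${SALLOC_CMD}"
```

Printing first is a deliberate choice here: a mistyped project name or image tag is easier to spot in the echoed command than in a rejected allocation request.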
Once the allocation is granted, you should be placed in the directory from which you launched the job.
To run with a single GPU:
shifter python <filename> --target nvidia
To run with `--target nvidia-mgpu` on 1 node and 4 GPUs:
srun -N 1 -n 4 shifter python <filename> --target nvidia-mgpu
Use this script to run a multi-node, multi-GPU simulation.
To run with a single GPU, replace the srun line with:
shifter python ghz.py --target nvidia
Finally, to submit your job to the queue, use:
sbatch <multinode_script>
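The multi-node script referred to above is not reproduced in this text. As a rough illustration of what such a batch script might look like, the sketch below carries the flags from the interactive salloc command over into #SBATCH directives and writes them to a file. The file name `multinode.sh`, the node count of 2, the 30-minute time limit, and `--qos=regular` are all assumptions; `<project_name>` is left as a placeholder to fill in.

```shell
#!/bin/bash
# Sketch: write a hypothetical 2-node batch script to multinode.sh.
# This is NOT the script referenced in the guide; node count, time limit,
# and QOS are assumptions chosen for illustration.
cat > multinode.sh <<'EOF'
#!/bin/bash
#SBATCH -N 2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=none
#SBATCH -t 00:30:00
#SBATCH --qos=regular
#SBATCH -A <project_name>
#SBATCH -C gpu
#SBATCH --module=cuda-mpich
#SBATCH --image=ghcr.io/1tnguyen/cuda-quantum:mpich-231710

# 2 nodes x 4 tasks per node = 8 tasks, one per GPU.
srun -N 2 -n 8 shifter python ghz.py --target nvidia-mgpu
EOF

# After replacing <project_name>, submit with: sbatch multinode.sh
```

The total task count passed to srun is the number of nodes times the tasks per node, so changing the node count means updating both `-N` and `-n`.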