
Discovery Cluster Partitions

General Purpose Partitions

In the Discovery environment, collections of compute nodes are organized into “partitions”. Always submit jobs to a specific partition using sbatch or srun. Information on Discovery Cluster partitions can be found here. Current usage and node availability can be displayed by running sinfo; the NODELIST column of its output lists the names of the corresponding machines. The names of the public partitions are:

- debug
- express
- short
- long
- gpu
- multigpu
- large
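For example, a quick way to check current availability in a given partition (short is used here purely as an example) is:

sinfo --partition=short

For per-node details (state, CPU count, memory) in the same partition, sinfo also accepts the standard --Node and --long flags:

sinfo --partition=short --Node --long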

A summary of a few commonly used partitions is below:

| Partition Name | Old Name |
| --- | --- |
| debug, express, short, long, large | general, infiniband, ser-par-10g-2, ser-par-10g-3, ser-par-10g-4, ht-10g, largemem-10g, interactive-10g |

Timing, memory, and core limits for commonly used partitions are as follows:

| Partition Name | Requires Approval? | Time Limit (default/max) | Core Limit | RAM Limit | Running Jobs (default/max) | Submitted Jobs (default/max) |
| --- | --- | --- | --- | --- | --- | --- |
| debug | No | 20min/20min | 128 | 256GB | 10/25 | 25/100 |
| express | No | 30min/1h | 2048 | 25TB | 50/250 | 250/1000 |
| short | No | 4h/24h | 1024 | 25TB | 50/500 | 100/1000 |
| long | Yes | 1day/5days | 1024 | 25TB | 25/250 | 50/500 |
| large | Yes | 6h/6h | N/A | N/A | 100/100 | 100/1000 |
| gpu | No | 4h/8h | N/A | N/A | 25/250 | 50/1000 |
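For reference, a minimal batch script that stays within the short partition's limits might look roughly as follows (the job name, time, memory values, and executable are illustrative placeholders, not site-mandated values):

#!/bin/bash
#SBATCH --partition=short
#SBATCH --time=04:00:00              # short allows 4h by default, 24h maximum
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G                    # memory per node
#SBATCH --job-name=example_job
#SBATCH --output=example_job.%j.out
srun ./my_program                    # hypothetical executable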

These partitions contain several different types of machines, which can be targeted with the appropriate --constraint flag (see Using Constraints):

| Partition Name | CPU | Frequency | Core Number | Memory | Available Constraint |
| --- | --- | --- | --- | --- | --- |
| debug, express, short, long, large | Dual Intel Xeon E5-2650 | 2.00GHz | 16 | 128GB | E5-2650@2.00GHz |
| | Dual Intel Xeon E5-2680 v2 | 2.80GHz | 20 | 128GB | E5-2680v2@2.80GHz |
| | Dual Intel Xeon E5-2690 v3 | 2.60GHz | 24 | 128GB | E5-2690v3@2.60GHz |

Constraints of the gpu partition are as follows:

| Partition Name | CPU + GPU | Frequency | Core Number | Memory | Available Constraint |
| --- | --- | --- | --- | --- | --- |
| gpu | Dual Intel Xeon E5-2650 + one K20m NVIDIA GPU (23 nodes) | 2.00GHz | 16 | 128GB | E5-2650@2.00GHz |
| | Dual Intel Xeon E5-2690v3 + one K40m NVIDIA GPU (16 nodes) | 2.60GHz | 24 | 128GB | E5-2690v3@2.60GHz |
| | Dual Intel Xeon E5-2680v4 + 8 K80 NVIDIA GPUs (8 nodes) | 2.40GHz | 28 | 256GB | E5-2680v4@2.40GHz |
| | Dual Intel Xeon E5-2680v4 + 4 P100 NVIDIA GPUs (12 nodes) | 2.40GHz | 28 | 256GB | E5-2680v4@2.40GHz |
| | Intel Gold 6132 + 4 V100-SXM2 NVIDIA GPUs (24 nodes) | 2.60GHz | N/A | N/A | N/A |
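As a sketch, a job targeting the K40m nodes could combine the gpu partition, the corresponding CPU constraint from the table above, and a standard Slurm GPU request (whether typed gres names such as gpu:k40m are defined on Discovery is not confirmed here, so only a GPU count is requested):

#SBATCH --partition=gpu
#SBATCH --constraint=E5-2690v3@2.60GHz   # the K40m nodes, per the table above
#SBATCH --gres=gpu:1                     # request one GPU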

Approval Access Partitions

In order to ensure fair distribution of resources, access to the long, large, and multigpu partitions is restricted to researchers who have demonstrated their need to use these partitions.

To request access to the long partition, open a general Research Computing ServiceNow ticket detailing why you need access to the long partition. IT will be in contact with you and will require that you meet with a member of our staff for a consultation. Be prepared to share your code and an example of previous jobs that you’ve attempted to run. Note that if your code is easily checkpointed, you are not a good candidate for using the long partition.

To request access to the multigpu partition, download and complete all parts of the multigpu access form located here. Attach this form to a general Research Computing ServiceNow ticket. Your request will be reviewed by two faculty members, and you will be notified of your application’s acceptance or rejection through the ServiceNow ticket that you submitted.

To request access to the large partition, download and complete all parts of the large partition access form located here. Attach this form to a general Research Computing ServiceNow ticket. Your request will be reviewed by a faculty member, and you will be notified of your application’s acceptance or rejection through the ServiceNow ticket that you submitted.

Dedicated Access Partitions

Several partitions are owned by ECE faculty and are restricted: they may be used only after obtaining explicit access from the respective owners. Information on these partitions can be found below:

| Partition Name | Machines | Name Range | Old Name | CPUs per Machine | RAM | CPU | Constraint |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ioannidis | 8 | c[3096-3103] | ioannidis1 | 40 | 128GB | Intel Xeon CPU E5-2680v2 2.8GHz | E5-2680v2@2.80GHz |
| | 8 | c[3120-3127] | ioannidis2 | 56 | 500GB | Intel Xeon CPU E5-2680v4 2.4GHz | E5-2680v4@2.40GHz |
| danabrooks | 1 | c[4021] | danabrooks | 96 | 256GB | Intel Xeon CPU E7-4830v3 2.8GHz | E7-4830v3@2.8GHz |

Using Constraints

Because a partition may contain several different CPU configurations, additional arguments must be passed to target specific machines, under either sbatch (i.e., in batch mode) or srun (i.e., in interactive mode).

For example, to submit a job to a Dual Intel Xeon E5-2650 machine in the short partition, invoke sbatch with the following arguments:

#SBATCH --partition=short
#SBATCH --constraint=E5-2650@2.00GHz
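The equivalent interactive request via srun would look roughly as follows (a sketch; the shell path is an assumption):

srun --partition=short --constraint=E5-2650@2.00GHz --pty /bin/bash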

As another example, if you want to use nodes in the old ioannidis1 partition, you should add the following constraints:

#SBATCH --partition=ioannidis
#SBATCH --constraint=E5-2680v2@2.80GHz

Appropriate constraints are listed in the tables above. Note that an alternative way of reaching specific nodes is by name. For example, a job can be submitted to the specific node c3096 (formerly part of ioannidis1), presuming c3096 is idle, via:

#SBATCH --partition=ioannidis
#SBATCH -w c3096

This is useful if you are trying to ensure that your experiments run on the same machine.

Tip: see also the --nodelist option for submitting jobs to a specific set of machines (an example follows below).
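For instance, a sketch of requesting a specific set of nodes by name (the node range below comes from the ioannidis table above and is purely illustrative):

#SBATCH --partition=ioannidis
#SBATCH --nodelist=c[3096-3098]   # long form of -w; accepts comma-separated lists and bracket ranges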

Retrieving a Node's Hardware Configuration

To get the CPU, memory, etc., configuration of all nodes with a single command, type

grep Feature /shared/centos7/etc/slurm/nodes.conf 

from any node, including the gateway.

Alternatively, log in to a node in interactive mode and type:

lscpu

This will show you information about that node specifically.
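If you only need the recorded specification of a single node, the standard Slurm command scontrol can also be used from any node; for example (c3096 is just an example node taken from the tables above):

scontrol show node c3096

This prints, among other fields, the node's CPU count, real memory, and its feature list (the constraints listed above).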