Slurm Commands

Read:

Slurm is MSI’s job scheduling system. It is responsible for managing the allocation of computing resources among MSI users and user groups. In the CDNI, most data processing and analysis tasks are executed via Slurm job submissions, in the form of batch scripts submitted with the sbatch command. A batch script is a text file that specifies both the resources (e.g. CPUs, RAM, time) and the commands to be run for a processing job. Slurm jobs can be used for any process that requires more computing resources than are available within a normal terminal.

Slurm also has “interactive” jobs (srun), which allow access to compute resources directly from the terminal, as opposed to batch jobs (sbatch), which run in the background.
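For example, a job defined in a (hypothetical) script my_job.sh could be submitted in batch mode, or an equivalent interactive session requested directly; the resource values below are placeholders, not CDNI recommendations:

    sbatch my_job.sh                                      # batch: job runs in the background
    srun --time=1:00:00 --ntasks=1 --mem=4g --pty bash    # interactive: opens a shell on a compute node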

Note

Processes on login nodes are automatically terminated after 15 minutes.

Read more about login vs compute nodes on this page

Also available are various commands for job accounting, job management, and environment configuration. (See cheat sheet linked below)

Job Parameters

Below is a summary of some parameters that can be used inside Slurm job scripts (see srun and sbatch). The first three parameters are required, while the rest are optional. See here for a larger list of options. For parameter optimization, refer to using seff in this section.

#!/bin/bash -l
    Required for sbatch: specifies how the Slurm script should be read (by the bash interpreter). A statement like this is required to be the first line of a Slurm script.

-t 8:00:00 or --time=8:00:00
    Required: specifies the maximum limit for how long the job will be allowed to run.

--ntasks=8
    Required: specifies the number of processors (cores) that will be reserved for the job.

-A share or --account=share
    Optional: charges resources used by this job to the specified account. The account may be changed after job submission using the scontrol command. To choose an optimal account, see the Fairshare explanation here.

-c 3 or --cpus-per-task=3
    Optional: specifies how many processors on a node are needed for each task. By default, Slurm allocates one processor per task. Say a job has 4 tasks that each need 3 processors and the cluster has quad-processor nodes: if you simply ask for 12 processors, Slurm may give you only 3 nodes. With this option, Slurm knows that each task requires 3 processors on the same node, and will allocate 4 nodes, one per task.

--mem=10g
    Optional: specifies the maximum limit for memory usage for the entire job. The job will be killed if the application tries to use more than 10GB of memory.*

--mem-per-cpu=<size>[units]
    Optional: minimum memory required per allocated CPU.

--tmp=10g
    Optional: specifies that 10GB of temporary disk space will be available for this job in /tmp. Should only be used if you are specifying /tmp folders for inputs, outputs, or working directories.*

--mail-type=ALL
    Optional: specifies which events will trigger an email message. Other options include NONE, BEGIN, END, and FAIL. Not recommended for jobs with 100+ subjects.

--mail-user=x500@umn.edu
    Optional: specifies the email address that the Slurm system should use when sending message emails. Double check this each time: people frequently receive emails from others' jobs because sbatch scripts were copied and run without updating this field.

-p small,mygroup or --partition=small,mygroup
    Optional: specifies the partition to be the “small” or “mygroup” partition. The job will start at the earliest time one of these partitions can accommodate the job. You must be logged into the correct cluster to access the corresponding partitions. For more info, see here.

--gres=gpu:v100:2 -p v100
    Optional: requests two V100 GPUs for a job submitted to the v100 partition.

*--mem specifies the amount of RAM (random access memory) available to the job, while --tmp indicates the amount of temporary disk storage that you can utilize for a job. Whatever storage amount is specified for --tmp will be created for you within the /tmp folder to output your processing derivatives.
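Putting several of these parameters together, the header of a job script might look like the following sketch (the account, partition, email, and resource values are illustrative placeholders, not recommendations):

    #!/bin/bash -l
    #SBATCH -t 8:00:00                # maximum run time
    #SBATCH --ntasks=8                # processors reserved for the job
    #SBATCH --mem=10g                 # memory limit for the entire job
    #SBATCH --tmp=10g                 # temporary disk space in /tmp
    #SBATCH -A share                  # account to charge
    #SBATCH -p small                  # partition(s) to submit to
    #SBATCH --mail-type=END,FAIL      # email on completion or failure
    #SBATCH --mail-user=x500@umn.edu

    # commands for the processing job go below the #SBATCH header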

Job Status

squeue

squeue -al --me: view detailed information about the jobs for your own account

squeue -u username: view all jobs submitted by a given user

squeue -A group_name: view the queue for a given group account; useful for checking each of the groups you belong to in order to decide which account to submit under

squeue -u <username> -h -t pending,running -r | wc -l: count how many jobs you have in the queue; you can add more statuses to the list if needed
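The job-count command above is also handy in wrapper scripts; for instance, a minimal sketch (username is a placeholder) that waits until all of your jobs have left the queue:

    while [ "$(squeue -u username -h -t pending,running -r | wc -l)" -gt 0 ]; do
        sleep 60    # re-check the queue once per minute
    done
    echo "All jobs have finished"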

scancel

scancel jobID_number: cancel a submitted job

  • Can also be used for job arrays by listing the job ID numbers as a comma-separated list; see the examples below
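For example (the job IDs below are placeholders):

    scancel 234293             # cancel a single job
    scancel 234293 234294      # cancel several jobs at once
    scancel 234293_5           # cancel one task of a job array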

sacct

sacct: display accounting data for all jobs and job steps in the Slurm job accounting log or Slurm database

sacct -X -j JOBID_ARRAY# -o JobID,NNodes,State,ExitCode,DerivedExitCode,Comment: check the status of a job even after it has exited (JOBID_ARRAY can also just be a single JOBID)
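sacct can also help with the parameter optimization mentioned above: comparing requested memory (ReqMem) against actual peak usage (MaxRSS) shows whether --mem can be lowered for future runs. A sketch with a placeholder job ID:

    sacct -j 234293 -o JobID,Elapsed,ReqMem,MaxRSS,State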

scontrol

scontrol update JobId=#### Account=new_group: change job account for a submitted job

  • Each PI in the lab has their own Slurm group account with its own allocation of resources and queue priority. It is sometimes useful to change accounts to distribute resource requests for large processing jobs, or when an account has low queue priority due to heavy usage

  • Example: to move Job 234293, originally submitted under miran045, to feczk001: scontrol update JobId=234293 Account=feczk001

scontrol update JobId=#### Partition=new_partition: change a SLURM job partition

  • Example: to move Job 234293, originally submitted to the msismall partition, to msigpu: scontrol update JobId=234293 Partition=msigpu

NOTE: scontrol cannot be used to change between agate and mesabi partitions once a job has been submitted. It can be used to change from agate/mesabi to federated partitions

scontrol update JobId=#### EndTime=YYYY-MM-DDTHH:MM:SS: change the amount of time a SLURM job runs

  • Example: Job 234293 was submitted with StartTime=2022-08-29T13:04:45 and a 96-hour time limit; to reduce the limit to 48 hours: scontrol update JobId=234293 EndTime=2022-08-31T13:04:45

scontrol show JobId=####: find time information for a job
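For example, to pull just the time-related fields (RunTime, TimeLimit, StartTime, EndTime, etc.) from the often lengthy output, with a placeholder job ID:

    scontrol show JobId=234293 | grep Time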