

Scheduler for GPU jobs

We have a set of machines intended for GPU jobs and large-memory jobs. Usage is coordinated by a scheduler, the Slurm scheduling system. The systems are iLab1.cs.rutgers.edu – iLab4.cs.rutgers.edu and rLab1.cs.rutgers.edu – rLab4.cs.rutgers.edu, with a total of 64 high-end GPUs: 32 RTX A4000, 24 GeForce GTX 1080 TI and 8 Nvidia TITAN X.

It doesn’t matter which system you log into. The scheduler will put your job on a system with free resources. The scheduler tries to give each user a fair share of the system, and it gives priority to jobs that are shorter or use fewer GPUs. If the resources are available, you may run several jobs at the same time, up to your individual limit.

With Slurm, the Limitations Enforced on CS Linux Machines for long jobs and memory do not apply to jobs scheduled via sbatch or srun. You tell the scheduler how many GPUs you need (up to 4), how much memory (80GB by default, up to about 1TB) and how long your job will run (up to 7 days), and it enforces those limits.
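For example, a hypothetical submission that stays within these limits might look like the following; the job file name myJob and the specific numbers are only placeholders, and --time uses Slurm’s days-hours:minutes:seconds format, so 2-00:00:00 asks for two days:

sbatch -G 2 --mem=64g --time=2-00:00:00 myJob

See the sections below for what goes in the job file.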


Getting Started

You can use the Slurm job scheduler in interactive or batch mode. Each has its own pros and cons.

Interactive Session

An interactive session must be run from a command line or a terminal.

    • The simplest approach is to ask for an interactive session. Type

srun -G 4 --pty /bin/bash

-G indicates how many GPUs you want. Currently you can get anything from 1 to 4. If GPUs are available, you’ll get them. Note that you may end up on a different computer from where you typed the srun command, depending upon where there are free GPUs.

    • If no GPU is free, you have two options. One is to wait until a GPU is free (the other is to submit a batch job, as described below). To wait, try:

srun --mail-type=BEGIN -G 4 --pty /bin/bash

You’ll get email when GPUs are available. Make sure you use your reservation as soon as possible, as Slurm will remove your job if you don’t actually use the GPUs within 24 hours.

    • If you want to see how many people are waiting, use the squeue command. Note that this command will show both people waiting and jobs currently running.
    • The session won’t necessarily be on the same computer as the one you typed srun on. It will be like you did ssh to that computer. To check the computer you’re actually on, type hostname.
    • If you need to use graphics, find out which computer your session is on. Use weblogin.cs.rutgers.edu or the Windows Remote Desktop client to start a graphical session on that computer. In a terminal window in that session, type echo $DISPLAY. Then, in the initial shell you got from the srun command, type export DISPLAY=xxx, where xxx is whatever the echo command showed, probably something like :10.0. Anything you now run from that shell will be displayed in the graphics window (see the short sketch after this list).
    • If you find it confusing to have the command line on one computer and the display on the other, you can type gnome-terminal in the initial shell. That will give you a window in the graphical session, with access to the GPUs.
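Here is a minimal sketch of the graphics setup described above; the value :10.0 is only an example of what echo might print in your own graphical session:

# In a terminal inside the graphical session (weblogin or Remote Desktop):
echo $DISPLAY          # prints something like :10.0

# In the initial shell you got from srun:
export DISPLAY=:10.0   # use whatever value echo printed
gnome-terminal &       # optional: opens a terminal window in the graphical session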
Batch Mode (Recommended)

Batch mode requires you to put your commands in a file and run it as a batch job. Once submitted, your job will start as soon as GPUs are available:

    • Put the commands you want to execute in a file, e.g. myJob
    • Submit the job using sbatch -G 4 myJob, where the number after -G is the number of GPUs you want. (See below for large-memory jobs.)
    • You can see what jobs are running using the command squeue.
    • You can cancel a job using scancel NNN, where NNN is the job number shown in squeue.
    • If there are a lot of jobs in the queue, you might want to test your job to make sure you haven’t made a mistake in the file. You can use sbatch myJob, i.e. without -G. However, please cancel the job once you verify that it starts properly. These systems should only be used for jobs that use GPUs.
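A typical submit/check/cancel sequence might look like the following; the file name myJob and the job ID 12345 are placeholders, and squeue will show you the real job ID:

sbatch -G 4 myJob      # submit the job, asking for 4 GPUs
squeue                 # check the queue; note the JOBID column
scancel 12345          # cancel job 12345 if you need to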

What goes in your batch file

The file you submit with sbatch must contain every command you need to execute your program.

    • Remember, it may run on a different computer. It needs all the commands you’d have to type after logging in to get to the point where you can run.
    • It must begin with #!/bin/bash. We recommend using #!/bin/bash -l (that’s a lowercase L, not a one), which will cause it to read your .bash_profile, etc.
    • At a minimum it needs a cd to get to the directory with your files.
    • If you’re using Python in an anaconda environment, it needs an “activate” command for that environment.
    • You should probably include #SBATCH --output=FILE unless you prefer to type --output when you submit the job.
    • Here’s an example:
#!/bin/bash -l
#SBATCH --output=logfile

cd YOURDIR          # change to the directory containing your files
activate YOURENV    # activate your anaconda environment, if you use one
python YOURPROGRAM  # run your program

Additional Details
  • The maximum number of GPUs we let you allocate in a single job is currently 4. That will go up as we add more computers to the system. If you want to use more than 4, submit several jobs.
  • If a job runs longer than a week it will be killed if others want to use the GPUs. (There have been a few cases where someone needs to run a job longer than a week. If that’s essential, notify help@cs.rutgers.edu and we’ll find a way to avoid killing the job.)
  • The scheduler also controls memory. By default, we allocate jobs 80GB of memory. However, you can specify less, e.g. by adding --mem=32g to the sbatch command. In a few cases that might allow a job to run that otherwise couldn’t. You can specify up to 1TB, but if you do that your job can only run on one of the four systems, and it may have to wait if other jobs are using memory. Please do not specify large amounts of memory unless you absolutely need it, as doing so will limit what other people can do.
  • If you look at Slurm documentation, you’ll see lots of examples where all the commands in the file start with srun. That’s not necessary or even a good idea here. Use sbatch whenever possible!
  • We have three kinds of GPU: RTX A4000, GeForce GTX 1080 TI and Nvidia TITAN X. To make them easy to specify, we have defined features for them as follows:
    • RTX A4000 as a4000 and ampere. These cards are in iLab1.cs – iLab4.cs
    • GeForce GTX 1080 TI as 1080ti and pascal. These cards are in rLab1.cs – rLab3.cs
    • Nvidia TITAN X as titanx and pascal. These cards are in rlab4.cs
  • If you want to use an RTX A4000, you can specify -C a4000 or -C ampere. If you want to use either a 1080 TI or a TITAN X, you can specify -C '1080ti|titanx' or -C pascal. Note that OR is specified by | . Pascal and Ampere are the architectures; cards with the same architecture have the same features, but differ in the amount of memory and number of cores. To see a list of all nodes and their specific features, use sinfo -Nel. Note that the column with features is truncated. (See the example after this list.)
  • You can also request a specific node using -w NODE, e.g. -w rlab2.
  • The only resources we control are GPUs and memory. The scheduler makes no attempt to schedule CPUs. Slurm has the ability to run a single job across multiple computers. We don’t recommend using that. Instead, use multiple jobs.
  • You will probably want output of your program to go into a file. You can use -o FILENAME in the sbatch command to specify an output file.
  • You can run interactive jobs using srun. E.g. srun -G 4 --pty /bin/bash. Note that the job won’t necessarily run on the system where you do the srun command. It’s sort of like doing ssh to a node that has free GPUs. Of course you’ll only get a shell if there are currently enough free GPUs (and memory, if you ask for a lot of memory).
  • If you login to one of the systems and don’t use sbatch or srun, you won’t have access to any GPUs. However nvidia-smi will show you all the GPUs, so you can see what’s going on. From within a batch or srun job, nvidia-smi will only show you the GPUs you have allocated.
  • You can put options in the file. E.g. rather than using sbatch -G 4 -o logfile, you could put
       #SBATCH -G 4
       #SBATCH -o logfile

in the file. All #SBATCH lines must be at the beginning of the file (right after the #!/bin/bash).
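For example, to request two Pascal-class GPUs you could submit something like this (the file name myJob and the GPU count are placeholders):

sbatch -G 2 -C pascal myJob

or put the equivalent options in the job file itself:

#SBATCH -G 2
#SBATCH -C pascal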

Common Slurm commands:
  • sacct: show accounting data for all jobs and job steps
  • sacctmgr: view and modify Slurm account information
  • salloc: obtain an interactive job allocation
  • sattach: attach to a running job step
  • sbatch: submit a batch script to Slurm
  • scancel: cancel jobs, job arrays or job steps
  • scontrol: view or modify Slurm configuration and state
  • sdiag: show scheduling statistics and timing parameters
  • sinfo: view information about Slurm nodes and partitions
  • sprio: show the components of a job’s scheduling priority
  • squeue: show the job queue
  • sreport: show reports from job accounting data and statistics
  • srun: run task(s) across requested resources
  • sshare: show the shares and usage for each user
  • sstat: show status information for a running job/step
  • sview: graphical user interface to view and modify Slurm state
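
As a quick illustration of a few of these commands (the job ID 12345 is a placeholder; use the ID that squeue shows for your own job):

squeue -u $USER                                      # show only your own jobs
sstat -j 12345                                       # status of a running job
sacct -j 12345 --format=JobID,Elapsed,MaxRSS,State   # accounting data after it finishes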
 
Further Reading: