Lab 5: Launching a job on NERSC from the command line

This lab session is intended to show you how to launch a job from the command line on carver. We will also show you scripts that allow you to run multiple jobs.

Launching a job on NERSC

Log onto carver

Log onto carver.nersc.gov:

ssh <username>@carver.nersc.gov

where username is the username assigned to you by NERSC.

Running Jobs on Carver

The normal way to run jobs on carver without using Eclipse is to submit batch jobs with the qsub command, which expects a script name as its argument. The script contains the commands to run together with keywords for the batch system (Torque). A sample script is available here. The important variables declared in the script are:

  1. #PBS -q regular
    This tells the system you would like to use regular mode. Debug mode will get you through the queue much quicker, but you can only queue about 2 jobs at a time using this mode. If you are running many jobs to determine runtime, use regular.
  2. #PBS -l nodes=1:ppn=8
    This tells the system you would like to request 1 node and use its 8 processors.
  3. #PBS -l walltime=00:05:00
    This tells the system you want it to kill your job if it runs over 5 minutes. Keeping this low, at about 5 minutes, will help you avoid using up all your system time if you have a program that doesn't end.
  4. #PBS -N jacobi_4
    This is the name of your job.
  5. #PBS -e err/jacobi_4.err
    This is the name of your error file.
  6. #PBS -o out/jacobi_4.out
    This is the name of your output file. This must be the last variable in the script.

The rest of the script contains the commands to be executed, ending with the executable itself:

  1. cd $PBS_O_WORKDIR
    This changes the current working directory to the directory from which the script was submitted.
  2. setenv OMP_NUM_THREADS 4
    This causes OpenMP to use 4 threads.
  3. ./jacobi 100 100
    This is the code you want to run and its command line arguments.
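Put together, the pieces described above might look like the sketch below. It is not the sample script itself. The file name jacobi_4.pbs is a hypothetical choice, and the mkdir line assumes that the batch system does not create the err and out directories for you. The setenv line is csh syntax, so this assumes the job runs under csh or tcsh; under bash the equivalent would be export OMP_NUM_THREADS=4.

```shell
# Create the directories for the error and output files
# (assumption: the batch system will not create them for you).
mkdir -p err out

# Write the batch script to a file. Quoting 'EOF' keeps $PBS_O_WORKDIR
# literal, so it is expanded later, when the job actually runs.
# jacobi_4.pbs is a hypothetical file name.
cat > jacobi_4.pbs <<'EOF'
#PBS -q regular
#PBS -l nodes=1:ppn=8
#PBS -l walltime=00:05:00
#PBS -N jacobi_4
#PBS -e err/jacobi_4.err
#PBS -o out/jacobi_4.out

cd $PBS_O_WORKDIR
setenv OMP_NUM_THREADS 4
./jacobi 100 100
EOF
```

You could equally type the same lines into a file with a text editor; the heredoc is only a convenient way to create the file from the command line.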

To submit this job use qsub followed by the name of the script.

Use the qs command to check the status of your job. Use the -u option to see only jobs that you have submitted:

qs -u <username>

To remove one of your jobs from the queue use qdel followed by the job id.
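As a rough illustration, the submit, check, and remove steps can be sketched as follows. The script name jacobi_4.pbs and the job id 123456 are placeholders, and the DRY_RUN switch and run helper are hypothetical additions so that the commands are only printed when you try this away from carver.

```shell
# Hypothetical switch: 1 prints the commands, 0 actually runs them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in a dry run, otherwise run it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Submit the batch script; qsub prints the id of the new job.
run qsub jacobi_4.pbs

# List only your own jobs (assumes $USER holds your NERSC username).
run qs -u "${USER:-username}"

# Remove a job from the queue; 123456 stands in for the id qsub printed.
run qdel 123456
```

On carver you would set DRY_RUN=0, or simply type the three commands directly, substituting your own script name and job id.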

A more complete description is available at the NERSC website.

Script to queue up multiple jobs

A sample script for queueing up multiple jobs can be found here. It loops over the number of processors, creates a batch script for each one, and submits the scripts.
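The idea can be sketched roughly as below. This is not the sample script itself, only a minimal version of the same loop under some assumptions: the scripts directory, the .pbs file names, and the DRY_RUN switch are hypothetical, and the jacobi arguments are copied from the single-job example. As before, setenv assumes the job runs under csh or tcsh.

```shell
# Hypothetical switch: 1 only writes the batch scripts, 0 also submits them.
DRY_RUN=${DRY_RUN:-1}

# Directories for error files, output files, and the generated scripts
# (scripts/ is a hypothetical location).
mkdir -p err out scripts

for n in 1 2 3 4 5 6 7 8; do
  name="jacobi_$n"
  script="scripts/$name.pbs"

  # Unquoted EOF: $n and $name are filled in now, while \$PBS_O_WORKDIR is
  # written out as a literal $PBS_O_WORKDIR for the job to expand later.
  cat > "$script" <<EOF
#PBS -q regular
#PBS -l nodes=1:ppn=8
#PBS -l walltime=00:05:00
#PBS -N $name
#PBS -e err/$name.err
#PBS -o out/$name.out

cd \$PBS_O_WORKDIR
setenv OMP_NUM_THREADS $n
./jacobi 100 100
EOF

  if [ "$DRY_RUN" = 1 ]; then
    echo "would submit $script"
  else
    qsub "$script"
  fi
done
```

Running it with the default DRY_RUN=1 lets you inspect the generated files before anything reaches the queue; note that the backslash before $PBS_O_WORKDIR is needed only because the script is produced inside a heredoc.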

Try it

Take the sample batch script, modify it to run your mandelbrot code in debug mode, and try submitting it. Then take the sample script for queueing multiple jobs and modify it to run your mandelbrot code for every processor count from 1 to 8. Run it, but then change your mind: find your jobs in the queue and remove them.