USING SUMMIT

We will walk through the use of the Summit supercomputer in three steps:

  1. Logging Into Summit on the Login Node
  2. Loading up software on a Compile Node
  3. Submitting Jobs to a Compute Node

Logging Into Summit on the Login Node

To log into Summit, we will use the command line and the secure shell command, ssh.

ssh usage
ssh <addressOfRemoteServer>
ssh [-l <yourloginname>] <addressOfRemoteServer> #that's a lower case “L”

Password: on the Summit system, you must type your password as eID-password + “,” + DUOkey. For example, if your eID password is “ramsarecool” and your DUO key is 123456, you would type:
ramsarecool,123456

:!: Exercise: Log into Summit:

ssh -l csu-eID@colostate.edu login.rc.colorado.edu
# When prompted for a password, provide eID-password,DUOkey
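To avoid retyping the login name on every connection, the details can be kept in an SSH client configuration file. The sketch below writes one to a temporary file so that nothing in your real ~/.ssh/config is touched. The host alias "summit" and the placeholder login name are assumptions for illustration; substitute your own eID.

```shell
# Sketch: store the Summit login details in an SSH config file.
# Assumptions: the alias "summit" and the login name below are placeholders.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
Host summit
    HostName login.rc.colorado.edu
    User eID@colostate.edu
EOF

# Show what was written.
cat "$cfg"

# To connect using this file (you are still prompted for eID-password,DUOkey):
#   ssh -F "$cfg" summit
```

If the same three lines are placed in ~/.ssh/config instead, the -F option is not needed.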

:!: Exercise: Explore Summit:

$ whoami
$ hostname
$ pwd
$ ls
$ ls -alh
$ more README.mdwn

As a Summit user, you have certain directories already set aside for your use:

[Figure: Summit directories]

:!: Exercise: Let's explore your Summit directories. Replace <yourusernamehere> with your username. If you have a colostate.edu address, this will be “eID@colostate.edu”.

$ cd /projects/<yourusernamehere>
$ ls
$ more README.mdwn

:!: Exercise: We will be working today in the scratch space. Navigate to your scratch space:

$ cd /scratch/summit/<yourusernamehere>
$ ls
$ more README.mdwn
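The two per-user locations visited above follow a fixed pattern, so they can be built from the username. The sketch below prints both paths and reports whether each exists on the machine where it runs. The function name summit_dirs is a hypothetical helper, not a Summit command.

```shell
# Sketch: print the per-user project and scratch paths used in the exercises
# and report whether each one exists on the current machine.
# Assumption: summit_dirs is a hypothetical helper name.
summit_dirs() {
    for dir in "/projects/$1" "/scratch/summit/$1"; do
        if [ -d "$dir" ]; then
            printf '%s exists\n' "$dir"
        else
            printf '%s not found\n' "$dir"
        fi
    done
}

# Placeholder username; on Summit you could pass "$USER" instead.
summit_dirs "eID@colostate.edu"
```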

Loading up software on a Compile Node

To use software on the Summit system, we have two options:

  1. Use software that is available for shared use
  2. Install our own software.

Software that is available for shared use is already installed on the system. To use it, we simply load it as a 'module'. Let's first check whether you have any modules loaded already:

:!: Exercise: Explore the module function.

$ module list
$ module avail   #List modules that are available for you to load given your currently loaded software.
$ module spider  #List all the modules on the system.

I prefer to load modules on the Compile Node. Let's move there first:

$ ssh scompile
$ hostname
$ pwd
$ module list

:!: Exercise: Let's load the R module

$ module list
$ module spider R
$ module load R
$ module list
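Besides reading the output of module list, one quick check that a load worked is to ask the shell whether the program is now on your PATH. The sketch below wraps that check in a small function. The name have_cmd is a hypothetical helper, and ls is used here only so that the example runs anywhere.

```shell
# Sketch: confirm that a program is on PATH, e.g. after `module load R`.
# Assumption: have_cmd is a hypothetical helper name.
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: found at %s\n' "$1" "$(command -v "$1")"
    else
        printf '%s: not found (is the module loaded?)\n' "$1"
        return 1
    fi
}

# On Summit, after `module load R`, you would run: have_cmd R
have_cmd ls
```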

:!: Independent exercise

  • Load up the module python, version 2.7.11.

Submitting Jobs to a Compute Node

We will submit jobs from the compile node. This seems to work best for me. On our local Linux machines, we run a job by typing the command at the prompt and pressing return, and the job starts executing immediately. This is not how things work on the cluster.

On the cluster, we will put our jobs in a script and submit that script to a job scheduling program. The scheduling program will use our requests for the number of nodes and processors we want to use and assign us to compute node(s) and processors (aka cores) where the job will run. Depending on the type of hardware we want, we may need to be patient and wait until the hardware is available for use. While we are waiting for our job to start, we will be put into a 'queue'. Summit uses a fair use queue system in which your place in the queue is a function of (1) when you submitted the job, (2) what resources you have requested, and (3) your use of the system.

The job scheduling program we will be using is called Slurm. Slurm is already loaded on the compile nodes. (If you are on the login node, you need to load it as a module: slurm/summit.)

The slurm software has a number of commands you can use:

$ sbatch <shell script>  #submit a job
$ squeue   #check all jobs in the queue (pending and running)
$ squeue -u $USER #check just my jobs
$ squeue -j <enterJobNumber> #check just this job
$ scancel <enterJobNumber> #cancel just this job
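Several of these commands need a job number. When a job is accepted, sbatch prints a confirmation of the form "Submitted batch job <number>". The sketch below pulls the number out of such a message so it can be reused. The function name job_id_from is a hypothetical helper, and the sample message stands in for real sbatch output.

```shell
# Sketch: extract the job number from sbatch's confirmation message.
# Assumptions: job_id_from is a hypothetical helper; 123456 is a made-up job number.
job_id_from() {
    # Keep only the last whitespace-separated field of the message.
    printf '%s\n' "$1" | awk '{ print $NF }'
}

msg="Submitted batch job 123456"
jobid="$(job_id_from "$msg")"

# Print the command you would run next to check on the job.
echo "squeue -j ${jobid}"

# On the cluster, the message would come from sbatch itself:
#   msg="$(sbatch printHelloWorld.sh)"
```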

:!: 1) Example: submitting a job using slurm. Let's make a job to run.

  • Copy the following text into a shell script and name it printHelloWorld.sh
#!/bin/bash

# print out a friendly message
echo "Hello World!"
  • Execute the script using slurm like so:
$ sbatch printHelloWorld.sh
  • What just happened?
  • Can you find an output file? What is in the output file?

:!: 2) Example: submitting a slurm job with more options

We can add more options to our sbatch command:

$ sbatch --nodes=1 --ntasks=1 --partition=shas --qos=normal --time=4:00:00 --output=helloWorld_output.txt printHelloWorld.sh

That's getting pretty crazy pretty fast. Instead of doing this, we can put the options inside the printHelloWorld.sh script as #SBATCH lines, placed before the first command. In this case, the printHelloWorld.sh script would look like this:

#!/bin/bash

#SBATCH --job-name=helloWorld
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=shas
#SBATCH --qos=debug
#SBATCH --time=0:01:00
#SBATCH --output=hello_world_output_%j.txt

# print out a friendly message
echo "Hello World!"

What does all this stuff mean?

#SBATCH --job-name=helloWorld    #We want to name the job "helloWorld"
#SBATCH --nodes=1                #We request to use 1 node (computer)
#SBATCH --ntasks=1               #We request to run 1 task (by default, on 1 core)
#SBATCH --partition=shas         #We want to use a Haswell compute node
#SBATCH --qos=debug              #We want to be in a debug queue
#SBATCH --time=0:01:00           #We expect this job to take at most a minute.
#SBATCH --output=hello_world_output_%j.txt       #We want to name the output file "hello_world_output_%j.txt", where %j is replaced by the job's number.
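Because every #SBATCH line starts with #, bash treats it as an ordinary comment. The script can therefore be test-run directly before it is submitted. The sketch below writes a shortened copy of the script to a temporary file, runs it without Slurm, and then illustrates how the %j placeholder in the output file name is filled in. The job number 123456 is made up for the illustration.

```shell
# Sketch: #SBATCH lines are comments to bash, so the script also runs without Slurm.
script="$(mktemp)"
cat > "$script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=helloWorld
#SBATCH --output=hello_world_output_%j.txt

# print out a friendly message
echo "Hello World!"
EOF

# Run the script directly; the #SBATCH lines are ignored.
out="$(bash "$script")"
echo "$out"

# Illustrate the %j substitution with a made-up job number.
pattern="hello_world_output_%j.txt"
jobid=123456
outfile="$(printf '%s\n' "$pattern" | sed "s/%j/${jobid}/")"
echo "$outfile"
```

Running the script directly only checks that the commands work. The resource requests take effect only when it is submitted with sbatch.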

The new code can be submitted as a job like this:

$ sbatch printHelloWorld.sh

Find the output now.

For more information about sbatch commands, see batch-queuing and slurm.


:!: Independent Exercise:

  • Write a shell script called startProject.sh that does the following:
    • makes a directory called 00_README
    • makes a directory called 01_INPUT
    • makes a directory called 02_SCRIPTS
    • makes a directory called 03_OUTPUT
    • makes a file in 00_README called readme_notes.txt
  • Write #SBATCH preambles for your startProject.sh shell script that do the following:
    • names the job project
    • requests 1 node
    • requests 1 ntask
    • requests a shas partition
    • requests a normal quality of service (qos)
    • expects a time of 0:02:00
    • specifies an output file with the job ID in the title
  • Execute your job using the command sbatch startProject.sh

Supercomputing demonstration

wiki/supercomputing2.txt · Last modified: 2017/08/30 17:11 by erin