CS475 Lab 1: Mandelbrot: Measuring, Analyzing, Reporting Performance

This lab session is intended to show you how to run an OpenMP code and how to analyze and graph the speedup of parallel code. It will also show you how to run HPC experiments effectively, without interference from external variables such as a busy machine or varying input sizes. We will be using the Mandelbrot code, which produces a fractal.

1. Retrieve source code and verify that you can compile, run, and control the number of threads.

A quick note for Mac users: OpenMP
If you want to work on a Mac at home, you need a version of gcc compiled with OpenMP support. You can go to hpc.sourceforge.net and download gcc 4.7, 4.8, or 4.9, depending on your OS version. You will have to extract the archive and update the PATH variable to include the new gcc. You still have to report the results for a state capital machine in the 120-unix-lab or a ski area machine in the 225-unix lab at the CS department!

Log into one of the state capital machines or ski area machines.
1. Choose a machine from the list of machines (machines list).
2. Secure shell (ssh) into a capital machine or ski machine, e.g.
```
ssh <yourname>@denver.cs.colostate.edu
```
3. Check to see how busy the machine is. You want to run on a quiet machine. The capital machines have one CPU with eight hyperthreaded cores and the ski machines have one CPU with six hyperthreaded cores. Use
```
who
```
  to find out whether there are other users, or
```
top
```
  to find out what the processor is doing.
Download and untar the provided mand.tar file.
Study the makefile and the
```
mandelbrot.c
```
code. Study the way it times the program. This is an OpenMP program; study the parallelized for loop.
Compile (using the makefile) and run
```
mandSEQ 1000
```
```
mandOMP 1000
```
Vary the number of threads being used: p = 1, 2, 3, 4, 5, 6, 7, and 8. Set the environment variable as shown below, replacing p with the actual number.
csh:
```
setenv OMP_NUM_THREADS p
```
bash:
```
export OMP_NUM_THREADS=p
```

2. Collect Parallel Performance Data

Run each case 7 times.
Record execution times in a file: "resultsMachinename.txt"
Analyze your data and prepare for discussion:

Contrast the execution times for mandSEQ and mandOMP for OMP_NUM_THREADS=1. What is the difference between mandSEQ and mandOMP for OMP_NUM_THREADS=1? Write down an explanation of what you see and why you think it is happening.
Make observations about the collected data including:
How many decimal digits are significant?
How variable are the execution times?
To summarize the data, we will remove the extreme values (min and max) and average the remaining five runs. You may want to check the variability of each set of runs (make sure each run's deviation from the mean is below the 2% threshold). You may want to write a script to help compute the averages.
Build a table with a row for each p = 1, 2, ..., 8, reporting one of the following:

(average) time in seconds in significant decimal digits or
speedup or
efficiency

Plot time, speedup, or efficiency against the number of threads. You can use any plotting tool, or plot by hand. What do you observe? Can you explain your observations?

3. Scheduling of Parallel Mandelbrot

In section 2 above you probably did not get perfect speedup, not even linear speedup! Convince yourself that even though the outer loops of Mandelbrot are very nice and regular, the inner loop IS NOT! Study the Mandelbrot image.

What does a certain pixel value tell you about the number of iterations in the inner loop?
Look at the fractal image; where is most of the work done?
What does that say about the amount of work the various threads have to do? (hint: Think about 2 vs 3 threads.)
Find a better scheduling strategy (check out dynamic scheduling) and plot your new results.