This lab session is intended to show you how to run an OpenMP (OMP) code and how to analyze and graph the speedup of a parallel program. We will be using the Pi Reduction code from Quinn, section 17.6.
Download and untar the provided piRed.tar file. Study the makefile. Study the piRed.c code. Study the way it times the program. This is an OMP program; study the parallelized for loop.
Find a list of the veggie machines (~info/machines). ssh to a veggie machine (e.g. ssh pea.cs.colostate.edu, or carrot, or tomato, ...). These machines have two CPUs with four cores each. Do not all use the same machine. Find a quiet machine: use "who" to find out whether there are other users, or "top" to find out what the processor is doing.
Compile (using the makefile) and run
piRed 1000000000 (nine zeros)
for p = 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, and 16 threads, using
setenv OMP_NUM_THREADS p (csh syntax; in bash, use export OMP_NUM_THREADS=p)
Run each case five times. Put all your raw performance data in a file named "resultsMachinename.txt", e.g.:
....
pea 202 # setenv OMP_NUM_THREADS 1
pea 203 # piRed 1000000000
Thread 0 : Number of threads = 1
Pi est = 3.141592653589971, Time : 18.064747 sec
pea 204 # piRed 1000000000
Thread 0 : Number of threads = 1
Pi est = 3.141592653589971, Time : 18.074043 sec
pea 205 # piRed 1000000000
Thread 0 : Number of threads = 1
Pi est = 3.141592653589971, Time : 18.073313 sec
...
...
What do you observe? What does this mean about the timing data, i.e. which decimal digits are significant?
Then summarize your data using the median of your five experiments, keeping only the rounded significant decimal digits, e.g.:
p  | Tp
1  | 18.1
2  | 9.2
.. | ...
Then build a table with a row for each p containing p, the median time Tp, the speedup Sp = T1/Tp, and the efficiency Ep = Sp/p.
Plot time, speedup, and efficiency against the number of threads. You can use any plotting tool, or plot by hand. What do you observe? Can you explain your observations?
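If you work in Python, the table computation can be sketched as follows (the times below are placeholders, not measured data -- substitute your own medians; plotting with matplotlib is left as a comment):

```python
# Sketch: compute speedup and efficiency from median times (placeholder data).
p_values = [1, 2, 4, 8]           # thread counts (placeholder subset)
times    = [18.1, 9.2, 4.7, 2.5]  # median Tp in seconds (placeholder values)

t1 = times[0]
speedup    = [t1 / tp for tp in times]                   # Sp = T1 / Tp
efficiency = [s / p for s, p in zip(speedup, p_values)]  # Ep = Sp / p

print(" p |   Tp  |  Sp  |  Ep")
for p, tp, s, e in zip(p_values, times, speedup, efficiency):
    print(f"{p:2d} | {tp:5.1f} | {s:4.2f} | {e:4.2f}")

# To plot, e.g.:
#   import matplotlib.pyplot as plt
#   plt.plot(p_values, speedup, marker="o"); plt.xlabel("threads"); plt.ylabel("speedup")
#   plt.show()
```

By definition S1 = 1 and E1 = 1; watch how efficiency falls off as p grows past the number of physical cores.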
In discussion 1 you will report your results and discuss your observations with your group.