CS475 Lab 1: Mandelbrot: Measuring, Analyzing, Reporting Performance

This lab is intended to set you up for PA1 and for this class in general. We will show you how to run your HPC experiments without interference from external factors such as a busy machine or varying input sizes. We will also introduce you to the hybrid multi-core processors in our lab machines, explain CPUs, cores, and threads, and show how to control their allocation to your execution.

We will use a simple, easily parallelizable program to compute the Mandelbrot set.

1. Retrieve the source code, verify that you can compile it, and run it.

  1. Log (or ssh) into one of the fish machines. They are physically located in CSB 325, but you should be able to ssh or VPN to them if you are not in the building. The most current information on using our machines is available here. These machines have a 24-core Intel processor with eight performance cores (P-cores) and 16 efficient cores (E-cores). The two core types have very distinct properties and capabilities: vectorization, hyperthreading, frequency, etc. Other machines are similar but not identical, which is why we want you to use these machines specifically.
  2. Download the provided.tar file and untar it.
  3. Study
    1. the mandelbrot.c code: its single "#pragma omp parallel for" pragma, its command line arguments, and what it prints.
    2. the makefile
    3. the script to gather performance data
  4. Compile (using the makefile) and run both versions. The first argument selects either "debug/verbose" mode or performance output in CSV format:
    mandSEQ 0 1000

    mandOMP 0 1000
  5. Vary the number of threads used, for p = 1, 2, 3, 4, ..., 16, by setting OMP_NUM_THREADS (replace p with the actual number):
    csh:
    setenv OMP_NUM_THREADS p

    bash:
    export OMP_NUM_THREADS=p
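
The sweep over thread counts can also be scripted. The sketch below is a minimal Python driver, not the provided script. It assumes the mandOMP binary sits in the current directory, reuses the arguments 0 1000 from the step above, repeats each case seven times (the count used in step 2), and measures wall-clock time from outside the process, so process start-up is included. The names thread_env, run_plan, and time_run are hypothetical.

```python
import os
import subprocess
import time

BINARY = "./mandOMP"      # assumed location of the compiled OpenMP version
ARGS = ["0", "1000"]      # same arguments as in the compile-and-run step
MAX_THREADS = 16
REPS = 7                  # repetitions per thread count


def thread_env(p):
    """Return a copy of the environment with OMP_NUM_THREADS set to p."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(p)
    return env


def run_plan(max_threads=MAX_THREADS, reps=REPS):
    """List every (thread count, repetition) pair to execute."""
    return [(p, r) for p in range(1, max_threads + 1) for r in range(reps)]


def time_run(p):
    """Run the binary once with p threads and return elapsed seconds."""
    start = time.perf_counter()
    subprocess.run([BINARY] + ARGS, env=thread_env(p), check=True,
                   stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    if os.path.exists(BINARY):
        print("threads,rep,seconds")
        for p, r in run_plan():
            print(f"{p},{r},{time_run(p):.6f}")
    else:
        print(f"{BINARY} not found; compile with the makefile first")
```

Setting the variable per process, rather than exporting it in the shell, keeps each run's thread count explicit in the script. If the program prints its own timing, prefer that number over the external wall-clock measurement.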

2. Collect Parallel Performance Data

  1. Run each case 7 times. We provided a simple script that you are encouraged to use and modify as you see fit. It produces output in .csv form so that you can then plot and visualize speedup in Excel or a similar tool. Down the road, our testing scripts will also expect such output.
  2. Record execution times in a file (e.g., "data/mandelLab1.csv").
  3. Analyze your data and prepare for discussion:
    1. Contrast the execution times of mandSEQ and mandOMP with OMP_NUM_THREADS=1. Is there a difference between the two? Write down an explanation of what you see and why you think it is happening.
    2. Make observations about the collected data including:
      How many decimal digits are significant (see this blog)? How variable are the execution times?
    3. To process the data, we remove the extreme values (min/max) and average the remaining five runs. You may want to check the variability of each observation (make sure the deviation from the mean is below the 2% threshold). You may also want to adjust the number of reps, and to edit the script so that it helps compile the performance data and produce the information you will need to submit in your reports.
    4. In your data analysis software (Excel or python, or ...), build/compute a table with a row for each p with
    5. Plot the speedup against the number of threads. You can use any plotting tool, or plot by hand. What do you observe? Can you explain your observations?
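
The processing described in the steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the provided script. The timing values are made-up placeholders, not measurements. The helper names (trimmed_mean, max_relative_deviation, speedup_table) are hypothetical. The 2% check is applied to the five runs kept after trimming, and the table columns (time, speedup T(1)/T(p), efficiency speedup/p) are an assumed layout. Whether the baseline should be mandSEQ or mandOMP with one thread is a choice to settle for your report.

```python
from statistics import mean


def trimmed_mean(times):
    """Drop one minimum and one maximum, then average the rest."""
    if len(times) < 3:
        raise ValueError("need at least 3 runs to trim min and max")
    return mean(sorted(times)[1:-1])


def max_relative_deviation(times):
    """Largest |t - mean| / mean over the runs kept after trimming."""
    kept = sorted(times)[1:-1]
    m = mean(kept)
    return max(abs(t - m) for t in kept) / m


def speedup_table(runs_by_p, baseline):
    """Return rows (p, time, speedup, efficiency, within_2_percent)."""
    rows = []
    for p in sorted(runs_by_p):
        t = trimmed_mean(runs_by_p[p])
        s = baseline / t
        ok = max_relative_deviation(runs_by_p[p]) < 0.02
        rows.append((p, t, s, s / p, ok))
    return rows


# Placeholder timings in seconds (NOT real measurements).
example = {
    1: [2.00, 2.01, 1.99, 2.00, 2.02, 1.98, 2.50],
    2: [1.10, 1.11, 1.09, 1.10, 1.12, 1.08, 1.30],
}
baseline = trimmed_mean(example[1])
print("p,time,speedup,efficiency,within_2pct")
for p, t, s, e, ok in speedup_table(example, baseline):
    print(f"{p},{t:.3f},{s:.2f},{e:.2f},{ok}")
```

Trimming before averaging keeps a single disturbed run (such as the 2.50 placeholder) from skewing the mean. If the deviation check fails for some p, rerunning that case or increasing the number of reps is usually more informative than averaging anyway.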

3. Analyze and improve the parallelization (discuss with us)

You most likely did not get perfect, or even linear, speedup. The outer two loops of Mandelbrot are very nice and regular (each pixel can be computed independently of the others), but the innermost loop IS NOT! Study the Mandelbrot image and consider the following points:

  1. What does the color (intensity) of a pixel value tell you about the number of iterations it took to find that value?
  2. Look at the fractal image; where is most of the work done?
  3. What does that say about the work the threads have to do? Does each thread do the same amount of work (hint: think about 2 vs 3 threads)?
  4. Find a better parallelization strategy (check out the schedule clause), confirm with us, and include these results in your report. If you are stuck, please see the CSx75 team (TA+instructor). We want you to successfully complete the lab.
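
One way to explore the questions above away from the lab machines is to model how the work is distributed. The Python sketch below is an exploration aid under stated assumptions, not the code in mandelbrot.c. It assumes the usual escape-time iteration z = z*z + c, a viewing window of [-2, 1] x [-1.5, 1.5], an iteration cap of 200, and a small 120 x 120 grid, any of which may differ from the lab's program. It counts inner-loop iterations per row, then compares two hypothetical ways of distributing rows: contiguous blocks (one per thread), and handing out rows one at a time to the least-loaded thread. The reported number is the busiest thread's work divided by the average work, so 1.0 means perfect balance.

```python
def escape_iterations(cx, cy, max_iter):
    """Number of z = z*z + c steps before |z| exceeds 2 (capped)."""
    zx = zy = 0.0
    for n in range(max_iter):
        if zx * zx + zy * zy > 4.0:
            return n
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
    return max_iter


def row_work(size=120, max_iter=200):
    """Total inner-loop iterations for each image row (assumed window)."""
    work = []
    for i in range(size):
        cy = 1.5 * (2 * i + 1 - size) / size   # rows symmetric about y = 0
        total = 0
        for j in range(size):
            cx = -2.0 + 3.0 * (j + 0.5) / size
            total += escape_iterations(cx, cy, max_iter)
        work.append(total)
    return work


def block_loads(work, threads):
    """Contiguous blocks of rows, one block per thread."""
    n = len(work)
    bounds = [n * t // threads for t in range(threads + 1)]
    return [sum(work[bounds[t]:bounds[t + 1]]) for t in range(threads)]


def on_demand_loads(work, threads):
    """Each row goes to the thread with the least work so far."""
    loads = [0] * threads
    for w in work:
        loads[loads.index(min(loads))] += w
    return loads


def imbalance(loads):
    """Busiest thread's work divided by the mean work per thread."""
    return max(loads) * len(loads) / sum(loads)


if __name__ == "__main__":
    work = row_work()
    for threads in (2, 3, 4):
        b = imbalance(block_loads(work, threads))
        d = imbalance(on_demand_loads(work, threads))
        print(f"threads={threads} block={b:.2f} on_demand={d:.2f}")
```

The model ignores scheduling overhead and the difference between P-cores and E-cores, so treat its numbers as a guide to what to look for in your measurements rather than as a prediction of them.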

4. Write and submit your report using the Checkin tab

Submit a PDF file using the Checkin system on the CS fish machines (we are not doing submission in Canvas, since we need to run our grading scripts on the CS department machines). We strongly recommend writing your report using Overleaf or a similar tool. Please make sure that you

Notes

A quick note for Mac users: OpenMP
If you want to work on a Mac at home, you need a version of gcc compiled with OpenMP support. You can go to hpc.sourceforge.net and download gcc 4.7, 4.8, or 4.9, depending on your OS version. You will have to extract the archive and update the PATH variable to include the new gcc. You still have to report the results for a fish machine in CSB 325 in the CS department!