The objective of this homework is to write three OpenMP programs, debug and test them on a Capital machine, and experimentally determine the gains you get from running them in parallel with 1, 2, 3, 4, 5, 6, 7, and 8 threads. The parallelizations are relatively simple, and the results should be interesting in terms of speedup. Measure and plot the performance of each parallelization as a function of the number of threads, and analyze your observations. Can you see a correspondence between memory (re)use and maximum speedup?
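For the analysis, speedup and parallel efficiency can be derived from the measured wall-clock times. The following is a minimal sketch and not part of the provided code: the function names `speedup`, `efficiency`, and `speedup_table` are hypothetical, and it assumes the 1-thread run is used as the baseline.

```c
#include <assert.h>
#include <stddef.h>

/* Speedup on p threads: S(p) = T(1) / T(p), where T(1) is the run time
   with one thread and T(p) the run time with p threads.
   Assumption: the 1-thread OpenMP run serves as the baseline. */
double speedup(double t1, double tp)
{
    assert(t1 > 0.0 && tp > 0.0);
    return t1 / tp;
}

/* Parallel efficiency: E(p) = S(p) / p. A value of 1.0 means linear scaling. */
double efficiency(double t1, double tp, int p)
{
    assert(p >= 1);
    return speedup(t1, tp) / (double)p;
}

/* Fill s[i] with the speedup for i+1 threads, where times[i] is the
   run time measured with i+1 threads (so times[0] is the baseline). */
void speedup_table(const double *times, size_t n, double *s)
{
    for (size_t i = 0; i < n; ++i)
        s[i] = speedup(times[0], times[i]);
}
```

Passing the eight measured times (1 through 8 threads) to `speedup_table` gives the values to plot against the thread count.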

  • 1. Jacobi Stencil 1D
    Parallelize the Jacobi stencil computation from the provided sequential code.
  • 2. Jacobi Stencil 2D
    This is a 2D extension of the previous program; the data is updated using the values of four neighboring elements.
  • 3. Matrix-vector Product
    Parallelize the provided sequential program for the Matrix-vector product.
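As a rough illustration of the kind of loop-level parallelism involved, here is a minimal sketch of a 1D Jacobi sweep and a matrix-vector product with `#pragma omp parallel for`. It is not the provided sequential code: the function names, the three-point averaging weights, the fixed boundary values, and the row-major layout are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One Jacobi sweep over a 1D grid of n points. Each interior point of
   `next` becomes the average of the point and its two neighbors in `prev`.
   Assumptions: three-point average, boundary values copied unchanged.
   Reading from `prev` and writing to `next` keeps iterations independent,
   so the loop can be split across threads without synchronization. */
void jacobi_1d_step(const double *prev, double *next, long n)
{
    assert(n >= 3);
    next[0] = prev[0];
    next[n - 1] = prev[n - 1];
    #pragma omp parallel for
    for (long i = 1; i < n - 1; ++i)
        next[i] = (prev[i - 1] + prev[i] + prev[i + 1]) / 3.0;
}

/* y = A * x for a rows-by-cols matrix stored in row-major order
   (layout is an assumption). Each thread computes whole rows, so the
   accumulator `sum` is private to the iteration and y[i] has one writer. */
void mat_vec(const double *a, const double *x, double *y, long rows, long cols)
{
    assert(rows >= 0 && cols >= 0);
    #pragma omp parallel for
    for (long i = 0; i < rows; ++i) {
        double sum = 0.0;
        for (long j = 0; j < cols; ++j)
            sum += a[i * cols + j] * x[j];
        y[i] = sum;
    }
}
```

With GCC, the pragmas take effect when compiling with `-fopenmp`, and the thread count can be set through the `OMP_NUM_THREADS` environment variable. Between time steps, the caller would swap the `prev` and `next` pointers. A 2D stencil can usually be handled the same way by parallelizing the outer row loop.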

Provided Code

Download and untar this tarball. The input usage for each program is:
  • jacobi_1D 4000 200000
  • jacobi_2D 800 2000
  • mat_vec 15000 10000

What to submit

Submit your code and report tarred up in one file, PA1.tar.