The objective of this homework is to write three OpenMP programs,
to debug and test them on a Capital machine, and to experimentally determine the gains you
get from running them in parallel with 1, 2, 3, 4, 5, 6, 7, and 8 threads.
The parallelizations are relatively simple, and the results should be
interesting in terms of speedup. You should measure and plot the
performance of your parallelization as a function of the number
of threads, and analyze your observations.
Can you see a correspondence between memory (re)use and maximum speedup?
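One possible way to collect the timings is sketched below. It is not part of the provided code: the names `now_seconds`, `time_with_threads`, `speedup`, `efficiency`, `print_speedup_table`, and `sample_kernel` are hypothetical, and `sample_kernel` is only a stand-in for one of your three programs. The sketch assumes speedup is defined as the 1-thread time divided by the p-thread time, and efficiency as speedup divided by p.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds. Uses omp_get_wtime() when built with OpenMP;
   otherwise falls back to clock(), which reports CPU time and is only
   meaningful for a serial build (assumption of this sketch). */
double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Run `kernel` once with the requested thread count; return elapsed seconds. */
double time_with_threads(void (*kernel)(void), int threads)
{
#ifdef _OPENMP
    omp_set_num_threads(threads);
#else
    (void)threads; /* serial build: thread count is ignored */
#endif
    double start = now_seconds();
    kernel();
    return now_seconds() - start;
}

/* Speedup on p threads: T(1) / T(p). */
double speedup(double t1, double tp) { return t1 / tp; }

/* Parallel efficiency: speedup divided by the thread count. */
double efficiency(double t1, double tp, int p) { return speedup(t1, tp) / p; }

/* Print one row per thread count; times[i] holds the time for i+1 threads. */
void print_speedup_table(const double *times, int max_threads)
{
    printf("threads  time(s)  speedup  efficiency\n");
    for (int p = 1; p <= max_threads; p++)
        printf("%7d  %7.3f  %7.2f  %10.2f\n", p, times[p - 1],
               speedup(times[0], times[p - 1]),
               efficiency(times[0], times[p - 1], p));
}

/* Hypothetical stand-in for a real kernel: a parallel sum reduction. */
void sample_kernel(void)
{
    static volatile double sink; /* keeps the loop from being optimized away */
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < 1000000; i++)
        sum += i * 0.5;
    sink = sum;
}
```

Calling `time_with_threads(sample_kernel, p)` for p = 1 through 8, storing the results in an array, and passing that array to `print_speedup_table` yields the numbers to plot. Repeating each measurement a few times and keeping the minimum is a common way to reduce noise.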
- 1. Jacobi Stencil 1D
Parallelize the Jacobi stencil computation from the provided
- 2. Jacobi Stencil 2D
This is a 2D extension of the previous program; the data is updated
using the values of four neighboring elements.
- 3. Matrix-vector Product
Parallelize the provided sequential program for the Matrix-vector product.
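The kind of loop-level parallelization these problems call for can be sketched as follows. This is not the provided code: the function names `jacobi_1d_sweep` and `mat_vec_product`, the three-point averaging stencil, the fixed boundary values, and the row-major matrix layout are all assumptions made for illustration, so adapt the idea to whatever the provided programs actually do.

```c
#include <stddef.h>

/* One Jacobi sweep over a 1D array of n >= 1 points (assumed stencil):
   each interior point becomes the average of itself and its two neighbors.
   The sweep reads only from `in` and writes only to `out`, so the loop
   iterations are independent and can be divided among threads. */
void jacobi_1d_sweep(const double *in, double *out, int n)
{
    /* Boundary points are copied unchanged (assumption). */
    out[0] = in[0];
    out[n - 1] = in[n - 1];

    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++)
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
}

/* y = A * x for a rows-by-cols matrix stored in row-major order (assumption).
   Each thread computes whole rows of y, so no two threads write the same
   element and no reduction clause is needed. */
void mat_vec_product(const double *A, const double *x, double *y,
                     int rows, int cols)
{
    #pragma omp parallel for
    for (int i = 0; i < rows; i++) {
        double sum = 0.0; /* private to the thread handling row i */
        for (int j = 0; j < cols; j++)
            sum += A[(size_t)i * cols + j] * x[j];
        y[i] = sum;
    }
}
```

A full Jacobi run would call `jacobi_1d_sweep` once per time step and swap the `in` and `out` pointers between steps. Note that each matrix element is read exactly once per product, while every stencil point is read by three neighboring updates; that difference is worth keeping in mind when comparing the speedups you measure.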
Download and untar this tarball
The command-line usage for each program is:
- jacobi_1D 4000 200000
- jacobi_2D 800 2000
- mat_vec 15000 10000
What to submit
Submit your code and report tarred up in a single file, PA1.tar.