Introduction

The objective of this PA is to write an MPI program for the 1D Jacobi problem from PA1, using a blocked approach: the array partition in each MPI process is surrounded by ghost element buffers, as described in Quinn, Section 13.3.5, and in the slides from that chapter. You will write the MPI code and analyse its performance on the Cray. The provided file jac1.c contains the sequential (one-node) part of the code. When compiled and executed on a department machine:
 mpicc -o jac1 jac1.c -O3
 mpirun -np 1 jac1 12 6 1 0
it produces
0.000000 0.987654 1.939643 2.794239 3.456790 3.823045 3.823045 3.456790 2.794239 1.939643 0.987654 0.000000 
Sequential process complete, time: 0.0000257
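
As a sketch of the ghost-buffer idea, the following plain C code simulates p blocks inside a single process, with memcpy standing in for the MPI exchange. It is not the assignment solution and rests on several assumptions: the stencil (a 3-point average with fixed end points) and the tent-shaped initial values are inferred from the sample output above, not taken from jac1.c; k is interpreted as the number of ghost elements on each side of a block; n is assumed to be divisible by p; and the names init_tent, jacobi_seq, and jacobi_blocked are hypothetical. With k ghost elements per side, a block can run up to k sweeps between exchanges, because each sweep invalidates one more ghost element from the outside in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed initial values: a "tent" 0,1,2,...,2,1,0 (inferred, not from jac1.c). */
void init_tent(double *x, int n) {
    for (int i = 0; i < n; i++)
        x[i] = (i < n / 2) ? i : n - 1 - i;
}

/* Reference version: m sweeps of the assumed 3-point average, end points fixed. */
void jacobi_seq(double *x, int n, int m) {
    double *old = malloc((size_t)n * sizeof *old);
    for (int t = 0; t < m; t++) {
        memcpy(old, x, (size_t)n * sizeof *old);
        for (int i = 1; i < n - 1; i++)
            x[i] = (old[i - 1] + old[i] + old[i + 1]) / 3.0;
    }
    free(old);
}

/* Simulates p blocks of n/p elements, each with up to k ghost elements per side.
   One "exchange" (a memcpy here) is followed by up to k local sweeps. */
void jacobi_blocked(double *x, int n, int m, int p, int k) {
    assert(p >= 1 && n % p == 0 && k >= 1 && k <= n / p);
    int bs = n / p;                       /* elements owned by each block */
    size_t w = (size_t)(bs + 2 * k);      /* owned part plus both ghost buffers */
    double *out = malloc((size_t)n * sizeof *out);
    double *cur = malloc(w * sizeof *cur);
    double *nxt = malloc(w * sizeof *nxt);

    for (int done = 0; done < m; ) {
        int s = (m - done < k) ? m - done : k;   /* sweeps in this round */
        for (int b = 0; b < p; b++) {
            int lo = b * bs, hi = lo + bs;              /* owned range [lo, hi) */
            int glo = (lo - k < 0) ? 0 : lo - k;        /* ghosts clipped to the array */
            int ghi = (hi + k > n) ? n : hi + k;
            size_t len = (size_t)(ghi - glo);
            memcpy(cur, x + glo, len * sizeof *cur);    /* stands in for the exchange */
            for (int t = 1; t <= s; t++) {
                /* The valid region shrinks by one per sweep on every interior side. */
                int a = (glo == 0) ? 1 : glo + t;
                int e = (ghi == n) ? n - 1 : ghi - t;
                memcpy(nxt, cur, len * sizeof *cur);    /* keeps fixed end points */
                for (int i = a; i < e; i++)
                    nxt[i - glo] = (cur[i - 1 - glo] + cur[i - glo] + cur[i + 1 - glo]) / 3.0;
                double *tmp = cur; cur = nxt; nxt = tmp;
            }
            memcpy(out + lo, cur + (lo - glo), (size_t)bs * sizeof *out);
        }
        memcpy(x, out, (size_t)n * sizeof *out);
        done += s;
    }
    free(out); free(cur); free(nxt);
}
```

For example, after init_tent(x, 12) and jacobi_blocked(x, 12, 6, 4, 2), x should match what jacobi_seq gives for n = 12 and m = 6. In a real MPI version the first memcpy inside the block loop would become an exchange with the left and right neighbour processes, for instance via MPI_Sendrecv, and each process would hold only its own block.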

Create a Makefile. Note that it may need to change when you move from the Veggie machines to the Cray. Experiment with buffer sizes k, for problem size n = 3200 and number of iterations m = 32000, on the Cray using 1, 2, 4, 8, and 16 nodes, e.g.,

wbohm@cray2:~/lustrefs> aprun -n 1 jac1 3200 32000 1
uses one node on the Cray.

Submit PA4.tar containing your Cray Makefile, your MPI code, and a README file with your Cray time and speedup statistics and your observations.
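
The speedup statistics are conventionally computed as S_p = T_1 / T_p, where T_1 is the one-node time and T_p the time on p nodes, with efficiency E_p = S_p / p. The helper below is a minimal sketch under that convention; the function names are hypothetical, and print_stats assumes that times[0] holds the one-node time.

```c
#include <stdio.h>

/* Speedup of a p-node run relative to the one-node run. */
double speedup(double t1, double tp) {
    return t1 / tp;
}

/* Parallel efficiency: speedup divided by the number of nodes. */
double efficiency(double t1, double tp, int p) {
    return speedup(t1, tp) / p;
}

/* Prints one row per run; times[0] is assumed to be the one-node time. */
void print_stats(const int *nodes, const double *times, int count) {
    printf("%6s %12s %10s %10s\n", "nodes", "time (s)", "speedup", "effic.");
    for (int i = 0; i < count; i++)
        printf("%6d %12.6f %10.3f %10.3f\n", nodes[i], times[i],
               speedup(times[0], times[i]),
               efficiency(times[0], times[i], nodes[i]));
}
```

For example, with made-up timings of 8.0 s on one node and 2.0 s on eight nodes, speedup(8.0, 2.0) is 4.0 and efficiency(8.0, 2.0, 8) is 0.5.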