Introduction

The objective of this PA is to write an MPI program for the 1D Jacobi problem from PA1, using a blocked approach: the array partition in each MPI process is surrounded by ghost element buffers, as described in Quinn Section 13.3.5 and the slides for that chapter. You will write the MPI code and analyse its performance on the Cray.

The provided file jac1.c contains the sequential (one node) part of the code. When compiled and run on a department machine with:

      mpicc -o jac1 jac1.c -O3
      mpirun -np 1 jac1 12 6 1 0
      
it produces
0.000000 0.987654 1.939643 2.794239 3.456790 3.823045 3.823045 3.456790 2.794239 1.939643 0.987654 0.000000
Sequential process complete, time: 0.0000257
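
The update rule itself lives in jac1.c and is not restated here. The sample output above is consistent with a three-point average over fixed zero boundaries, starting from tent-shaped data (0, 1, 2, ... rising to the middle and falling back to 0). The sketch below assumes exactly that; the names jacobi_seq and init_value are illustrative and need not match jac1.c.

```c
#include <stdlib.h>
#include <string.h>

/* ASSUMPTION: tent-shaped initial data, inferred from the sample output. */
static double init_value(int i, int n)
{
    return (i < n / 2) ? (double)i : (double)(n - 1 - i);
}

/* Runs m sweeps on n points and writes the result to out[0..n-1].
   ASSUMPTION: three-point average, end points held fixed.
   Returns 0 on success, -1 if memory could not be allocated. */
int jacobi_seq(int n, int m, double *out)
{
    double *prev = malloc(n * sizeof *prev);
    double *cur = malloc(n * sizeof *cur);
    if (!prev || !cur) { free(prev); free(cur); return -1; }

    /* Both buffers get the initial data so the end points stay in place. */
    for (int i = 0; i < n; i++)
        prev[i] = cur[i] = init_value(i, n);

    for (int t = 0; t < m; t++) {
        for (int i = 1; i < n - 1; i++)
            cur[i] = (prev[i - 1] + prev[i] + prev[i + 1]) / 3.0;
        double *tmp = prev; prev = cur; cur = tmp;  /* swap roles */
    }

    memcpy(out, prev, n * sizeof *out);
    free(prev);
    free(cur);
    return 0;
}
```

Under these assumptions, jacobi_seq(12, 6, out) gives out[1] = 720/729 = 0.987654..., the second value in the sample output.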

The arguments to the program are:

  arg1: n: problem (array) size
  arg2: m: number of iterations
  arg3: k: buffer size
  optional arg4: vp: 
           if not present: non-verbose, i.e. no debug info
           if present: the id of the verbose process, i.e. the process providing debug info
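
A minimal sketch of reading these arguments follows. The struct jac_args and the function parse_args are hypothetical names, not taken from jac1.c, which may handle its arguments differently.

```c
#include <stdlib.h>

/* Hypothetical container for the command-line parameters. */
typedef struct {
    int n;        /* arg1: problem (array) size */
    int m;        /* arg2: number of iterations */
    int k;        /* arg3: buffer size */
    int verbose;  /* 1 if the optional arg4 was given, else 0 */
    int vp;       /* arg4: id of the verbose process, -1 if absent */
} jac_args;

/* Fills *a from argv. Returns 0 on success, -1 if the argument
   count or the values are unusable. */
int parse_args(int argc, char **argv, jac_args *a)
{
    if (argc < 4 || argc > 5)
        return -1;
    a->n = atoi(argv[1]);
    a->m = atoi(argv[2]);
    a->k = atoi(argv[3]);
    a->verbose = (argc == 5);
    a->vp = a->verbose ? atoi(argv[4]) : -1;
    if (a->n <= 0 || a->m < 0 || a->k <= 0)
        return -1;
    return 0;
}
```

For the command line jac1 12 6 1 0 this yields n = 12, m = 6, k = 1, verbose = 1, and vp = 0.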

Create a Makefile. Note that it may need to change when you move from the capital machines to the Cray. Experiment with buffer sizes k = 1, 8, 16, and 32, for problem size n = 32000 and number of iterations m = 320000, on the Cray using 1, 2, 4, 8, 16, and 32 nodes, e.g.,

wbohm@cray2:~/lustrefs> aprun -n 1 jac1 32000 320000 1
uses one node on the Cray with a buffer size of 1. Note that 32 nodes require two Cray boards and thus incur inter-board communication, whereas smaller node counts need only intra-board communication.
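
The buffer size k trades messages for redundant work: with k ghost cells on each side, neighbours exchange data once every k iterations, and each local sweep in between can trust one cell fewer per side. The sketch below simulates this in a single process, with no MPI calls, so the indexing can be checked in isolation. It assumes a three-point average with fixed end points and tent-shaped initial data, which is inferred from the sample output of jac1 12 6 1 0 and not stated in this handout. All function names are hypothetical. In a real MPI program each memcpy in the exchange step would be a message between neighbouring ranks, for example via MPI_Sendrecv.

```c
#include <stdlib.h>
#include <string.h>

/* ASSUMPTION: tent-shaped initial data, inferred from the sample output. */
static double init_value(int g, int n)
{
    return (g < n / 2) ? (double)g : (double)(n - 1 - g);
}

/* Simulates p blocks in one process. Block b owns global indices
   [b*nb, (b+1)*nb), nb = n/p, stored at local offsets [k, k+nb),
   with k ghost cells on each side. Requires n % p == 0 and
   1 <= k <= nb. Returns 0 on success, -1 otherwise. */
int jacobi_blocked(int n, int m, int k, int p, double *out)
{
    if (p <= 0 || k <= 0 || n % p != 0 || k > n / p)
        return -1;
    int nb = n / p, w = nb + 2 * k;
    double *loc = calloc((size_t)p * w, sizeof *loc);
    double *tmp = malloc(w * sizeof *tmp);
    if (!loc || !tmp) { free(loc); free(tmp); return -1; }

    for (int b = 0; b < p; b++)
        for (int j = k; j < k + nb; j++)
            loc[b * w + j] = init_value(b * nb + j - k, n);

    for (int t = 0; t < m; t += k) {
        int steps = (m - t < k) ? m - t : k;

        /* Ghost exchange: reads owned cells only, writes ghost cells only. */
        for (int b = 0; b < p; b++) {
            double *a = loc + b * w;
            if (b > 0)      /* left ghosts <- last k owned cells of b-1 */
                memcpy(a, loc + (b - 1) * w + nb, k * sizeof *a);
            if (b < p - 1)  /* right ghosts <- first k owned cells of b+1 */
                memcpy(a + k + nb, loc + (b + 1) * w + k, k * sizeof *a);
        }

        /* Up to k local sweeps; sweep s updates offsets s+1 .. w-2-s. */
        for (int s = 0; s < steps; s++)
            for (int b = 0; b < p; b++) {
                double *a = loc + b * w;
                memcpy(tmp, a, w * sizeof *tmp);
                for (int j = s + 1; j <= w - 2 - s; j++) {
                    int g = b * nb + j - k;      /* global index */
                    if (g >= 1 && g <= n - 2)    /* end points stay fixed */
                        tmp[j] = (a[j - 1] + a[j] + a[j + 1]) / 3.0;
                }
                memcpy(a, tmp, w * sizeof *a);
            }
    }

    /* Gather the owned cells of every block into the global result. */
    for (int b = 0; b < p; b++)
        memcpy(out + b * nb, loc + b * w + k, nb * sizeof *out);
    free(loc);
    free(tmp);
    return 0;
}
```

Because every owned cell is computed from the same operands in the same order regardless of k, runs with different buffer sizes should agree with each other and with the sequential result; only the number of exchanges changes.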

Submit your Makefile, MPI code, your Cray time and speedup statistics, and observations in a README file in PA4.tar.
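
Speedup is conventionally the one-node time divided by the p-node time for the same n, m, and k. A small helper for the README statistics might look like this; the function names are illustrative, and efficiency is an optional extra that is not required above.

```c
/* Speedup of a p-node run relative to the one-node run. */
double speedup(double t1, double tp)
{
    return t1 / tp;
}

/* Parallel efficiency: speedup divided by the number of nodes. */
double efficiency(double t1, double tp, int p)
{
    return speedup(t1, tp) / p;
}
```

For instance, a run that takes one quarter of the one-node time has a speedup of 4; if it used 8 nodes, its efficiency is 0.5.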

Make sure that you have only one Makefile, which generates two binaries, jacobi-cray and jacobi-dept, to be used on the Cray and on the capital machines, respectively.

The output of the program must have only 2 lines. The first line prints the values selected by the vp argument:

 
if (v && vp == id) {
	for (i = 0; i < n/p; i++) {
		printf("%f ", prev[i]);
	}
	printf("\n");
}
Print one line, with one space between elements, using '%f' for each element.

The vp argument selects process #vp, and only the values computed by that process are printed. Do not print everything.

You will print n/p floats (the values computed by process #vp). Example:

mpirun -np 3 jac1 12 6 1 2
returns (among other lines)
2.794239 1.939643 0.987654 0.000000
mpirun -np 3 jac1 12 6 1 1
3.456790 3.823045 3.823045 3.456790
mpirun -np 3 jac1 12 6 1 0
0.000000 0.987654 1.939643 2.794239
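
The examples above follow from a plain block distribution: process id holds n/p consecutive elements starting at global index id * (n/p). A sketch of that index arithmetic, assuming n/p is an integer; the function names are illustrative.

```c
/* First global index owned by process id. Assumes n % p == 0. */
int block_first(int id, int n, int p)
{
    return id * (n / p);
}

/* One past the last global index owned by process id. */
int block_end(int id, int n, int p)
{
    return (id + 1) * (n / p);
}
```

With n = 12 and p = 3, process 2 owns global indices 8 to 11, which are the last four values of the sequential output shown earlier.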
The second line should report the time, like this:
Sequential process complete, time: 0.0000257
The important part is that this is a single line ending with time: 'value in decimal floating point'.

For the test cases, you can assume that n/p is an integer, n/3 is an integer, and k <= n/p.