
- Review the slides from lecture #1.
- Review the slides from lecture #2.
- Review the slides from lecture #3.
- Review the slides from the tutorial on ISCC (by Sven Verdoolaege).

*Note: students will present their solutions to the exercises on the
computer during the class.*

You must first install `iscc`, which you can find in the Barvinok package.

For the following code, write an ISCC program to:

- Represent the iteration domain of the statement.
- Represent the data space of the program, for each array.
- Count the number of points in the iteration domain.
- Count the number of points in the data space.
- Generate code to scan the iteration domain.
- Generate code to scan the data space of the array `A`.

```
for (i = 0; i < N; ++i)
  for (j = 0; j < N; ++j)
    for (k = 0; k < N; ++k)
      C[i][j] += A[i][k] * B[k][j];
```
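
One way to sanity-check the cardinalities that ISCC reports is to enumerate the points by brute force for a small, fixed `N`. The sketch below is plain C, not an ISCC program, and it does not replace the exercise. The function names `count_iteration_points` and `count_data_points_A` are hypothetical helpers introduced here for illustration, and the flat `touched` array is an assumption about how to mark the cells of `A`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: count the statement instances (i, j, k)
   executed by the matrix-multiply loop nest for a given N. */
long count_iteration_points(int n)
{
    assert(n >= 0);
    long count = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            for (int k = 0; k < n; ++k)
                ++count;
    return count;
}

/* Hypothetical helper: count the distinct cells of A touched by the
   access A[i][k]. Returns -1 if the marker array cannot be allocated. */
long count_data_points_A(int n)
{
    assert(n >= 0);
    if (n == 0)
        return 0;

    unsigned char *touched = calloc((size_t)n * (size_t)n, 1);
    if (!touched)
        return -1;

    /* Mark every cell of A that some iteration reads. */
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            for (int k = 0; k < n; ++k)
                touched[(size_t)i * n + k] = 1;

    long count = 0;
    for (size_t x = 0; x < (size_t)n * (size_t)n; ++x)
        count += touched[x];

    free(touched);
    return count;
}
```

For `N = 4`, `count_iteration_points(4)` returns 64 and `count_data_points_A(4)` returns 16, which are the values a symbolic count in `N` should evaluate to at that size.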

- Review the slides from lecture #3.
- Review the slides from the previous lecture.
- Review the slides from the tutorial on ISCC (by Sven Verdoolaege).

*Note: students will present their solutions to the exercises on the
computer during the class.*

Write an algorithm that automatically creates the (sequence of)
transformations, in the form of a scattering function, to tile a loop
nest of a given depth. The algorithm takes as input (1) the number
of loops `d` in a perfectly nested, permutable loop nest;
(2) a scalar tile size `s`; and (3) a scattering function
that maps some input space into a (2d+1)-dimensional output space.

Apply the algorithm of the previous exercise to compute the tiling
transformation for matrix-multiply, where the input scattering
permutes the loops `i` and `k`.

```
for (i = 0; i < N; ++i)
  for (j = 0; j < N; ++j)
    for (k = 0; k < N; ++k)
      C[i][j] += A[i][k] * B[k][j];
```
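
As a reference point for what the generated code could look like, the sketch below hand-writes one possible result: the loops `i` and `k` are interchanged (loop order `k`, `j`, `i`), and each loop is then split into a tile loop and a point loop with tile size `s`. It is an illustration under stated assumptions, not the scattering function the exercise asks for. The function name `matmul_tiled_kji`, the helper `min_int`, and the row-major flat arrays are all assumptions made here.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Hypothetical hand-tiled matrix-multiply with i and k interchanged.
   C, A and B are n x n row-major arrays; the caller initializes C.
   The min_int bounds handle sizes n that are not a multiple of s. */
void matmul_tiled_kji(int n, int s, double *C,
                      const double *A, const double *B)
{
    assert(n >= 0 && s > 0);

    /* Tile loops: iterate over the origins of s x s x s tiles. */
    for (int kt = 0; kt < n; kt += s)
        for (int jt = 0; jt < n; jt += s)
            for (int it = 0; it < n; it += s)
                /* Point loops: iterate inside one tile. */
                for (int k = kt; k < min_int(kt + s, n); ++k)
                    for (int j = jt; j < min_int(jt + s, n); ++j)
                        for (int i = it; i < min_int(it + s, n); ++i)
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
}
```

Comparing the output of such a hand-written version against the original loop nest on a small input is a cheap way to check that a computed tiling transformation preserves the result.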

- Review the slides from the previous lecture.
- Review the slides from the tutorial on ISCC (by Sven Verdoolaege).
- Read the paper by Baskaran et al., "Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories".
- Read the paper by Alias et al., "Program Analysis and Source-Level Communication Optimizations for High-Level Synthesis".

*Note: students will present their solution for the exercises on the
computer during the class.*

Write the algorithm(s) to automatically create and place communications from main memory to a local buffer, for a program that is perfectly nested and contains only permutable loops. The program has to be tiled by your algorithm, and the communications must be at the granularity of a tile (that is, all elements required by a tile are copied before a tile executes).

Your algorithm must:

- Tile the original perfectly-nested loop nest (having `d` loops), using square tiles of size `Ts`
- Insert copy-in code for all read references
- Insert copy-out code for all written references
- Update the tile body to correctly index the local buffers
- Compute the buffer sizes

Apply the algorithm of the previous exercise to tile and compute communications for matrix-multiply.

```
for (i = 0; i < N; ++i)
  for (j = 0; j < N; ++j)
    for (k = 0; k < N; ++k)
      C[i][j] += A[i][k] * B[k][j];
```
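
For matrix-multiply, the listed steps can be written out by hand to see what the algorithm should produce. The sketch below is one possible shape of that output, not the algorithm itself. It assumes row-major flat arrays, one `Ts` x `Ts` local buffer per array, and that `C` is both copied in and copied out because `+=` reads and writes it. The names `matmul_tiled_buffered`, `lA`, `lB`, `lC` and `min_int` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Hypothetical tiled matrix-multiply with tile-granularity communications.
   Returns 0 on success, -1 if the local buffers cannot be allocated. */
int matmul_tiled_buffered(int n, int ts, double *C,
                          const double *A, const double *B)
{
    assert(n >= 0 && ts > 0);

    /* Buffer sizes: each array needs at most ts * ts elements per tile. */
    size_t tile = (size_t)ts * (size_t)ts;
    double *buf = malloc(3 * tile * sizeof *buf);
    if (!buf)
        return -1;
    double *lA = buf, *lB = buf + tile, *lC = buf + 2 * tile;

    for (int it = 0; it < n; it += ts)
        for (int jt = 0; jt < n; jt += ts)
            for (int kt = 0; kt < n; kt += ts) {
                int ni = min_int(ts, n - it);
                int nj = min_int(ts, n - jt);
                int nk = min_int(ts, n - kt);

                /* Copy-in: all elements read by this tile. */
                for (int i = 0; i < ni; ++i)
                    for (int k = 0; k < nk; ++k)
                        lA[i * ts + k] = A[(it + i) * n + kt + k];
                for (int k = 0; k < nk; ++k)
                    for (int j = 0; j < nj; ++j)
                        lB[k * ts + j] = B[(kt + k) * n + jt + j];
                for (int i = 0; i < ni; ++i)
                    for (int j = 0; j < nj; ++j)
                        lC[i * ts + j] = C[(it + i) * n + jt + j];

                /* Tile body, re-indexed to address the local buffers. */
                for (int i = 0; i < ni; ++i)
                    for (int j = 0; j < nj; ++j)
                        for (int k = 0; k < nk; ++k)
                            lC[i * ts + j] += lA[i * ts + k] * lB[k * ts + j];

                /* Copy-out: all elements written by this tile. */
                for (int i = 0; i < ni; ++i)
                    for (int j = 0; j < nj; ++j)
                        C[(it + i) * n + jt + j] = lC[i * ts + j];
            }

    free(buf);
    return 0;
}
```

The local index is the global index minus the tile origin, which is why the buffers never need more than `Ts` elements per dimension in this sketch.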

Apply the algorithm of the previous exercise to tile and compute
communications for jacobi-1d. A pre-transformation is required to make
tiling legal: `{ s1[t,i] -> [t,2t+i,0]; s2[t,i] -> [t, 2t+i+1, 1]}`.

```
for (t = 0; t < tsteps; t++)
{
  for (i = 1; i < n - 1; i++)
    B[i] = 0.33333 * (A[i-1] + A[i] + A[i+1]);
  for (i = 1; i < n - 1; i++)
    A[i] = B[i];
}
```
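
Before tiling, it can help to see what the pre-transformation alone does to the execution order. The sketch below runs the statement instances in the lexicographic order of the given schedule, where `c` stands for the second schedule dimension (`2t+i` for `s1`, `2t+i+1` for `s2`) and `s1` runs before `s2` when both map to the same `(t, c)`. It covers the pre-transformation only, with no tiling and no communications. The function names `jacobi1d_ref` and `jacobi1d_skewed` and the variable `c` are hypothetical.

```c
#include <assert.h>

/* Original jacobi-1d loop nest, kept as a reference. */
void jacobi1d_ref(int tsteps, int n, double *A, double *B)
{
    assert(tsteps >= 0 && n >= 0);
    for (int t = 0; t < tsteps; t++) {
        for (int i = 1; i < n - 1; i++)
            B[i] = 0.33333 * (A[i-1] + A[i] + A[i+1]);
        for (int i = 1; i < n - 1; i++)
            A[i] = B[i];
    }
}

/* Same computation, scanned in the order of the schedule
   s1[t,i] -> [t, 2t+i, 0]; s2[t,i] -> [t, 2t+i+1, 1]. */
void jacobi1d_skewed(int tsteps, int n, double *A, double *B)
{
    assert(tsteps >= 0 && n >= 0);
    for (int t = 0; t < tsteps; t++)
        for (int c = 2 * t + 1; c <= 2 * t + n - 1; c++) {
            int i1 = c - 2 * t;      /* s1 instance with 2t + i = c     */
            int i2 = c - 2 * t - 1;  /* s2 instance with 2t + i + 1 = c */

            /* Last schedule dimension 0: s1 runs first. */
            if (i1 >= 1 && i1 < n - 1)
                B[i1] = 0.33333 * (A[i1-1] + A[i1] + A[i1+1]);
            /* Last schedule dimension 1: s2 runs second. */
            if (i2 >= 1 && i2 < n - 1)
                A[i2] = B[i2];
        }
}
```

Running both functions on copies of the same input and comparing `A` and `B` element by element is a simple check that the reordering preserves the result.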