Tiling is an important program transformation used to improve data locality and parallelization granularity. We provide a tiled code generator that produces tiled code with support for wavefront parallelization on shared-memory machines.
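
To make the idea of tiling with wavefront parallelization concrete, here is a minimal hand-written sketch in C. It is not output of the generator. The grid computation, the names paths_untiled and paths_tiled, the size N, and the use of an OpenMP pragma are all assumptions chosen for illustration. Each cell depends on its upper and left neighbours, so tiles that share the same value of ti + tj are independent and can run in parallel.

```c
#include <assert.h>

#define N 8  /* illustrative problem size, not taken from the generator */

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Untiled reference: g[i][j] depends on g[i-1][j] and g[i][j-1]. */
void paths_untiled(long g[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            g[i][j] = (i == 0 || j == 0) ? 1 : g[i - 1][j] + g[i][j - 1];
}

/* Tiled version: square tiles of size ts, visited wavefront by wavefront. */
void paths_tiled(long g[N][N], int ts) {
    assert(ts > 0);
    int nt = (N + ts - 1) / ts;  /* number of tiles per dimension */
    for (int w = 0; w <= 2 * (nt - 1); w++) {
        /* Tiles (ti, tj) with ti + tj == w form one wavefront. */
        int lo = max_int(0, w - (nt - 1));
        int hi = min_int(w, nt - 1);
        /* Assumed OpenMP annotation; ignored when OpenMP is disabled. */
        #pragma omp parallel for
        for (int ti = lo; ti <= hi; ti++) {
            int tj = w - ti;
            /* Sequential sweep inside a tile, clipped at the grid edge. */
            for (int i = ti * ts; i < min_int((ti + 1) * ts, N); i++)
                for (int j = tj * ts; j < min_int((tj + 1) * ts, N); j++)
                    g[i][j] = (i == 0 || j == 0)
                                  ? 1
                                  : g[i - 1][j] + g[i][j - 1];
        }
    }
}
```

Both functions fill the grid with the same values; only the execution order differs. A tile reads data only from tiles on the previous wavefront, which is what makes the inner tile loop safe to parallelize.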

We use classic matrix multiplication as an illustrative example. Its Alphabets code is the following:

affine matrix_product {P, Q, R|P>0 && Q>0 && R>0}
given
    float A {i,k| 0<=i<P && 0<=k<Q};
    float B {k,j| 0<=k<Q && 0<=j<R};
returns
    float C {i,j,k| 0<=i<P && 0<=j<R && k==Q};
using
    float temp_C {i,j,k|0<=i<P && 0<=j<R && 0<=k<=Q};
through
    temp_C[i,j,k] = case
        {|k>0} : temp_C[i,j,k-1] + A[i,k-1]*B[k-1,j];
        {|k==0} : 0;
    esac;
    C = temp_C;
.
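
For readers more familiar with imperative code, the recurrence above can be read as the following C sketch. It is a hand-written reference, not code produced by the generator. The function name matrix_product_ref and the row-major flat-array layout are assumptions made for illustration. The accumulator plays the role of temp_C[i,j,k] along the k dimension.

```c
#include <assert.h>

/* A is P x Q, B is Q x R, C is P x R, all stored row-major (assumed layout). */
void matrix_product_ref(int P, int Q, int R,
                        const float *A, const float *B, float *C) {
    /* Mirrors the parameter domain {P, Q, R | P>0 && Q>0 && R>0}. */
    assert(P > 0 && Q > 0 && R > 0);
    for (int i = 0; i < P; i++) {
        for (int j = 0; j < R; j++) {
            float acc = 0.0f;  /* temp_C[i,j,0] = 0 */
            for (int k = 1; k <= Q; k++) {
                /* temp_C[i,j,k] = temp_C[i,j,k-1] + A[i,k-1] * B[k-1,j] */
                acc += A[i * Q + (k - 1)] * B[(k - 1) * R + j];
            }
            C[i * R + j] = acc;  /* C takes the value of temp_C at k == Q */
        }
    }
}
```

The k loop carries the only dependence, from k-1 to k, while the i and j iterations are independent of each other.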

The tiled code generator is invoked through ScheduledC, so the first step of code generation is the same as for ScheduledC: specify the space-time map and the memory map.