CS553                                    Colorado State University
===============================================
Presburger Transformation Framework
===============================================
11/10/09
--------

---------------------
Intro slide

The Kelly and Pugh transformation framework can represent unimodular
transformations.

The polyhedral model and the Presburger model are almost equivalent.
- Both are able to represent tiling with extensions.
- The Presburger model enables a scheduling relation versus a scheduling
  function, which enables mapping one computation to multiple processors.
- Research problem: How do we do code generation for something like that?

------------------
Loop Fusion Example (slides 2-5)

Work through the example to build intuition of what loop fusion is and
when it is legal.

Can we specify it as a unimodular transformation?
Can we represent the dependences with distance vectors?
If not, how does the compiler determine when it is legal?

-> write code on the board for use in slide 6

-------------------
Presburger/Kelly and Pugh Transformation Framework (slide 6)

Example

    for (i=1; i<=n; i++) {
      for (j=1; j<=n; j++) {
        A[i][j] = A[i-1][j-1] + sin(i+j);
      }
    }

Notice that the data dependence relation is just another way to specify
the data dependence problem. Determining if the data dependence mapping
is satisfiable determines whether a dependence exists or not.

If the difference between the source and target tuple can be represented
with a constant vector, then the dependence can be represented with the
simpler distance vector representation.

With unimodular transformations, the new iterators can only be expressed
as a unimodular matrix multiplied by the old iteration vector.

-------------------
Loop Fusion in Presburger/Kelly and Pugh Transformation Framework (slides 7-8)

How do we specify loop fusion?
- describe iteration space, data dependences, and transformation mapping

Checking legality of loop fusion.
-> Show that current dependences are lexicographically non-negative.
-> Apply the transformation to each of the dependences and show that they
   are still lexicographically non-negative.

Notice that we can also represent instruction scheduling in this
framework. [1] -> [3], etc.

Since the granularity is at the statement level, we can't really
represent communication requirements when parallelizing.

------------------------
Loop Fusion Example (cont) (slide 9)

Now let's do reversal on all the loops.

Reversal on a single loop

    i         i'
    N    ->   1
    N-1  ->   2
    ...
    2    ->   N-1
    1    ->   N

    i' = N-i+1

Reversal and fusion transformation specification for the three loops.

    T_1 = { [1,i_1,1] -> [1,i_1',1] | i_1' = n - i_1 + 1 }
    T_2 = { [2,i_2,1] -> [1,i_2',2] | i_2' = n - i_2 + 1 }
    T_3 = { [3,i_3,1] -> [1,i_3',3] | i_3' = n - i_3 + 1 }

Legality?

---------------------
Loop Fission (slides 12-15)

Split them into groups of two and three:
- specify the original iteration space in the K&P framework
- specify the data dependences
- write the transformation mapping for fission

-> As a class, check legality of the loop fission transformation

---------------------
Loop Unrolling (slides 16-20)

It is not clear how to represent loop unrolling in the transformation
frameworks. Would this work?

    {[i,0] -> [j,0] : 2j = i } union {[i,0] -> [j,1] : 2j + 1 = i }

How could we do code generation for it?

Another option is to specify strip-mining, which enables an unrolling
post pass after code generation.

--------------------
mstrout@cs.colostate.edu, 11/10/09
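
Appendix sketch: the legality check from slides 6-9 can be tried by brute
force on the slide 6 loop nest. The code below is only an illustration
under stated assumptions, not part of the K&P framework or of any tool:
it fixes a small n, enumerates the dependence relation
{[i,j] -> [i+1,j+1]} explicitly instead of reasoning symbolically, and
uses hypothetical helper names (dependences, distance_vectors, is_legal,
reverse_i, interchange). It also uses the strict test, requiring that
each transformed source comes strictly before its transformed target.

```python
from itertools import product


def dependences(n):
    """Flow dependences of A[i][j] = A[i-1][j-1] + ... as (source, target)
    iteration pairs, i.e. the relation {[i,j] -> [i+1,j+1]} with both
    iterations inside 1..n."""
    return [((i, j), (i + 1, j + 1))
            for i, j in product(range(1, n), repeat=2)]


def distance_vectors(deps):
    """Set of target-minus-source differences; a single element means the
    dependence can be summarized by one distance vector."""
    return {tuple(t - s for s, t in zip(src, dst)) for src, dst in deps}


def is_legal(transform, deps):
    """A transformation is legal here if every transformed source still
    executes strictly before its transformed target.
    Python compares tuples lexicographically."""
    return all(transform(src) < transform(dst) for src, dst in deps)


def reverse_i(n):
    """Reversal of the outer loop: i' = n - i + 1 (assumed mapping,
    mirroring the reversal formula from slide 9)."""
    return lambda p: (n - p[0] + 1, p[1])


def interchange(p):
    """Loop interchange: [i,j] -> [j,i]."""
    return (p[1], p[0])


n = 4  # small assumed problem size for the enumeration
deps = dependences(n)
print(distance_vectors(deps))          # {(1, 1)}
print(is_legal(lambda p: p, deps))     # True  (original order)
print(is_legal(interchange, deps))     # True
print(is_legal(reverse_i(n), deps))    # False (reversal flips the dependence)
```

Enumerating for one small n only suggests legality; a real check in the
framework is done symbolically for all n.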