tutorial_lud

**This is an old revision of the document!**

In this tutorial, we write an Alphabets (or Alpha, for now the two are synonymous) program, starting from a mathematical equation for LU decomposition. Then we will generate code to execute the alpha program, and test the generated code for correctness.

The equation for LU Decomposition, derived from first principles using simple algebra in Foundations (pg.3), is as follows: $$ U_{i,j}=\begin{cases} 1=i\le j & A_{i,j}\\ 1<i\le j & A_{i,j}-\sum_{k=1}^{i-1}L_{i,k}U_{k,j} \end{cases}\\ L_{i,j}=\begin{cases} 1 = i\le j & \frac{A_{i,j}}{U_{j,j}}\\ 1< i\le j & \frac{1}{U_{j,j}}(A_{i,j}-\sum_{k=1}^{j-1}L_{i,k}U_{k,j}) \end{cases} $$

[Temp note due to : in the last case of L, the condition is “1 < j ⇐ i”]

Let's start from an empty alpha file, with LUD as the name of the system, and a positive integer N as its parameter. A system (Affine System) takes its name from system of affine recurrence equations, and represents a block of computation. An Alpha program may contain multiple systems.

**Caveat:** Remember the phrase, “It's not a bug, it's a feature”? Well, in a tutorial, a feature is called a “learning opportunity.”

Parameters are runtime constants represented with some symbol in the code. In this example, parameter N will be used to define the size of the matrices, which is not known until runtime.

affine LUD {N|N>0} .

In most cases, a computation uses some inputs and produces outputs. Such variables must be declared with a name, a data type, and a shape/size. In Alpha, the shape/size is represented with polyhedral domains.
For this example, the `A`

matrix is given, and we are computing two triangular matrices `L`

and `U`

. `A`

is an NxN square matrix. The declaration for `A`

looks as follows:

float A {i,j|1<=(i,j)<=N}; //starting from 1 to be consistent with the equation in the notesSimilarly,

`L`

is a lower triangular matrix of size N (with unit diagonals, implicit) and `U`

is an upper triangular matrix of size N. The declarations should look like the following:
// The convention is that i is the vertical axis going down, and j is the horizontal axis float L {i,j|1<i<=N && 1<=j<i}; // Note that the diagonal elements of L are not explicitly declared float U {i,j|1<=j<=N && 1<=i<=j};Now these variable declarations need to be placed at appropriate places to specify whether they are input/output/local.

`input`

/`given`

is the keyword for input, `output`

/`returns`

is the keyword for output, and `local`

/`using`

is the keyword for local variables.
affine LUD {N|N>0} input float A {i,j|1<=(i,j)<=N}; output float L {i,j|1<=j<i<=N}; float U {i,j|1<=i<=j<=N}; .

Polyhedral domains are represented as { “index names” | “affine constraints using indices and parameters” }, where constraints can be intersected with `“&&”`

. Sometimes constraints can be expressed with short-hand notation like `“a<b<c”`

or `“(b,c)<0”`

.

Unions of such domains can be expressed as “{ a,b | constraints on a and b } || { c,d | constraints on c and d }”. One important point about Alphabets domains is that the names given to indices are only for textual representation. Internally, all analysis/transformation/code generation tools only care about which dimension the constraint applies to.
For example, a domain { i,j | 0⇐i<j<N } is equivalent to { x,y | 0⇐x<y<N }, because `i`

and `x`

are both names given to the first dimension, and `j`

and `y`

are names given to the second dimension.

Now the only remaining step before a complete Alphabets program is writing the equations. After a little experience, the connection from mathematical equations (of a certain form) to Alphabets equations should become increasingly clear. There are two slightly different syntactic conventions for writing equations, one is called the “Show syntax” and the other is called “AShow syntax”. Show syntax is closer to the internal representation of Alphabets programs, and is more expressive when writing complex programs. AShow syntax uses “array notation” so that it is easier for people used to imperative programs.

We will first write the equation for `U`

in AShow syntax, and then move on to Show as we write the equation for `L`

.

In this equation, `U`

is on the left hand side, and the right hand side should define `U`

for each point in the declared domain of the `U`

variable.
In AShow syntax, the names for indices used appear on the LHS of the equations.

For this example, the following LHS for `U`

gives `i, j`

as the names for the first and second dimensions to be used when writing the expressions in the RHS.
These names do not have to match the names used in variable declaration. You could use `x,y`

instead of `i,j`

if desired.

U[i,j] = RHSexpr;

The first thing you notice in the definition of `U`

in the mathematical equation is the branch based on values of `i`

and `j`

. This branching is expressed with CaseExpression in Alphabets.
A CaseExpression starts with the keyword “`case`

”, ends with keyword “`esac`

”, and has list of “`;`

”-delimited expressions, called “clauses” as its children.
Often (but not always), each child of a case is a RestrictExpression (whose syntax is “domain : expr”), which restricts the domain to the specified domain.

Using the above expressions, the branching of the definition of `U`

is as follows :

U[i,j] = case {|1==i} : expr1; {|1<i} : expr2; esac;Note that because index names are already declared in the context (equation LHS), there is nothing to the left of the

`|`

in the AShow syntax.
Moving on to the definitions in each case, the first case is . This is written as `A[i,j]`

in AShow syntax, similar to accessing an array. A variable without a square bracket, is treated either as a scalar variable (as in, `X[i,j] = 0`

) or as an access with the identity dependence function, (i.e., `X[i] = A[i]`

would be the same as `X[i] = A`

).

The last piece missing before completing the definition of `U`

is the summation in the second branch.
Mathematically, a reduction is an operation that applies an associative-comutative operator (in general, the operator may only be associative, but In Alphabets, we have only associative-comutative operators) to a set of values, such as summation (sum over a set of numbers).

Reductions are expressed with the following syntax :

reduce(operator, projection, expr);operator: operator to be applied (+, *, max, min, and, or)

expr: Any Alphabets expression. “Slices” of the result of evaluating this expression are combined using the reduction operator.

projection: The projection is an affine function with the same syntax as a dependence (as explained below). By its very nature, a reduction

In the mathematical equation, summation with one new index `k`

is used. For each value of `k,`

the expression `L[i,k]*U[k,j]`

is computed and added up to produce the result `U[i,j]`

. Thus, the projection function is `(i,j,k → i,j)`

. (from the three dimensional space indexed by `i,j,k`

, all values computed at `[i,j,k]`

are used to compute `U[i,j]`

in the two dimensional space indexed by `i,j`

– i.e., the `k`

is 'projected out')

When the projection function is canonic (e.g., `(i,j,k→i,j))`

, then the projection function can be replaced with a simpler syntax (AShow syntax for reductions) that specifies the names of new dimensions surrounded by square brackets.
For example, the projection `(i,j,x,y→i,j)`

can be expressed as `[x,y]`

.

Using the above, summation in the original equation can be written as the following Alphabets fragment.

reduce(+, [k], L[i,k]*U[k,j]);

Putting all this together, the final equation for `U`

is:

U[i,j] = case {|1==i} : A[i,j]; {|1<i} : A[i,j] - reduce(+, [k], L[i,k]*U[k,j]); esac;

This is exactly like the original equation Caveat

Now we will write the equation for `L`

, but this time in Show syntax. Unlike the AShow syntax, Show syntax does not rely on the context for naming of indices. Index names can be different in every (sub)expression if it makes sense to do so.
Because of this, the LHS does not have square brackets, all we need is the variable name.

L = RHSexpr; //Show syntax

CaseExpression and RestrictExpression are same as AShow syntax. However, since index names are no longer deduced from the context where they occur, they must be explicitly named everywhere. While this may seem cumbersome, it allows expressions to have compositional semantics. In our example, the index names used in the domain of RestrictExpression have to be made explicit.
The branch in the definition of `L`

becomes the following Alphabets :

L = case {i,j|1==j} : expr1; {i,j|1<i} : expr2; esac;

In the array notation in AShow syntax, a DependenceExpression was implicit: just add expressions within square brackets to access variables). In the Show syntax `DependenceExpression`

is used to explicitly specify which value of a variable is required for a computation. The syntax of `DependenceExpression`

is “(affine_function)@expr”, where `affine_function`

is of the form `(list_of_indices → list_of_affine_expressions)`

. For example, the dependence `(i,j→i-1,i+j)@A`

means that at index point `(i,j)`

this computation evaluates to the value of `A`

at index point `(i-1,i+j)`

.

The child of `DependenceExpression`

can be any Alphabets expression, possibly another DependenceExpression. For example, `(i,j→i,j,i+j,0)@(a,b,c,d→a,c-a)@A`

is a perfectly legal Alphabets expression.

Reductions in Show syntax are exactly like in the AShow syntax, except that the projection function is specified in the dependence syntax. This is all you need in order to write the rest in of the equation in Show syntax.

L = case {i,j|1==j} : (A / (i,j->j,j)@U); {i,j|1<i} : (A - reduce(+, (i,j,k->i,j), (i,j,k->i,k)@L*(i,j,k->k,j)@U))/(i,j->j,j)@U; esac;

Combine all of the above, and you will get the Alphabets program for LU decomposition. Don't forget the keyword `let`

/`through`

before equations the period at the end (since our example has no local variables). Notice how we can mix and match Show and AShow syntax within the program, but each equation must obviously, be consistent.

affine LUD {N|N>0} input float A {i,j|1<=(i,j)<=N}; output float L {i,j|1<i<=N && 1<=j<i}; float U {i,j|1<=j<=N && 1<=i<=j}; let U[i,j] = case {|1==i} : A[i,j]; {|1<i} : A[i,j] - reduce(+, [k], L[i,k]*U[k,j]); esac; L = case {i,j|1==j} : A / (i,j->j,j)@U; {i,j|1<j} : (A - reduce(+, (i,j,k->i,j), (i,j,k->i,k)@L*(i,j,k->k,j)@U))/(i,j->j,j)@U; esac; .

Analyses, transformations, and code generation of Alphabets programs are performed using the AlphaZ system. The normal interface for using AlphaZ is the scripting interface called compiler scripts. Given below is an example script for that does several things using the LUD program we wrote above.

# read program and store the internal representation in variable prog prog = ReadAlphabets("./LUD.ab"); # store string (corresponding to system name) to variable system system = "LUD"; # store output directory name to variable outDir outDir = "./test-out/"+system; # print out the program using Show syntax Show(prog); # print out the program using AShow syntax AShow(prog); # prints out the AST of the program (commented out) #PrintAST(prog); # generate codes (this is demand-driven, memoized code) generateWriteC(prog, system, outDir); generateWrapper(prog, system, outDir); generateMakefile(prog, system, outDir);Save this script with .cs extension, place the alphabets file in the same directory as the script, and then right click on the editor and select

`“Run As → Compiler Script”`

to run the script.
If you get some error message, try looking at the first line of the error messages to find out what it is about. Common problems are:

- not having the file in the right place (you will see
`FileNotFoundException`

in this case) - system name does not match the any of the system in the program (error message should say system
`xxx`

does not exist)

In this tutorial, we use two basic code generators, without going into too much detail. The two types of codes generated are `WriteC`

and `Wrapper`

.

`WriteC`

code may not be efficient, but it can be generated without any additional specification beyond the program.

`Wrapper`

code is a wrapper around other generated codes that allocates/frees memory for input and output variables, and it also have different options for testing.

Note:Current implementation of the Wrapper prints out the bounding box of the domain of the output variable.

generateMakefile produces a Makefile that should compile the generated codes. You can make with different options.

Congratulations!! You are nearly at the end. Now, you will actually make and execute the code (in a separate terminal window).

compiles the code and produces an executable `xxx`

(where `xxx`

is the system name) that executes the program with default input that is 1 everywhere. Compiling with this option does not test very much, but it will test if it compiles and runs and produces no errors.

Compiles the code and produces an executable `xxx.check`

(`xxx`

is the system name) that prompts the user for all values of input variables.
After executing, it prints out all values of the output variables.
This option should be used for testing small to mid-sized input data.

Compiles the code with another code named `xxx_verify.c`

that defines a function `xxx_verify`

(`xxx`

is the system name).
Users can provide different program as `xxx_verify`

to compare outputs.

Same as verify, except the inputs are generated randomly.

You will see that when you execute the code, ** it will produce an error**. You may be able to easily fix the error in your Alpha program and regenerate correctly executing C code, or you may want a bit of help. In either case, we would like to know. Please email Sanjay.Rajopadhye@colostate.edu with the error message that is produced.

tutorial_lud.1529444741.txt.gz · Last modified: 2018/06/19 15:45 by sanjay