Automatic Compilation for Energy Efficient Iterative Stencil Computations

Stencil Computation (SC) is an important class of programs that occurs in a variety of scientific applications, including Partial Differential Equation Solvers, Computational Fluid Dynamics, and Computational Electromagnetic applications that use the Finite Difference Time Domain method.

In this project, we develop a code generator for SCs that targets energy efficiency on a single multi-core processor. We target two main components of energy consumption: static energy and main memory system energy. Static energy consumption is proportional to execution time, and main memory energy consumption is proportional to the number of main memory accesses. Our strategy therefore seeks to minimize total energy consumption by first optimizing performance, which minimizes static energy, and then reducing memory accesses without sacrificing that performance.
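The two proportionality relations above can be written as a simple energy model. This is only a sketch: the symbols $P_{\text{static}}$ (static power), $T_{\text{exec}}$ (execution time), $e_{\text{mem}}$ (energy per main memory access), and $N_{\text{mem}}$ (number of main memory accesses) are names assumed here for illustration, and other energy components are deliberately left out.

```latex
E_{\text{target}} \;=\; E_{\text{static}} + E_{\text{mem}},
\qquad
E_{\text{static}} \;\approx\; P_{\text{static}} \cdot T_{\text{exec}},
\qquad
E_{\text{mem}} \;\approx\; e_{\text{mem}} \cdot N_{\text{mem}}
```

Under this model, reducing $T_{\text{exec}}$ lowers the first term, and reducing $N_{\text{mem}}$ at a fixed $T_{\text{exec}}$ lowers the second, which matches the two-step strategy described above.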

Our code generator takes a polyhedral specification of an SC and generates the corresponding C+OpenMP program. It integrates many performance optimizations, including tiling, vectorization, temporary buffering, and register reuse. A flattened multi-pass parallelization strategy is then applied to reduce off-chip memory accesses and thereby the overall energy consumption.
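To give a feel for the kind of C+OpenMP code involved, here is a minimal hand-written sketch of a 3-point Jacobi stencil with spatial tiling and a vectorizable inner loop. It is not output of the code generator: the function names, the tile width `TILE`, and the 3-point averaging stencil are all assumptions chosen for illustration, and the sketch does not attempt time tiling, temporary buffering, or the multi-pass strategy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile width; a real generator would tune this value. */
enum { TILE = 64 };

/* One sweep of a 3-point stencil: out[i] = (in[i-1] + in[i] + in[i+1]) / 3.
 * The two boundary cells are copied unchanged. */
void stencil_step(const double *in, double *out, size_t n)
{
    assert(n >= 2);
    out[0] = in[0];
    out[n - 1] = in[n - 1];

    /* Tiles within one time step are independent, so they run in parallel. */
    #pragma omp parallel for schedule(static)
    for (long t = 1; t < (long)n - 1; t += TILE) {
        long end = t + TILE < (long)n - 1 ? t + TILE : (long)n - 1;
        /* Unit-stride inner loop that the compiler can vectorize. */
        #pragma omp simd
        for (long i = t; i < end; i++)
            out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
    }
}

/* Run `steps` sweeps, alternating between two buffers.
 * Returns the buffer that holds the final values. */
double *stencil_run(double *a, double *b, size_t n, int steps)
{
    for (int s = 0; s < steps; s++) {
        stencil_step(a, b, n);
        double *tmp = a;
        a = b;
        b = tmp;
    }
    return a;
}
```

Compiled without OpenMP support, the pragmas are ignored and the code runs sequentially with the same results.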

Automatic Detection and Parallelization for Scans and Reductions

A scan (prefix computation) is a fundamental building block for many algorithms, including sorting and computational geometry algorithms. A scan takes a binary associative operator and a sequence of expressions, accumulates the expressions one after another, and outputs all of the intermediate values.
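The definition above can be made concrete with a small sequential sketch. The names `inclusive_scan`, `binop`, `op_add`, and `op_max` are hypothetical and chosen only for this example; any binary associative operator could be passed in.

```c
#include <assert.h>
#include <stddef.h>

/* A binary associative operator on long values. */
typedef long (*binop)(long, long);

long op_add(long x, long y) { return x + y; }
long op_max(long x, long y) { return x > y ? x : y; }

/* Inclusive scan: out[i] = in[0] op in[1] op ... op in[i]. */
void inclusive_scan(const long *in, long *out, size_t n, binop op)
{
    if (n == 0)
        return;
    assert(in != NULL && out != NULL);
    out[0] = in[0];
    for (size_t i = 1; i < n; i++)
        out[i] = op(out[i - 1], in[i]);  /* keep every intermediate value */
}
```

For the input 3, 1, 4, 1, 5, scanning with addition yields 3, 4, 8, 9, 14 and scanning with max yields 3, 3, 4, 4, 5.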

We developed a compiler technique that automatically detects prefix computations and generates parallelized code for them. Our scan detection technique is based on extracting matrix-vector multiplication forms over semirings. Our code generator supports shared-memory parallelization, and the generated code demonstrated linear scalability on multicore machines.
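As an illustration of why a matrix-vector form exposes a scan, consider the linear recurrence x[i] = a[i]*x[i-1] + b[i]. Each step multiplies the vector (x, 1) by the matrix [[a, b], [0, 1]], and because matrix multiplication is associative, the steps can be grouped into blocks and combined in parallel. The sketch below is a hand-written example under these assumptions, not the detection algorithm or the generated code; `affine`, `compose`, `recurrence_scan`, and `MAX_BLOCKS` are hypothetical names, and ordinary integer arithmetic stands in for a general semiring.

```c
#include <assert.h>
#include <stddef.h>

/* Affine map x -> a*x + b, i.e. the matrix [[a, b], [0, 1]] acting on (x, 1). */
typedef struct { long a, b; } affine;

/* "Apply f, then g": the matrix product G*F. This operation is associative. */
affine compose(affine f, affine g)
{
    affine r = { g.a * f.a, g.a * f.b + g.b };
    return r;
}

enum { MAX_BLOCKS = 64 };  /* hypothetical upper bound on the block count */

/* Computes x[i] = a[i]*x[i-1] + b[i] for i = 0..n-1, where x[-1] is x0. */
void recurrence_scan(const long *a, const long *b, long *x,
                     size_t n, long x0, int nblocks)
{
    assert(nblocks >= 1 && nblocks <= MAX_BLOCKS);
    affine summary[MAX_BLOCKS];
    long start[MAX_BLOCKS];
    size_t len = (n + (size_t)nblocks - 1) / (size_t)nblocks;

    /* Pass 1: each block reduces its maps to a single composite map. */
    #pragma omp parallel for
    for (int k = 0; k < nblocks; k++) {
        affine acc = { 1, 0 };  /* identity map */
        size_t lo = (size_t)k * len;
        size_t hi = lo + len < n ? lo + len : n;
        for (size_t i = lo; i < hi; i++) {
            affine m = { a[i], b[i] };
            acc = compose(acc, m);
        }
        summary[k] = acc;
    }

    /* Short sequential pass over the block summaries:
     * compute the value that enters each block. */
    long v = x0;
    for (int k = 0; k < nblocks; k++) {
        start[k] = v;
        v = summary[k].a * v + summary[k].b;
    }

    /* Pass 2: each block replays the recurrence from its own start value. */
    #pragma omp parallel for
    for (int k = 0; k < nblocks; k++) {
        long cur = start[k];
        size_t lo = (size_t)k * len;
        size_t hi = lo + len < n ? lo + len : n;
        for (size_t i = lo; i < hi; i++) {
            cur = a[i] * cur + b[i];
            x[i] = cur;
        }
    }
}
```

Setting every a[i] to 1 turns the recurrence into an ordinary prefix sum of b, so the same two-pass structure covers that case as well.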

Compilation for Hierarchical Polyhedral Programs

Modularity in the polyhedral model was introduced by Dupont de Dinechin in 1995. However, the code generation problem for the modular polyhedral model has not yet been addressed. We developed a code generator for structured polyhedral programs that generates structured parallel C programs for multicore architectures. Several optimization options are supported to generate efficient code.


Our research group develops and maintains AlphaZ, a program transformation and code generation framework for polyhedral programs. The system takes a polyhedral specification of a program (an Alphabets program) and generates sequential and parallel (OpenMP and MPI) C programs from it. The implementations of the above projects are all integrated into this system.

AlphaZ is open-source software developed in Java; it depends on Eclipse and Model Driven Engineering (MDE) plugins. The system is currently supported only on Linux and macOS, on both 32-bit and 64-bit machines.

Useful Links

Intel Intrinsics Guide
Performance Application Programming Interface (PAPI)
Polyhedral Code Generators: PLUTO, Pochoir