Overview of the SA-C Compiler

Usage

The SA-C compiler provides one-step compilation from SA-C programs to adaptive computer systems. Our general model is of a programmer at a workstation, with access to an FPGA-based adaptive coprocessor. Programmers write programs for the coprocessor in SA-C, and debug and test them on their workstation. When they are ready, the SA-C compiler translates the source code from SA-C to an optimized FPGA configuration (or configurations). The compiler also generates the host code to download the configuration(s) onto the adaptive coprocessor, download the data and any run-time parameters, start the coprocessor, and upload the resulting data. As a result, programmers can treat the adaptive coprocessor like any other target machine, without knowing about circuit design or the intricacies of FPGAs.

A second kind of user knows a great deal about FPGAs and wants to control how a program is mapped onto a configuration. For these users, the compiler can write its intermediate data flow graph representations to files, which can be viewed using the tools provided. Simulators (also provided) can simulate the execution of the graphs on data. Finally, the SA-C pragma mechanism allows experienced users to turn specific optimizations on or off, in order to generate a specific style of circuit.

More on SA-C Pragmas...


Data Flow Graphs

Internally, the SA-C optimizing compiler uses several intermediate representations to bridge the gap between high-level algorithmic programs and FPGA configurations. After parsing the SA-C code, it transforms the program into a data flow graph, which can be viewed as an abstract circuit representation (without clock signals or timing information). In particular, the nodes of a data flow graph are simple operations, such as addition, bit shift, or a memory read/write. The edges in the graph correspond to variables in the SA-C program, or wires in the eventual circuit. The data flow graph is therefore a convenient half-way point between source code and a circuit description. It is also an excellent representation for compiler optimizations, since all data dependencies are explicit and there are no side-effects (other than I/O).
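The idea of operations as nodes and variables as edges can be sketched in a few lines of Python. This is only an illustration, not the compiler's internal format: the Node class, the operation names, and the evaluate function are all assumptions made for the example. Each entry in the graph is keyed by the edge (variable) it produces.

```python
import operator
from dataclasses import dataclass

# Hypothetical operation set; the real compiler's node types are not shown here.
OPS = {"add": operator.add, "shl": operator.lshift, "mul": operator.mul}

@dataclass(frozen=True)
class Node:
    op: str             # "input", "const", or a key of OPS
    inputs: tuple = ()  # names of the edges feeding this node
    value: int = 0      # only used by "const" nodes

def evaluate(graph, edge, env, cache=None):
    """Return the value carried on `edge`, given input values in `env`."""
    if cache is None:
        cache = {}
    if edge in cache:
        return cache[edge]
    node = graph[edge]
    if node.op == "input":
        result = env[edge]
    elif node.op == "const":
        result = node.value
    else:
        args = [evaluate(graph, e, env, cache) for e in node.inputs]
        result = OPS[node.op](*args)
    cache[edge] = result
    return result

# Graph for y = (a + b) << 1. Every data dependency is an explicit edge.
GRAPH = {
    "a": Node("input"),
    "b": Node("input"),
    "one": Node("const", value=1),
    "sum": Node("add", ("a", "b")),
    "y": Node("shl", ("sum", "one")),
}

print(evaluate(GRAPH, "y", {"a": 3, "b": 4}))
```

Because no node has side effects, the value on an edge depends only on the edges upstream of it, which is what makes the representation convenient for optimization.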

Once the data flow graph is optimized, it is translated into VHDL. The SA-C compiler has a library of parameterized VHDL components that implement the nodes in the data flow graph. VHDL is generated by instantiating a component for every node in the data flow graph, and connecting the signals according to the edges. A commercial VHDL compiler is then used to translate the VHDL circuit description into an FPGA configuration. (Currently, we use Synplicity followed by the Xilinx Foundation Tools.)
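The "one component per node, one signal per edge" scheme might look roughly like the following sketch. The component names (sac_add, sac_shl), the port names (in0, in1, result), and the WIDTH generic are hypothetical placeholders, not the names used in the SA-C component library; a real generator would also treat graph inputs and constants differently from internal signals.

```python
def emit_vhdl(nodes, width=8):
    """Return (signal declarations, component instances) as lists of strings.

    nodes: list of (output_edge, component_name, input_edges) tuples.
    Component and port names are illustrative assumptions only.
    """
    # Collect every edge once, in first-use order: one signal per edge.
    edges = []
    for out, _, ins in nodes:
        for e in (*ins, out):
            if e not in edges:
                edges.append(e)
    decls = [
        f"signal sig_{e} : std_logic_vector({width - 1} downto 0);"
        for e in edges
    ]
    # One component instance per node, wired according to the edges.
    insts = []
    for out, comp, ins in nodes:
        ports = [f"in{i} => sig_{e}" for i, e in enumerate(ins)]
        ports.append(f"result => sig_{out}")
        insts.append(
            f"u_{out}: entity work.{comp} "
            f"generic map (WIDTH => {width}) "
            f"port map ({', '.join(ports)});"
        )
    return decls, insts

# Nodes for y = (a + b) << one.
NODES = [("sum", "sac_add", ("a", "b")), ("y", "sac_shl", ("sum", "one"))]
decls, insts = emit_vhdl(NODES)
print("\n".join(decls + insts))
```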

The compilation process is actually more involved than the preceding paragraphs might indicate. When SA-C source programs are initially translated into data flow graphs, they are translated into a type of graph called the data dependence and control flow (DDCF) graph. This graph has nodes corresponding to complex operators, such as sliding a window of data across an image. As this graph is optimized, the complex nodes are replaced by subgraphs of simpler nodes, until a more traditional data flow graph emerges. Traditional data flow graphs, however, do not have nodes with internal state, such as registers. Therefore yet another round of optimization and translation transforms the data flow graph into an abstract hardware architecture graph, which is a data flow graph with stateful nodes (mostly registers) and handshaking signals. The abstract hardware graph is then optimized one last time before VHDL is generated.
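As a rough illustration of replacing a complex node with a subgraph of simpler nodes, the sketch below lowers a made-up "window_sum" operator into individual reads and two-input adds. The operator name, the tuple-based graph encoding, and the naming of the generated edges are all assumptions for this example; the actual DDCF node types and lowering rules are not described here.

```python
def lower(graph):
    """Replace each hypothetical 'window_sum' node with reads and adds.

    graph: dict mapping edge name -> (op, args).
    A 'window_sum' node has args (source_edge, window_size).
    """
    lowered = {}
    for name, (op, args) in graph.items():
        if op != "window_sum":
            lowered[name] = (op, args)
            continue
        source, size = args
        if size < 2:
            raise ValueError("this sketch assumes a window of at least 2")
        # One simple read node per window position.
        taps = []
        for i in range(size):
            tap = f"{name}_tap{i}"
            lowered[tap] = ("read", (source, i))
            taps.append(tap)
        # Chain of two-input adds; the last add keeps the original edge name
        # so that downstream nodes need no rewiring.
        acc = taps[0]
        for i, tap in enumerate(taps[1:], start=1):
            out = name if i == size - 1 else f"{name}_acc{i}"
            lowered[out] = ("add", (acc, tap))
            acc = out
    return lowered

DDCF = {"img": ("input", ()), "s": ("window_sum", ("img", 3))}
SIMPLE = lower(DDCF)
for edge, node in SIMPLE.items():
    print(edge, node)
```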

More on Data Flow Graphs...


Optimizations

The SA-C compiler applies its optimizations to the data flow graph representation of a program (where the "data flow graph" refers to everything from the DDCF graph to the abstract hardware graph). Many of the optimizations are traditional optimizations that can be found in almost any compiler. They include constant propagation/folding, common subexpression elimination, loop unrolling, and converting functions to lookup tables. Other optimizations are unique to the task of generating circuits from algorithms. They include pipelining, temporal common subexpression elimination, and window narrowing. In addition, the Synplicity compiler that translates VHDL to FPGA configurations does its own optimization pass.
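Two of the traditional optimizations named above, constant folding and common subexpression elimination, are easy to express on a data flow graph because every dependency is explicit. The following is a minimal sketch under assumed conventions (a dict of edge name to (op, args), listed in dependency order), not the SA-C compiler's implementation.

```python
import operator

# Hypothetical set of operations that can be evaluated at compile time.
FOLDABLE = {"add": operator.add, "mul": operator.mul}

def optimize(graph):
    """Fold constant expressions and merge duplicate nodes.

    graph: dict edge name -> (op, args), in dependency order.
    Constants are ("const", (value,)); inputs are ("input", ()).
    Returns (new_graph, alias), where alias maps each removed edge
    to the surviving edge that carries the same value.
    """
    out, alias, seen = {}, {}, {}
    for name, (op, args) in graph.items():
        if op == "input":
            out[name] = (op, args)  # distinct inputs are never merged
            continue
        if op != "const":
            # Redirect uses of eliminated edges to their survivors.
            args = tuple(alias.get(a, a) for a in args)
            # Constant folding: every operand is a known constant.
            if op in FOLDABLE and all(out[a][0] == "const" for a in args):
                value = FOLDABLE[op](*(out[a][1][0] for a in args))
                op, args = "const", (value,)
        # Common subexpression elimination: same op on the same edges.
        key = (op, args)
        if key in seen:
            alias[name] = seen[key]
        else:
            seen[key] = name
            out[name] = (op, args)
    return out, alias

G = {
    "x": ("input", ()),
    "two": ("const", (2,)),
    "three": ("const", (3,)),
    "six": ("mul", ("two", "three")),  # foldable
    "p": ("add", ("x", "six")),
    "q": ("add", ("x", "six")),        # same computation as p
    "r": ("mul", ("p", "q")),
}
OPT, ALIAS = optimize(G)
for edge, node in OPT.items():
    print(edge, node)
```

This sketch leaves the now-unused constants "two" and "three" in the graph; removing them would be a separate dead-code elimination pass.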

More on Optimizations...