Data Flow Computational Models

A brief overview and history

A dataflow program is a graph in which nodes represent operations and edges represent data paths. Dataflow is a distributed model of computation: there is no single locus of control. It is asynchronous: execution ("firing") of a node starts when (matching) data is available at the node's input ports. In the original dataflow models, data tokens are consumed when the node fires. Some models were extended with "sticky" tokens: tokens that remain in place, much like a constant input, and match with tokens arriving on the other inputs. Nodes can vary in granularity, from individual instructions to entire functions.
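
To make the firing rule concrete, here is a minimal Python sketch (an illustrative assumption, not any of the specific models discussed in this section): a node fires as soon as a token is available on every input; firing consumes those tokens and emits a result token on the output edge.

    from collections import deque

    class Edge:
        """Data path between nodes; holds a FIFO of data tokens."""
        def __init__(self):
            self.tokens = deque()

    class Node:
        """An operation that fires once a token is present on every input."""
        def __init__(self, op, inputs, output):
            self.op = op            # function applied when the node fires
            self.inputs = inputs    # incoming Edges
            self.output = output    # outgoing Edge

        def ready(self):
            return all(e.tokens for e in self.inputs)

        def fire(self):
            # Firing consumes one token per input and emits one result token.
            args = [e.tokens.popleft() for e in self.inputs]
            self.output.tokens.append(self.op(*args))

    def run(nodes):
        # No single locus of control: keep firing whichever nodes are ready.
        progress = True
        while progress:
            progress = False
            for n in nodes:
                if n.ready():
                    n.fire()
                    progress = True

    # Dataflow graph for out = (a + b) * c
    a, b, c, t, out = Edge(), Edge(), Edge(), Edge(), Edge()
    nodes = [Node(lambda x, y: x + y, [a, b], t),
             Node(lambda x, y: x * y, [t, c], out)]
    a.tokens.append(2); b.tokens.append(3); c.tokens.append(4)
    run(nodes)
    print(out.tokens)    # deque([20])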

The history of dataflow goes back to the sixties: Petri (1962) invented Petri nets, and Estrin and Turn (1963) proposed an early dataflow model. Karp and Miller (1966) studied computation graphs without branches or merges, and Rodriguez (1969) extended and formalized Estrin's model. Chamberlin (1971) proposed a single assignment language for dataflow. Kahn (1974) studied a simple parallel process language with queues as edges. Dennis (1974) designed a dataflow architecture whose edges were one-token buffers. Arvind and Gostelow, and separately Gurd and Watson (1977), proposed a "tagged token" dataflow model, in which the edges behave like bags.

In the Cameron project, dataflow graphs are used as an intermediate representation between the algorithmic SA-C programming language and circuit-level FPGA configurations. Single assignment semantics allow us to map SA-C variables to edges in a dataflow graph, while primitive operations in SA-C map to nodes. The dataflow graph makes data dependencies explicit and is a convenient representation for many compiler optimizations. The dataflow graphs are then translated into VHDL circuits by mapping edges onto wires and nodes onto VHDL components.
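
The following Python fragment sketches that mapping under assumed names; it is not the actual Cameron compiler or its component library. Each single-assignment variable becomes a wire (signal) and each primitive operation becomes a component instantiation in a structural-VHDL-style netlist; the component and port names (add_unit, mul_unit, in0, result) are hypothetical.

    # Tiny single-assignment program: out = (a + b) * c
    program = {
        "t":   ("add_unit", ["a", "b"]),
        "out": ("mul_unit", ["t", "c"]),
    }

    # Edges -> wires: one signal per intermediate variable ("out" is assumed
    # to be an entity port rather than an internal signal).
    print("architecture sketch of top is")
    for var in program:
        if var != "out":
            print(f"  signal {var} : std_logic_vector(31 downto 0);")
    print("begin")

    # Nodes -> components: one instantiation per primitive operation.
    for i, (result, (component, args)) in enumerate(program.items()):
        ports = ", ".join(f"in{j} => {arg}" for j, arg in enumerate(args))
        print(f"  u{i}: {component} port map ({ports}, result => {result});")
    print("end architecture;")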