Top Up Prev Next Bottom Contents Index Search

9.2 Process networks


Kahn describes a model of computation where processes are connected by communication channels to form a network [Kah74,Kah77]. Processes produce data elements or tokens and send them along a unidirectional communication channel where they are stored in a first-in first-out order until the destination process consumes them. Communication channels are the only method processes may use to exchange information. A set of processes that communicate through a network of first-in first-out queues defines a program.

Kahn requires that execution of a process be suspended when it attempts to get data from an empty input channel. A process may not, for example, examine an input to test for the presence or absence of data. At any given point, a process is either enabled or it is blocked waiting for data on only one of its input channels: it cannot wait for data from one channel or another. Systems that obey Kahn's model are determinate: the history of tokens produced on the communication channels do not depend on the execution order [Kah74]. Therefore, we can apply different scheduling algorithms without affecting the results produced by executing a program.

9.2.1 Dataflow process networks

Dataflow is a model of computation that is a special case of process networks. Instead of using the blocking read semantics of Kahn process networks, dataflow actors have firing rules. These firing rules specify what tokens must be available at the inputs for the actor to fire. When an actor fires, it consumes some finite number of input tokens and produces some finite number of output tokens. For example, when applied to an infinite input stream a firing function f may consume just one token and produce one output token:

To produce an infinite output stream, the actor must be fired repeatedly. A process formed from repeated firings of a dataflow actor is called a dataflow process [Lee95]. The higher-order function map converts an actor firing function f into a process:

A higher-order function takes a function as an argument and returns another function. When the function returned by

is applied to the input stream
the result is a stream in which the firing function f is applied point-wise to each element of the input stream. The map function can also be described recursively using the stream-building function cons, which inserts an element at the head of a stream:

The use of map can be generalized so that f can consume and produce multiple tokens on multiple streams [Lee95].

Breaking a process down into smaller units of execution, such as dataflow actor firings, makes efficient implementations of process networks possible. The SDF, BDF, and DDF domains implement dataflow process networks by scheduling the firings of dataflow actors. The actor firings of one dataflow process are interleaved with the firings of other processes in a sequence that guarantees the availability of tokens required for each firing. In the PN domain, a dataflow process is created for each dataflow actor. A separate thread of execution is created for each process, and the interleaving of threads is performed automatically. Unlike the dataflow domains, the firing of a dataflow actor is not an atomic operation in the PN domain. Because the scheduler does not guarantee the availability of tokens, the firing of an actor can be suspended if it attempts to read data from an empty input channel.

9.2.2 Scheduling dataflow process networks

Because Kahn process networks are determinate, the results produced by executing a program are unaffected by the order in which operations are carried out. In particular, deadlock is a property of the program itself and does not depend on the details of scheduling. Buffer sizes for the communication channels, on the other hand, do depend on the order in which read and write operations are carried out.

For Kahn process networks, no finite-time algorithm can decide whether or not a program will terminate or require bounded buffering. Since we are interested in programs that will never terminate, a scheduler has infinite time to decide these questions. Parks [Par95] developed a scheduling policy that will execute arbitrary Kahn process networks forever with bounded buffering when possible. To enforce bounded buffering, Parks limits channel capacities, which places additional restrictions on the order of read and write operations. Parks reduces the set of possible execution orders to those where the buffer sizes never exceed the capacity limits. In this approach, execution of the entire program comes to a stop each time we encounter artificial deadlock, which can severely limit parallelism. Artificial deadlock occurs when the capacity limits are set too low, causing some processes to block when writing to a full channel. All scheduling decisions are made dynamically during execution.

9.2.3 Iterations in the PN domain

In a complete execution of a program, the program terminates if and only if all processes block attempting to consume data from empty communication channels. Often, it is desirable to have a partial execution of a process network. An iteration in the PN domain is defined such that no actor in a dataflow process will fire more than once. Some actors may not fire, or may fire partially in an iteration if insufficient tokens are available on their inputs.



Top Up Prev Next Bottom Contents Index Search

Copyright © 1990-1997, University of California. All rights reserved.