Members that are meaningful only in the CG domain are split between the MultiTarget class and the CGMultiTarget class. Members that are accessed from the parallel scheduler are placed in the MultiTarget class; the others are placed in the CGMultiTarget class (note that this is purely an organizational matter). Refer to the CGMultiTarget class for detailed descriptions.
MultiTarget(const char* name, const char* starclass, const char* desc);
The arguments are the name of the target, the star class it supports, and the description text. The constructor hides the loopingLevel parameter inherited from the CGTarget class since the parallel scheduler does no looping as of now.
IntState nprocs;
This protected variable (or state) represents the number of processors. We can set this state, and also change the initial value, via the following public method:
void setTargets(int num);
After child targets are created, the number of child targets is stored in the following protected member:
int nChildrenAlloc;
There are three states, which are all protected, to choose a scheduling option.
IntState manualAssignment;
IntState oneStarOneProc;
IntState adjustSchedule;
If the first state is set to YES, we assign stars manually by setting the procId state of all stars. If oneStarOneProc is set to YES, the parallel scheduler puts all invocations of a star into the same processor. Note that if manual scheduling is chosen, oneStarOneProc is automatically set to YES. The last state, adjustSchedule, will be used to override the scheduling result manually. This feature has not been implemented yet. There are some public methods related to these states:
int assignManually();
int getOSOPreq();
int overrideSchedule();
void setOSOPreq(int i);
The first three methods query the current value of the states. The last method sets the current value of the oneStarOneProc state to the argument value.
There are two other states that are protected:
IntState sendTime;
IntState inheritProcessors;
The first state indicates the communication cost to send a unit sample between nearest-neighbor processors. If inheritProcessors is set to YES, we inherit the child targets from somewhere else by the following method.
int inheritChildTargets(Target* mtarget);
This is a public method to inherit child targets from the argument target. If the number of processors is greater than the number of child targets of mtarget, this method returns FALSE with an error message. Otherwise, it copies the pointers to the child targets of mtarget as its own child targets. If the number of processors is 1, we can use a single-processor target as the argument. In this case, the argument target becomes the child target of this target.
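The inheritance rule above can be sketched as follows. This is only an illustration under stated assumptions: SketchTarget and inheritChildTargetsSketch are hypothetical stand-ins, not Ptolemy classes, and taking the first nprocs children of mtarget is an assumption, since the text does not say which children are copied when mtarget has more than needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a target; NOT the Ptolemy Target class.
struct SketchTarget {
    std::vector<SketchTarget*> children;  // pointers to child targets (not owned)
    int nprocs = 1;                       // number of processors requested
    bool inherited = false;               // set once children are inherited
};

// Mirrors the described behaviour: fail when mtarget cannot supply
// enough children, otherwise copy child pointers (no deep copy).
bool inheritChildTargetsSketch(SketchTarget& self, SketchTarget* mtarget) {
    assert(mtarget != nullptr);
    if (self.nprocs == 1 && mtarget->children.empty()) {
        // Single-processor case: the argument itself becomes the only child.
        self.children.assign(1, mtarget);
        self.inherited = true;
        return true;
    }
    if (static_cast<std::size_t>(self.nprocs) > mtarget->children.size()) {
        return false;  // the real method also reports an error message
    }
    // Assumption: take the first nprocs children of mtarget.
    self.children.assign(mtarget->children.begin(),
                         mtarget->children.begin() + self.nprocs);
    self.inherited = true;
    return true;
}
```

Only pointers are copied, so in this sketch the children remain owned by the target they were inherited from.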
void enforceInheritance();
int inherited();
The first method sets the initial value of the inheritProcessors state while the second method gets the current value of the state.
void initState();
This redefined public method initializes the states and implements the precedence relation between states.
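The one precedence relation stated explicitly in this section (manual assignment forces oneStarOneProc to YES) can be sketched as below. SchedulingOptions and applyPrecedence are hypothetical names used for illustration; any other precedence rules that initState may implement are not covered here.

```cpp
// Hypothetical plain-data view of the three scheduling-option states.
struct SchedulingOptions {
    bool manualAssignment = false;
    bool oneStarOneProc = false;
    bool adjustSchedule = false;
};

// Applies the rule stated in the text: choosing manual assignment
// implies that all invocations of a star go to one processor.
SchedulingOptions applyPrecedence(SchedulingOptions opts) {
    if (opts.manualAssignment) {
        opts.oneStarOneProc = true;
    }
    return opts;
}
```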
virtual DataFlowStar* createSpread() = 0;
virtual DataFlowStar* createCollect() = 0;
virtual DataFlowStar* createReceive(int from, int to, int num) = 0;
virtual DataFlowStar* createSend(int from, int to, int num) = 0;
These are pure virtual methods to create the Spread, Collect, Receive, and Send stars that are required for sub-universe generation. The last two methods need three arguments to tell the source and the destination processors as well as the sample rate.
virtual void pairSendReceive(DataFlowStar* snd, DataFlowStar* rcv);
This method pairs a Send star, snd, and a Receive star, rcv. In this base class, it does nothing.
virtual IntArray* candidateProcs(ParProcessors* procs, DataFlowStar* s);
This method returns the array of candidate processors which can schedule the star s. The first argument is the current ParProcessors that tries to schedule the star. In this base class, the method does nothing and returns NULL.
virtual Profile* manualSchedule(int count);
This method is used when this target is inside a wormhole. This method determines the processor assignments of the Profile manually. The argument indicates the number of invocations of the wormhole.
virtual void saveCommPattern();
virtual void restoreCommPattern();
virtual void clearCommPattern();
These methods are used to manage the communication resources. This base class does nothing. The first method saves the current resource schedule, while the second method restores the saved schedule. The last method clears the resource schedule.
virtual int scheduleComm(ParNode* node, int when, int limit = 0);
This method schedules the argument communication node, node, which becomes available at time when. If the target cannot schedule the node by limit, the method returns -1. Otherwise, it returns the schedule time. In this base class, it just returns the second argument, when, indicating that the node is scheduled immediately after it becomes available; this models a fully-connected interconnection of processors.
virtual ParNode* backComm(ParNode* node);
For a given communication node, this method finds the communication node scheduled just before the argument node on the same communication resource. In this base class, it returns NULL.
virtual void prepareSchedule();
virtual void prepareCodeGen();
These two methods are called just before scheduling starts, and just before code generation starts, to do necessary tasks in the target class. They do nothing in this base class.
$PTOLEMY/src/domains/cg/targets directory. It has a constructor with three arguments, like its base class, MultiTarget.
To specify child targets, this class has the following three states.
StringArrayState childType;
StringArrayState resources;
IntArrayState relTimeScales;
The above states are all protected. The first state, childType, specifies the names of the child targets as a list of strings separated by spaces. If the number of strings is fewer than the number of processors specified by the nprocs parameter, the last entry of childType is extended to the remaining processors. For example, if we set nprocs equal to 4 and childType to "default-CG56[2] default-CG96", then the first two child targets become "default-CG56" and the next two child targets become "default-CG96".
The second state, resources, specifies special resources for child targets. If we say "0 XXX ; 3 YYY", the first child target (index 0) has the XXX resource and the fourth child (index 3) has the YYY resource. Here ';' is a delimiter. If a child target (index 0) already has a resources state, the XXX resource is appended to the end of that state. Note that we cannot edit the states of child targets in the current pigi. If a star needs a special resource, the star designer should define a resources StringArrayState in the definition of the star. For example, if a star S is created with resources = YYY, the star will be scheduled to the fourth child. One special resource is the target index: if the resources state of a star is set to "2", the star is scheduled to the third target (index 2).
The third state indicates the relative computing speed of the processors. The number of entries in this state should be equal to the number of entries in childType. Since we specify the execution time of a star as the number of cycles in the target for which the star is defined, we have to compensate for the relative cycle time of the processors in a heterogeneous target environment.
Once we specify the child targets, we select a scheduler with appropriate options. States inherited from class MultiTarget are used to select the appropriate scheduling options. In the CGMultiTarget class, we have the following three states, all protected, to choose a scheduler unless the manual scheduling option is taken.
IntState ignoreIPC;
IntState overlapComm;
IntState useCluster;
The first state indicates whether we want to ignore communication overhead in scheduling or not. If it says YES, we select Hu's level scheduler. If it says NO, we use the next state, overlapComm. If this state says YES, we use the dynamic level scheduler. If it says NO, we use the last state, useCluster. If it says YES, we use the declustering algorithm. If it says NO, we again use the dynamic level scheduler. By default, we use the dynamic level scheduler by setting all states to NO. Currently, we do not allow communication to be overlapped with computation. If more scheduling algorithms are implemented, we may need to introduce more parameters to choose those algorithms.
There are other states that are also protected.
StringState filePrefix;
Indicates the prefix of the file name generated for each processor. By default, it is set to "code_proc", thus creating code_proc0, code_proc1, etc. for the code files of child targets.
IntState ganttChart;
If this state says YES (default), we display the Gantt chart of the scheduling result.
StringState logFile;
Specifies the log file.
IntState amortizedComm;
If this state is set to YES, we provide the necessary facilities to packetize samples for communication to reduce the communication overhead. These facilities have been neither used nor tested yet.
Now, we discuss the three basic methods: setup, run, and wrapup.
void setup();
(1) Based on the states, we create child targets and set them up: prepareChildren.
virtual void prepareChildren();
This method is protected. If the children are inherited, it does nothing. Otherwise, it clears the list of current child targets if they exist. Then, it creates new child targets with the createChild method and gives each of them a unique name using filePrefix followed by the target index. This method also adjusts the resources parameter of child targets with the resources specified in this target: resourceInfo. Finally, it initializes all child targets.
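The resources format described earlier ("0 XXX ; 3 YYY") can be parsed along the following lines. This is a minimal sketch, not the Ptolemy implementation: parseResources is a hypothetical helper, and it assumes that each ';'-separated entry starts with a child index followed by one or more whitespace-separated resource names.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Parses a specification such as "0 XXX ; 3 YYY" into a map from
// child index to the resource names listed for that child.
// Entries without a leading integer index are skipped (an assumption).
std::map<int, std::vector<std::string>> parseResources(const std::string& spec) {
    std::map<int, std::vector<std::string>> result;
    std::istringstream entries(spec);
    std::string entry;
    while (std::getline(entries, entry, ';')) {
        std::istringstream fields(entry);
        int index;
        if (!(fields >> index)) continue;  // no leading child index
        std::string name;
        while (fields >> name) {
            // Append, so names already recorded for this child are kept.
            result[index].push_back(name);
        }
    }
    return result;
}
```

Appending rather than overwriting mirrors the statement that a resource is added at the end of a child's existing resources state.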
virtual Target* createChild(int index);
This protected method creates the child target at the given index, with its type determined by childType.
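How the childType list maps to one type per processor can be sketched as follows. expandChildTypes is a hypothetical helper, and the handling of the "name[k]" repetition form is inferred only from the example given earlier ("default-CG56[2] default-CG96" with four processors); the actual state parser may differ.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Expands a childType string into one target type name per processor.
// "name[k]" is treated as k copies of "name"; if the list is shorter than
// nprocs, the last entry is repeated for the remaining processors.
// std::stoi throws if the bracketed count is not a number.
std::vector<std::string> expandChildTypes(const std::string& spec, int nprocs) {
    std::vector<std::string> types;
    std::istringstream in(spec);
    std::string token;
    while (in >> token) {
        int repeat = 1;
        const std::size_t open = token.find('[');
        if (open != std::string::npos && token.back() == ']') {
            repeat = std::stoi(token.substr(open + 1, token.size() - open - 2));
            token.erase(open);  // keep only the name part
        }
        for (int i = 0; i < repeat; ++i) types.push_back(token);
    }
    if (!types.empty()) {
        const std::string last = types.back();  // copy before growing the vector
        while (static_cast<int>(types.size()) < nprocs) types.push_back(last);
    }
    return types;
}
```

With this sketch, a child at a given index would be created from the entry at that index of the expanded list.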
virtual void resourceInfo();
This method parses the resources state of this class and adjusts the resources parameter of child targets. If no resources parameter exists in a child target, it creates one.
(2) Choose a scheduler based on the states: chooseScheduler.
virtual void chooseScheduler();
This is a protected method to choose a scheduler based on the states related to scheduling algorithms.
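The decision order described for the ignoreIPC, overlapComm, and useCluster states can be written as a small function. SchedulerKind and chooseSchedulerSketch are hypothetical names; the sketch only encodes the order of the checks and does not construct any scheduler object.

```cpp
// Hypothetical labels for the three scheduling algorithms named in the text.
enum class SchedulerKind { HuLevel, DynamicLevel, Declustering };

// Encodes the documented order of checks:
// ignoreIPC first, then overlapComm, then useCluster.
SchedulerKind chooseSchedulerSketch(bool ignoreIPC, bool overlapComm,
                                    bool useCluster) {
    if (ignoreIPC) return SchedulerKind::HuLevel;
    if (overlapComm) return SchedulerKind::DynamicLevel;
    if (useCluster) return SchedulerKind::Declustering;
    return SchedulerKind::DynamicLevel;  // default: all three states are NO
}
```

Note that the earlier states take priority: when ignoreIPC is YES, the other two states have no effect on the choice.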
(3) If it is a heterogeneous target, we flatten the wormholes: flattenWorm.
To represent a universe for heterogeneous targets, we manually partition the stars using wormholes, which determine which stars are assigned to which target.
void flattenWorm();
This method flattens wormholes recursively if the wormholes have a code generation domain inside.
(4) Set up the scheduler object. Clear the myCode stream.
(5) Initialize the flattened galaxy, and perform the parallel scheduling: Target::setup.
(6) If the child targets are not inherited, display the Gantt chart if requested: writeSchedule.
void writeSchedule();
This public method displays a Gantt chart.
(7) If this target is inside a wormhole, it adjusts the sample rate of the wormhole ports (CGTarget::adjustSampleRates), generates code (generateCode), and downloads and runs code in the target (CGTarget::wormLoadCode).
void generateCode();
This is a redefined public method. If the number of processors is 1, it just calls generateCode of the child target and returns. Otherwise, we first set the stop time, or the number of iterations, for child targets (beginIteration). If the target is inside a wormhole, the stop time becomes -1, indicating an infinite loop. The next step is to generate wormhole interface code (wormInputCode, wormOutputCode) if the target is inside a wormhole. Finally, we generate code for all child targets (ParScheduler::compileRun). Note that we generate wormhole interface code before generating code for child targets since we cannot intervene in the code generation procedure of a child target once it has started.
void beginIteration(int repetitions, int depth);
void endIteration(int repetitions, int depth);
These are redefined protected methods. In the first method, we call setStopTime to set up the stop time of child targets. We do nothing in the second method.
void setStopTime(double val);
This method sets the stop time of the current target. If the child targets are not inherited, it also sets the stop time of the child targets.
void wormInputCode();
void wormOutputCode();
void wormInputCode(PortHole& p);
void wormOutputCode(PortHole& p);
These are all redefined public methods. The first two methods traverse the portholes of wormholes in the original graph, find all portholes in sub-universes matched to each wormhole porthole, and generate wormhole interface code for those portholes. The complication is that more than one ParNode is associated with a star, and these ParNodes may be assigned to several processors. The last two methods are used when the number of processors is 1, since we then use CGTarget::wormInputCode and CGTarget::wormOutputCode instead of the first two methods.
int run();
If this target does not lie in a wormhole or it has only one processor, we just use CGTarget::run to generate code. Otherwise, we transfer data samples to and from the target: sendWormData and receiveWormData.
int sendWormData();
int receiveWormData();
These are redefined protected methods. They send data samples to the current target and receive data samples from the current target. We traverse the wormhole portholes to identify all portholes in the sub-universes corresponding to them, and call sendWormData and receiveWormData for them.
void wrapup();
In this base class, we write code for each processor to a file.
ParProcessors* parProcs;
This is a pointer to the actual scheduling object associated with the current parallel scheduler.
IntArray canProcs;
This is an integer array used in candidateProcs to contain the list of processor indices.
virtual void resetResources();
This method clears the resources this target maintains, such as communication resources.
void updataRM(int from, int to);
This method updates a reachability matrix for communication amortization. A reachability matrix is created if amortizedComm is set to YES. We can packetize communication samples only when packetizing does not introduce deadlock in the graph. To detect the deadlock condition, we conceptually cluster the nodes assigned to the same processor. If the resulting graph is acyclic, we can packetize communication samples. Instead of clustering the graph, we set up the reachability matrix and update it at every send node. If there is a cycle of send nodes, a deadlock is possible.
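One way to picture the reachability-matrix idea is the sketch below, which keeps a transitive closure over processors and reports a cycle as soon as some processor can reach itself through a chain of sends. ReachabilitySketch and its members are hypothetical; the text does not describe the actual matrix layout or update rule, so this is an assumption about how such a matrix could be maintained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical processor-level reachability matrix.
// reach_[a][b] is true when data sent from processor a can reach
// processor b through one or more send nodes.
class ReachabilitySketch {
public:
    explicit ReachabilitySketch(std::size_t nprocs)
        : reach_(nprocs, std::vector<bool>(nprocs, false)) {}

    // Records a send from `from` to `to` and keeps the matrix
    // transitively closed: everything that reaches `from` now also
    // reaches `to` and everything reachable from `to`.
    void update(std::size_t from, std::size_t to) {
        const std::size_t n = reach_.size();
        for (std::size_t a = 0; a < n; ++a) {
            if (a != from && !reach_[a][from]) continue;
            for (std::size_t b = 0; b < n; ++b) {
                if (b == to || reach_[to][b]) reach_[a][b] = true;
            }
        }
    }

    bool reaches(std::size_t from, std::size_t to) const {
        return reach_[from][to];
    }

    // A processor that reaches itself indicates a cycle of sends,
    // which is the deadlock condition described in the text.
    bool hasCycle() const {
        for (std::size_t i = 0; i < reach_.size(); ++i) {
            if (reach_[i][i]) return true;
        }
        return false;
    }

private:
    std::vector<std::vector<bool>> reach_;
};
```

Maintaining the closure incrementally avoids building and searching the clustered graph each time a send node is added.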
The isA method is defined for type identification.
Block* makeNew() const;
Creates an object of the CGMultiTarget class.
int execTime(DataFlowStar* s, CGTarget* t);
This method returns the execution time of a star s if scheduled on the given target t. If the target does not support the star, a value of -1 is returned. If it is a heterogeneous target, we consider the relative time scale of the processors. If the second argument is NULL or it is a homogeneous multiprocessor target, the method just returns the execution time of the star in its definition.
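The cases above can be laid out as a short sketch. StarSketch, ChildSketch, and execTimeSketch are hypothetical, the integer type tags stand in for the real star/target compatibility check, and multiplying the cycle count by the child's relTimeScales entry is an assumption: the text says only that the relative time scale is considered.

```cpp
#include <cassert>

// Hypothetical star: cycle count on the target type it was written for.
struct StarSketch {
    int cycles;
    int targetType;  // stand-in tag for the supported target type
};

// Hypothetical child target with its relTimeScales entry.
struct ChildSketch {
    int targetType;
    int relTimeScale;
};

// Returns -1 for an unsupported star, the unscaled cycle count for a
// NULL or homogeneous target, and a scaled count otherwise.
int execTimeSketch(const StarSketch& s, const ChildSketch* t, bool heterogeneous) {
    assert(s.cycles >= 0);
    if (t == nullptr) return s.cycles;
    if (t->targetType != s.targetType) return -1;  // star not supported
    if (!heterogeneous) return s.cycles;
    return s.cycles * t->relTimeScale;  // assumed form of the compensation
}
```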
IntArray* candidateProcs(ParProcessors* par, DataFlowStar* s);
This method returns a pointer to an integer array of processor indices. We search for the processors that can schedule the argument star s by checking the star type and the resource requirements. We include at most one idle processor.
int commTime(int from, int to, int nSamples, int type);
This method returns the expected communication overhead when transferring nSamples data samples from processor from to processor to. If type = 2, this method returns the sum of the receiving and sending overheads.
int scheduleComm(ParNode* comm, int when, int limit = 0);
Since this class models a fully-connected multiprocessor, we can schedule a communication star at any time without resource conflict, so this method returns the second argument, when.
ParNode* backComm(ParNode* rcv);
This method returns the corresponding send node paired with the argument receive node, rcv. If the argument node is not a receive node, it returns NULL.
int amortize(int from, int to);
This method returns TRUE or FALSE, based on whether communication can be amortized between the two argument processors.
This class has an object to model the shared bus.
UniProcessor bus;
UniProcessor bestBus;
These are two protected members to save the current bus schedule and the best bus schedule obtained so far. The bus and bestBus are copied to each other by the following public methods.
void saveCommPattern();
void restoreCommPattern();
void clearCommPattern();
void resetResources();
Of the last two methods, the first (clearCommPattern) is a public method to clear the bus schedule, while the second (resetResources) is a protected method to clear both bus and bestBus.
This class redefines the following two public methods.
int scheduleComm(ParNode* node, int when, int limit = 0);
This method schedules the argument node, available at when, on bus. If we can schedule the node before limit, we schedule the node and return the schedule time. Otherwise, we return -1. If limit = 0, there is no limit on when to schedule the node.
ParNode* backComm(ParNode* node);
For a given communication node, this method finds another node scheduled just before the argument node on bus.
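The two redefined methods can be illustrated with a toy bus model that keeps a sorted list of busy intervals. BusSketch and BusSlot are hypothetical, and nodes are represented by integer ids and explicit durations rather than ParNode objects. The sketch also assumes that "before limit" means the start time must not exceed limit, which the text does not spell out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One reservation on the shared bus: [start, start + duration).
struct BusSlot {
    int node;      // hypothetical integer id standing in for a ParNode
    int start;
    int duration;
};

class BusSketch {
public:
    // Finds the earliest start >= when at which the bus is free for
    // `duration` time units. Returns -1 without reserving anything if
    // limit is nonzero and the start would exceed it (limit = 0: no limit).
    int scheduleComm(int node, int duration, int when, int limit = 0) {
        int start = when;
        std::size_t pos = 0;
        for (; pos < slots_.size(); ++pos) {
            const BusSlot& s = slots_[pos];
            if (start + duration <= s.start) break;  // fits in the gap before s
            start = std::max(start, s.start + s.duration);
        }
        if (limit != 0 && start > limit) return -1;
        slots_.insert(slots_.begin() + static_cast<std::ptrdiff_t>(pos),
                      BusSlot{node, start, duration});
        return start;
    }

    // Returns the id of the node scheduled just before `node` on the bus,
    // or -1 if `node` is first or is not scheduled at all.
    int backComm(int node) const {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i].node == node) {
                return i == 0 ? -1 : slots_[i - 1].node;
            }
        }
        return -1;
    }

private:
    std::vector<BusSlot> slots_;  // kept sorted by start, non-overlapping
};
```

Compared with the fully-connected model, which always returns when, a single shared resource forces later transfers to wait for earlier ones, and that is why the limit argument matters here.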