KAMon: A Kepler Module for Runtime Monitoring of Scientific Workflows

Faraaz Sareshwala

Citation
Faraaz Sareshwala. "KAMon: A Kepler Module for Runtime Monitoring of Scientific Workflows". Talk or presentation, 16 April 2009; Poster presented at the 8th Biennial Ptolemy Miniconference.

Abstract
The Kepler scientific workflow system allows scientists to quickly and easily assemble computational pipelines from predefined components (actors and subworkflows). Kepler workflows often involve computationally intensive steps (running, e.g., on a remote cluster) and may operate on many large datasets. However, Kepler currently offers only limited support for execution monitoring, which would be particularly helpful for such complex and long-running scientific workflows. For example, it is not obvious how to "animate" multi-threaded domains such as PN and COMAD (single-threaded domains such as SDF naturally allow animation). Similarly, the system gives no visible indication of whether a workflow has just begun or is about to end. As workflows and the datasets they operate on continue to grow in size and complexity, it becomes increasingly important to show the workflow user what is going on "behind the scenes". For long-running workflows, users assume the additional role of an "operator" and need to be able to identify performance bottlenecks and other trouble spots (e.g., runaway queue sizes), or simply to monitor general execution progress. As a first step towards that end, we present a prototype Kepler extension module, KAMon (Kepler Activity Monitor), which allows scientists to monitor runtime activity and execution progress. KAMon monitors certain key observables, e.g., firing duration and delay (time spent during or between firings, possibly inferred from token read/write operations), the total number of tokens processed (consumed or produced) on a port, and "token buildup" (i.e., the number of tokens in a queue waiting to be processed). Furthermore, we are able to calculate workflow progress under certain computational models and provide scientists with an estimated time to workflow completion.
KAMon has already proven useful in practice for locating and analyzing a problem with concurrently executing workflow branches in COMAD. KAMon employs and combines independently contributed code (e.g., for observing token flow and for displaying observables in a monitoring window and on the canvas), and packages these extensions using the new Kepler build and extension module system. In future work, we plan to improve and extend KAMon further, e.g., to include more (even user-defined) observables and to provide post-execution support for benchmarking and profiling of workflows, a feature that would be particularly useful for production workflows (i.e., those that are run repeatedly and routinely).
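
The observables named in the abstract can be illustrated with a minimal sketch. This is not KAMon's code or Kepler's API: the class `FiringStats` and the functions `token_buildup` and `estimate_remaining` are hypothetical names chosen for illustration, timestamps are plain numbers in seconds, and the time estimate assumes that the total number of tokens is known in advance and that each token takes about the same time to process, which may hold only under certain computational models.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FiringStats:
    """Firing duration and delay for one actor (hypothetical, illustrative)."""
    durations: List[float] = field(default_factory=list)  # time spent inside each firing
    delays: List[float] = field(default_factory=list)     # idle time between firings
    _started_at: Optional[float] = None
    _last_end: Optional[float] = None

    def firing_started(self, now: float) -> None:
        # The delay is the gap since the previous firing ended, if there was one.
        if self._last_end is not None:
            self.delays.append(now - self._last_end)
        self._started_at = now

    def firing_ended(self, now: float) -> None:
        if self._started_at is None:
            raise ValueError("firing_ended called before firing_started")
        self.durations.append(now - self._started_at)
        self._last_end = now
        self._started_at = None


def token_buildup(tokens_written: int, tokens_read: int) -> int:
    """Tokens still waiting in a queue: written by the producer, not yet read."""
    return tokens_written - tokens_read


def estimate_remaining(elapsed: float, tokens_done: int, tokens_total: int) -> Optional[float]:
    """Estimated seconds to completion, assuming a constant per-token cost.

    Returns None while no token has been processed, since no rate is known yet.
    """
    if tokens_done <= 0:
        return None
    seconds_per_token = elapsed / tokens_done
    return seconds_per_token * (tokens_total - tokens_done)
```

For instance, an actor that fires from t=0 to t=2 and again from t=3 to t=4.5 would record durations of 2.0 and 1.5 seconds and a delay of 1.0 second. A steadily growing `token_buildup` value on one queue is the kind of "runaway queue size" an operator would watch for.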

Citation formats

HTML

Faraaz Sareshwala. <a
href="http://chess.eecs.berkeley.edu/pubs/568.html"
><i>KAMon: A Kepler Module for Runtime Monitoring
of Scientific Workflows</i></a>, Talk or
presentation, 16 April 2009; Poster presented at the 8th
Biennial Ptolemy Miniconference.

Plain text

Faraaz Sareshwala. "KAMon: A Kepler Module for Runtime
Monitoring of Scientific Workflows". Talk or
presentation, 16 April 2009; Poster presented at the 8th
Biennial Ptolemy Miniconference.

BibTeX

@presentation{Sareshwala09_KAMonKeplerModuleForRuntimeMonitoringOfScientificWorkflows,
    author = {Faraaz Sareshwala},
    title = {KAMon: A Kepler Module for Runtime Monitoring of
              Scientific Workflows},
    day = {16},
    month = {April},
    year = {2009},
    note = {Poster presented at the 8th Biennial Ptolemy
              Miniconference},
    abstract = {The Kepler scientific workflow system allows
              scientists to quickly and easily assemble
              computational pipelines from predefined components
              (actors and subworkflows). Kepler workflows often
              involve computationally intensive steps (running,
              e.g., on a remote cluster), and may work over many
              large datasets. However, there is currently only
              limited support for execution monitoring in
              Kepler, which would be particularly helpful for
              such complex and long-running scientific
              workflows. For example, it is not obvious how to
              "animate" multi-threaded domains such as PN and
              COMAD (single-threaded domains such as SDF
              naturally allow animation). Similarly, the system
              gives no visible indication of whether a workflow
              has just begun or is about to end. As
              workflows and the datasets they operate on
              continue to increase in size and complexity, it
              also becomes increasingly important to provide the
              workflow user with information on what is going on
              "behind the scenes". For long-running workflows,
              users assume the additional role of an "operator"
              and need to be able to identify performance
              bottlenecks and other trouble-spots (e.g., runaway
              queue sizes), or to simply monitor general
              execution progress. As a first step towards that
              end, we present a prototype Kepler extension
              module, KAMon (Kepler Activity Monitor), which
              allows scientists to monitor runtime activity and
              execution progress. KAMon monitors
              certain key observables, e.g., firing duration and
              delay (time spent during or between firings,
              possibly inferred from token read/write
              operations), the total number of tokens processed
              (consumed or produced) on a port, and "token
              buildup" (i.e., number of tokens in a queue,
              waiting to be processed). Furthermore, we are able
              to calculate workflow progress under certain
              computational models and provide scientists with
              an estimated time to workflow completion. KAMon
              has already proven useful in practice to help with
              locating and analyzing a problem with concurrently
              executing workflow branches in COMAD. KAMon
              employs and combines independently contributed
              code (e.g., for observing token flow and for
              displaying observables in a monitoring window and
              on the canvas), and packages these extensions
              using the new Kepler build and extension module
              system. In future work, we plan to improve and
              extend KAMon further, e.g., to include more (even
              user-defined) observables and to provide
              post-execution support for benchmarking and
              profiling of workflows, a feature that would be
              useful in particular for production workflows
              (i.e., those that are run repeatedly and routinely)},
    URL = {http://chess.eecs.berkeley.edu/pubs/568.html}
}

Posted by Christopher Brooks on 17 Apr 2009.
Groups: ptolemy

Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright.