A workflow aims to accomplish a complex piece of work. In Astrogrid terms this would be an astronomical investigation. If the work could be done in one simple step (eg: by executing a single program from the command line), then it would hardly be a workflow. A prime dimension of a workflow is therefore that the objective will take more than one step to accomplish - it has a degree of complexity.
A simple example of a workflow might consist of the following steps:
Astrogrid tries to deal with the complexity of workflows. But it is not an easy thing to eliminate. The following tries to give the salient points.
We hold workflow designs as XML documents. This is a
particularly verbose way of describing an individual
workflow, but is convenient for its rigour. Examples are
shown
here
. There is no necessity to interact directly with a workflow
xml document and therefore there is no necessity to be
intimately equainted with it; the portal design tool hides
most of the intricacies. However, some exposure to it is
probably useful. Workflow structure is described by its XML
schema: every workflow document has to obey the rules
embodied in the schema. The details of the workflow schema
are described
here
, but come back to this later.
Workflow designs are held as files (as XML documents) and stored in MySpace for Astrogrid purposes. They can be copied or shared. At the end of the day they are just files.
A workflow must be designed by an astronomer with a particular goal in mind. The process of design may be anything from the straightforward to the immensely complex. Even a relatively simple workflow consisting of a small number of steps may involve an astronomer in an iterative process of design, running the workflow against data, examining the results for feedback, then refining the design. A good workflow may therefore be something of a capital investment, and worth sharing with others.
There are two aspects worth noting.
In designing a workflow, we need specialized information regarding each of the steps we design. We tend to refer to data that itself describes astronomical data, or data that describes the working of an astronomical tool, by the term metadata. Most often you hear the terms catalogue metadata and tools metadata. A workflow design contains metadata or references to metadata within it.
We embody metadata within a workflow in various forms:
When a workflow begins execution, we treat it as a job. A workflow becomes a job by being submitted to the Job Execution System. Each submission represents a separate job. If you submit one workflow fifty times, you have fifty jobs, and one workflow. We keep a separate record for each job, a fact which is reflected in the workflow schema. In effect, a job is a specialized workflow document, which contains not only the design (as it was when it was submitted) but run-time information about a specific execution down to the individual step level.
The Job Execution System is the part of Astrogrid that controls the execution of a workflow. Otherwise known as JES, it is Astrogrid's workflow engine. JES is an engine that can manage jobs consisting of multiple steps, where individual steps can be run on different computers across a network. JES will attempt to run steps in a controlled fashion (eg: in parallel or sequentially depending upon the workflow definition). Because steps can execute at different locations, both control flow and data flow have to be managed. A partner component, known as the Common Execution Controller (CEC), manages step execution and any data flow required.
All steps in Astrogrid workflow execute asynchronously. That means that a step dispatched by JES is done so on a non-blocking basis.
CEA stands for Commom Execution Architecture. It is our standard for formalizing the web-interface provided by all tools, no matter whether they are implemented as Java code, commandline applications, or another SOAP / HTTP-GET based web service. The set of CEA applications may be split into datacenters and processing tools . Datacenters support complex queries against astronomical archives, while processing tools consume files of data and reduce them in some way. Although these are two very different animals, both can be thought of as tools.
CEC implementations inform JES when a step has completed. This is because steps are dispatched on an asynchronous basis: there is no waiting for a step to finish once it has been dispatched.
If you want to consider things as units of work, the smallest unit of work undertaken in a workflow is embodied in a single step. So a job will contain all the units of work sucessfully completed by the steps that got executed by JES. However, usually the appearance of a final result's file (eg, a specific VOTable), or a set of such files , within MySpace will be indicative. At the end of the job, or even at strategic points within it, an email can be sent. Whether a job is a formal success or failure can be programmed into the workflow design and interpreted by JES. Certainly from this point of view you can definitely tell when a job has failed. It will tell whether things have gone wrong, but not necessarilly whether the results produced in a successful run are what were expected. (At present we also have some problems with steps that never return.)