WorkflowTemplates

Bernd Schuller Maria Petrova

Workflow Templates

In many cases, complex scientific workflows are created once (by an expert user), and then executed many times with only a few changes to parameters or other input data. For setting parameters and executing such a workflow, a full workflow editor is not required. For non-expert users, a full workflow editor can even be hard to use, since it can be difficult to find the correct parameter or to select an input file.

This led to the idea of workflow templates, in which only a few parameters of a complex workflow are left editable. Executing a UNICORE workflow then becomes simple: the user loads the template, sets a few parameters, and the workflow is ready for submission. This is especially helpful for users who are less familiar with creating and executing workflows.

Workflow templates can be submitted from the UNICORE Portal (v2.2.0 or later) or UCC (v7.5.0 or later).

Here we show how to create your own template and then how to import and execute it in the Portal.

How to create your own workflow template

While it is possible to create a workflow from scratch, the usual way is to export it from the Rich Client (or Portal) and tweak it manually. To date, there is no full-featured automatic export of a workflow template.

1) Create your workflow in the workflow editor of the Rich Client or the Web Portal.

2) Export the created workflow to an XML file (in the Rich Client, use "Export to submittable workflow").

3) Open the file in a text editor.

4) Within the "Documentation" tag you can define an arbitrary number of parameters. Each parameter can have a name, a type, a description, a default value and other settings, as shown below.

Example with multiple parameters:

<Documentation xmlns:jsdl-u="http://www.unicore.eu/unicore/jsdl-extensions">
      <jsdl-u:Argument>
            <jsdl-u:Name>INPUT_FILE</jsdl-u:Name>
            <jsdl-u:ArgumentMetadata>
                   <jsdl-u:Type>filename</jsdl-u:Type>
                   <jsdl-u:Description>Input file</jsdl-u:Description>
            </jsdl-u:ArgumentMetadata>
      </jsdl-u:Argument>

     <jsdl-u:Argument>
            <jsdl-u:Name>TASK_SIZE</jsdl-u:Name>
            <jsdl-u:ArgumentMetadata>
                   <jsdl-u:Type>string</jsdl-u:Type>
                   <jsdl-u:Default>medium</jsdl-u:Default>
                   <jsdl-u:IsEnabled>true</jsdl-u:IsEnabled>
                   <jsdl-u:ValidValue>small</jsdl-u:ValidValue>
                   <jsdl-u:ValidValue>medium</jsdl-u:ValidValue>
                   <jsdl-u:ValidValue>large</jsdl-u:ValidValue>
            </jsdl-u:ArgumentMetadata>
     </jsdl-u:Argument>
</Documentation>

Note that the syntax is the same as for application metadata on the server; see https://unicore-dev.zam.kfa-juelich.de/documentation/unicorex-7.6.0/unicorex-manual.html#_application_metadata_simple
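To see at a glance which parameters a template declares, the "Documentation" block can be read with a few lines of Python. This is only a minimal sketch and not part of UNICORE: the function names read_parameters and is_valid and the layout of the returned dictionary are assumptions made for illustration. It relies only on the element names shown in the example above.

```python
import xml.etree.ElementTree as ET

# Namespace used by the parameter declarations in the example above.
JSDL_U = "http://www.unicore.eu/unicore/jsdl-extensions"
NS = {"u": JSDL_U}

SAMPLE = """<Documentation xmlns:jsdl-u="http://www.unicore.eu/unicore/jsdl-extensions">
  <jsdl-u:Argument>
    <jsdl-u:Name>INPUT_FILE</jsdl-u:Name>
    <jsdl-u:ArgumentMetadata>
      <jsdl-u:Type>filename</jsdl-u:Type>
      <jsdl-u:Description>Input file</jsdl-u:Description>
    </jsdl-u:ArgumentMetadata>
  </jsdl-u:Argument>
  <jsdl-u:Argument>
    <jsdl-u:Name>TASK_SIZE</jsdl-u:Name>
    <jsdl-u:ArgumentMetadata>
      <jsdl-u:Type>string</jsdl-u:Type>
      <jsdl-u:Default>medium</jsdl-u:Default>
      <jsdl-u:ValidValue>small</jsdl-u:ValidValue>
      <jsdl-u:ValidValue>medium</jsdl-u:ValidValue>
      <jsdl-u:ValidValue>large</jsdl-u:ValidValue>
    </jsdl-u:ArgumentMetadata>
  </jsdl-u:Argument>
</Documentation>"""


def read_parameters(documentation_xml):
    """Return {name: {"type", "default", "valid_values"}} for each Argument."""
    root = ET.fromstring(documentation_xml)
    params = {}
    for arg in root.iter("{%s}Argument" % JSDL_U):
        name = arg.findtext("u:Name", namespaces=NS)
        meta = arg.find("u:ArgumentMetadata", NS)
        if name is None or meta is None:
            continue  # skip incomplete declarations
        params[name] = {
            "type": meta.findtext("u:Type", namespaces=NS),
            "default": meta.findtext("u:Default", namespaces=NS),
            "valid_values": [v.text for v in meta.findall("u:ValidValue", NS)],
        }
    return params


def is_valid(params, name, value):
    """Accept any value unless the parameter lists explicit ValidValue entries."""
    allowed = params[name]["valid_values"]
    return not allowed or value in allowed


params = read_parameters(SAMPLE)
for name, info in params.items():
    print(name, info)
```

A check like is_valid can catch a mistyped value before the template is uploaded. Whether the Portal or UCC enforces the ValidValue list in the same way is not covered by this sketch.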

5) Once you have defined the parameters, you can use them in the XML file in the form ${parametername}. The value entered by the user is inserted automatically before submission.

Example of how to use the parameter values. The example shows a file parameter, but other parameters are used in the same way:

<jsdl:Application>
     <jsdl:ApplicationName>Python Script</jsdl:ApplicationName>
     <jsdl1:POSIXApplication>
             <jsdl1:Environment name="SOURCE">input.py</jsdl1:Environment>
     </jsdl1:POSIXApplication>
</jsdl:Application>
<jsdl:DataStaging>
       <jsdl:FileName>input.py</jsdl:FileName>
       <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
       <jsdl:Source>
               <jsdl:URI>${INPUT_FILE}</jsdl:URI>
       </jsdl:Source>
</jsdl:DataStaging>

Note that for a file parameter, the file name given in the <jsdl1:Environment name="SOURCE"> tag must be the same as the <jsdl:FileName> in the data staging section. The <jsdl:URI> tag is the one that receives the file path (the parameter value) at submission time.
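The substitution described in step 5 can be pictured as a plain text replacement. The following Python sketch is an illustration only, not the code used by the Portal or UCC. The function name fill_template is hypothetical, and two behaviours are assumptions: placeholders without a supplied value are left untouched, and inserted values are XML-escaped.

```python
import re
from xml.sax.saxutils import escape

# Matches placeholders of the form ${parametername}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def fill_template(template_text, values):
    """Replace each ${name} that has a value; leave all other placeholders as-is."""
    def replace(match):
        name = match.group(1)
        if name in values:
            # Escape &, < and > so the result stays well-formed XML.
            return escape(str(values[name]))
        return match.group(0)
    return PLACEHOLDER.sub(replace, template_text)


snippet = (
    "<jsdl:URI>${INPUT_FILE}</jsdl:URI> "
    "<jsdl:URI>c9m:${WORKFLOW_ID}/Python/stdout</jsdl:URI>"
)
user_values = {
    "INPUT_FILE": "BFT:https://localhost:8080/DEMO-SITE/services/"
                  "StorageManagement?res=user-Home#/infile",
}
filled = fill_template(snippet, user_values)
print(filled)
```

In this sketch only ${INPUT_FILE} is replaced. ${WORKFLOW_ID} has no user-supplied value, so it stays in the text unchanged.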

Note: When the input files of a workflow are stored locally on the Rich Client machine, the exported file does not contain data staging statements with a source URI for them, because the URC assumes that the files are local. Currently, you need to add the missing data staging statements for such input files manually.

There are three ways to handle these local input files in a workflow template.
1) If the input file is fixed and does not need to change between workflow runs, you can put it directly into the workflow. This works well for small files, but not for larger ones.

<jsdl:DataStaging>
       <jsdl:FileName>input.py</jsdl:FileName>
       <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
       <jsdl:Source>
               <jsdl:URI>inline://data</jsdl:URI>
       </jsdl:Source>
        <uc:InlineData xmlns:uc="http://www.unicore.eu/unicore/jsdl-extensions">
        <![CDATA[print("Hello World")
        ]]> 
        </uc:InlineData>
</jsdl:DataStaging>

2) Input data can be staged in from a fixed location (a UNICORE storage or another supported staging location).

<jsdl:DataStaging>
       <jsdl:FileName>input.py</jsdl:FileName>
       <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
       <jsdl:Source>
               <jsdl:URI>BFT:https://localhost:8080/DEMO-SITE/services/StorageManagement?res=user-Home#/infile</jsdl:URI>
       </jsdl:Source>
</jsdl:DataStaging>

3) The input file can be a parameter, letting the user choose it before submission, as shown above.
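Because an exported workflow can lack stage-in statements for local files, it may help to check which jobs still need one before choosing among the options above. The Python sketch below is a hypothetical helper, not a UNICORE tool. It assumes the JSDL namespaces used in the full template at the end of this page, and it treats a job as incomplete when the file named in its SOURCE environment entry has no DataStaging element with a Source.

```python
import xml.etree.ElementTree as ET

NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}

EXAMPLE = """<Workflow xmlns="http://www.chemomentum.org/workflow/simple"
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl1="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <Activity Id="A" Type="JSDL" Name="JSDL"><JSDL id="A"><jsdl:JobDescription>
    <jsdl:JobIdentification><jsdl:JobName>WithStaging</jsdl:JobName></jsdl:JobIdentification>
    <jsdl:Application><jsdl1:POSIXApplication>
      <jsdl1:Environment name="SOURCE">input.py</jsdl1:Environment>
    </jsdl1:POSIXApplication></jsdl:Application>
    <jsdl:DataStaging>
      <jsdl:FileName>input.py</jsdl:FileName>
      <jsdl:Source><jsdl:URI>${INPUT_FILE}</jsdl:URI></jsdl:Source>
    </jsdl:DataStaging>
  </jsdl:JobDescription></JSDL></Activity>
  <Activity Id="B" Type="JSDL" Name="JSDL"><JSDL id="B"><jsdl:JobDescription>
    <jsdl:JobIdentification><jsdl:JobName>NoStaging</jsdl:JobName></jsdl:JobIdentification>
    <jsdl:Application><jsdl1:POSIXApplication>
      <jsdl1:Environment name="SOURCE">input.sh</jsdl1:Environment>
    </jsdl1:POSIXApplication></jsdl:Application>
  </jsdl:JobDescription></JSDL></Activity>
</Workflow>"""


def missing_source_staging(workflow_xml):
    """Return the names of jobs whose SOURCE file has no stage-in entry."""
    root = ET.fromstring(workflow_xml)
    missing = []
    for job in root.iter("{%s}JobDescription" % NS["jsdl"]):
        job_name = job.findtext(
            "jsdl:JobIdentification/jsdl:JobName", default="?", namespaces=NS
        )
        source = job.findtext(
            ".//posix:Environment[@name='SOURCE']", namespaces=NS
        )
        # File names that are staged in, i.e. DataStaging entries with a Source.
        staged_in = {
            ds.findtext("jsdl:FileName", namespaces=NS)
            for ds in job.findall("jsdl:DataStaging", NS)
            if ds.find("jsdl:Source", NS) is not None
        }
        if source and source not in staged_in:
            missing.append(job_name)
    return missing


jobs_to_fix = missing_source_staging(EXAMPLE)
print(jobs_to_fix)
```

Each job reported this way needs one of the three kinds of data staging statement shown above.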

How to import and submit a workflow template in the Web Portal

1) Log in to the Portal and go to the “Create Job” view.
2) Select “Workflow Template” from the application combo box.
3) Upload the template file.
4) Fill in the values in the auto-generated parameter fields.
5) Submit as you would a normal job, except that the wizard asks you to choose a workflow engine as an intermediate step.
6) Fetch your results in the “My Workflows” view.

An example workflow template in XML

[Figure: original simple workflow in the URC (wf-URC.png)]

Template that allows setting the input file for the Python job and the VERBOSE parameter for the Bash job:

<?xml version="1.0" encoding="UTF-8"?>
<Workflow xmlns="http://www.chemomentum.org/workflow/simple" xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl" xmlns:jsdl1="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix" xmlns:grid="http://gpe.intel.com/gridbeans">
    <Documentation xmlns:jsdl-u="http://www.unicore.eu/unicore/jsdl-extensions">
      <jsdl-u:Argument>
        <jsdl-u:Name>Python_INPUT</jsdl-u:Name>
        <jsdl-u:ArgumentMetadata>
           <jsdl-u:Type>filename</jsdl-u:Type>
        </jsdl-u:ArgumentMetadata>
      </jsdl-u:Argument>
      <jsdl-u:Argument>
        <jsdl-u:Name>Bash_VERBOSE</jsdl-u:Name>
        <jsdl-u:ArgumentMetadata>
           <jsdl-u:Type>boolean</jsdl-u:Type>
        </jsdl-u:ArgumentMetadata>
      </jsdl-u:Argument>
    </Documentation> 
    <Activity Id="Start" Type="START" Name="START"/>
    <Activity Id="Bash" Type="JSDL" Name="JSDL">
        <Option name="SPLIT">false</Option>
        <Option name="IGNORE_FAILURE">false</Option>
        <JSDL id="Bash">
            <jsdl:JobDescription>
                <jsdl:JobIdentification>
                    <jsdl:JobName>Bash</jsdl:JobName>
                </jsdl:JobIdentification>
                <jsdl:Application>
                    <jsdl:ApplicationName>Bash shell</jsdl:ApplicationName>
                    <jsdl1:POSIXApplication>
                        <jsdl1:WorkingDirectory>$TEMPORARY_DIRECTORY</jsdl1:WorkingDirectory>
                        <jsdl1:Environment name="VERBOSE">${Bash_VERBOSE}</jsdl1:Environment>
                        <jsdl1:Environment name="SOURCE">input.sh</jsdl1:Environment>
                        <jsdl1:Environment name="INPUT"/>
                        <jsdl1:Environment name="STANDARD_OUT">stdout</jsdl1:Environment>
                        <jsdl1:Environment name="STANDARD_ERROR">stderr</jsdl1:Environment>
                    </jsdl1:POSIXApplication>
                </jsdl:Application>
                <jsdl:DataStaging>
                    <jsdl:FileName>input.sh</jsdl:FileName>
                    <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
                    <jsdl:Source>
                        <jsdl:URI>inline://foo</jsdl:URI>
                    </jsdl:Source>
                    <uc:InlineData xmlns:uc="http://www.unicore.eu/unicore/jsdl-extensions"><![CDATA[#!/bin/sh
                  hostname
                  ]]> 
                  </uc:InlineData>
                </jsdl:DataStaging>
                <jsdl:DataStaging>
                    <jsdl:FileName>stdout</jsdl:FileName>
                    <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
                    <jsdl:Target>
                        <jsdl:URI>c9m:${WORKFLOW_ID}/Bash/stdout</jsdl:URI>
                    </jsdl:Target>
                </jsdl:DataStaging>
                <jsdl:DataStaging>
                    <jsdl:FileName>stderr</jsdl:FileName>
                    <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
                    <jsdl:Target>
                        <jsdl:URI>c9m:${WORKFLOW_ID}/Bash/stderr</jsdl:URI>
                    </jsdl:Target>
                </jsdl:DataStaging>
            </jsdl:JobDescription>
        </JSDL>
    </Activity>
    <Activity Id="Python" Type="JSDL" Name="JSDL">
        <Option name="SPLIT">false</Option>
        <Option name="IGNORE_FAILURE">false</Option>
        <JSDL id="Python">
            <jsdl:JobDescription>
                <jsdl:JobIdentification>
                    <jsdl:JobName>Python</jsdl:JobName>
                </jsdl:JobIdentification>
                <jsdl:Application>
                    <jsdl:ApplicationName>Python Script</jsdl:ApplicationName>
                    <jsdl1:POSIXApplication>
                        <jsdl1:WorkingDirectory>$TEMPORARY_DIRECTORY</jsdl1:WorkingDirectory>
                        <jsdl1:Environment name="SOURCE">input</jsdl1:Environment>
                        <jsdl1:Environment name="INPUT"/>
                        <jsdl1:Environment name="STANDARD_OUT">stdout</jsdl1:Environment>
                        <jsdl1:Environment name="STANDARD_ERROR">stderr</jsdl1:Environment>
                    </jsdl1:POSIXApplication>
                </jsdl:Application>
                <jsdl:DataStaging>
                    <jsdl:FileName>input</jsdl:FileName>
                    <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
                    <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
                    <jsdl:Source>
                        <jsdl:URI>${Python_INPUT}</jsdl:URI>
                    </jsdl:Source>
                </jsdl:DataStaging>
                <jsdl:DataStaging>
                    <jsdl:FileName>stdout</jsdl:FileName>
                    <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
                    <jsdl:DeleteOnTermination>false</jsdl:DeleteOnTermination>
                    <jsdl:Target>
                        <jsdl:URI>c9m:${WORKFLOW_ID}/Python/stdout</jsdl:URI>
                    </jsdl:Target>
                </jsdl:DataStaging>
                <jsdl:DataStaging>
                    <jsdl:FileName>stderr</jsdl:FileName>
                    <jsdl:CreationFlag>dontOverwrite</jsdl:CreationFlag>
                    <jsdl:DeleteOnTermination>false</jsdl:DeleteOnTermination>
                    <jsdl:Target>
                        <jsdl:URI>c9m:${WORKFLOW_ID}/Python/stderr</jsdl:URI>
                    </jsdl:Target>
                </jsdl:DataStaging>
            </jsdl:JobDescription>
        </JSDL>
    </Activity>
    <Transition Id="1" From="Start" To="Python"/>
    <Transition Id="2" From="Python" To="Bash"/>
</Workflow>
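Since templates are edited by hand, a mistyped placeholder name is easy to introduce. The Python sketch below compares the ${...} placeholders used in a template with the parameter names declared in its "Documentation" block. It is a hypothetical helper, not a UNICORE tool. Treating WORKFLOW_ID as a predefined variable is an assumption based on its use in the listing above, and the shortened TEMPLATE string only stands in for a full workflow file.

```python
import re
import xml.etree.ElementTree as ET

JSDL_U = "http://www.unicore.eu/unicore/jsdl-extensions"
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

# Shortened stand-in for a template file; only the relevant parts are kept.
TEMPLATE = """<Workflow xmlns="http://www.chemomentum.org/workflow/simple"
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl1="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <Documentation xmlns:jsdl-u="http://www.unicore.eu/unicore/jsdl-extensions">
    <jsdl-u:Argument>
      <jsdl-u:Name>Python_INPUT</jsdl-u:Name>
      <jsdl-u:ArgumentMetadata><jsdl-u:Type>filename</jsdl-u:Type></jsdl-u:ArgumentMetadata>
    </jsdl-u:Argument>
    <jsdl-u:Argument>
      <jsdl-u:Name>Bash_VERBOSE</jsdl-u:Name>
      <jsdl-u:ArgumentMetadata><jsdl-u:Type>boolean</jsdl-u:Type></jsdl-u:ArgumentMetadata>
    </jsdl-u:Argument>
  </Documentation>
  <jsdl1:Environment name="VERBOSE">${Bash_VERBOSE}</jsdl1:Environment>
  <jsdl:URI>${Python_INPUT}</jsdl:URI>
  <jsdl:URI>c9m:${WORKFLOW_ID}/Python/stdout</jsdl:URI>
</Workflow>"""


def undeclared_placeholders(template_text, predefined=("WORKFLOW_ID",)):
    """Return placeholder names that are used but neither declared nor predefined."""
    root = ET.fromstring(template_text)
    declared = {name.text for name in root.iter("{%s}Name" % JSDL_U)}
    used = set(PLACEHOLDER.findall(template_text))
    return used - declared - set(predefined)


print(undeclared_placeholders(TEMPLATE))

# A mistyped placeholder is reported by name.
broken = TEMPLATE.replace("${Python_INPUT}", "${Python_INPT}")
print(undeclared_placeholders(broken))
```

An empty result means every placeholder has a matching declaration. It does not show that the template will be accepted by a workflow engine.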
