From: Jonathan C. <jon...@cs...> - 2012-07-18 10:32:03
Hi Frank et al.,

I've been thinking about this and playing around with ideas for the last few weeks, trying to see how it fits with our protocol language ideas too. (Incidentally, our functional curation tool now supports most of SED-ML L1V1, and I should have the suggestions below implemented too in time for COMBINE.)

Regarding especially the two points for discussion, I think the best solution is actually to have three concrete subclasses of an AbstractTask: ModelTask, NestedTask (or RepeatedTask if that name is preferred), and CombinedTask. Frank's current RepeatedTask proposal combines aspects of all three of these concepts into one class, and I think things are clearer and easier to implement if they are separated.

* ModelTask provides the current L1V1 Task - it associates a model with a simulation algorithm. This would be the only task with a simulationReference and modelReference.

* NestedTask provides the ranges, changes, and a single subtask that gets run repeatedly. The option to iterate directly over a simulation by providing a simulationReference and modelReference can be expressed just as well by providing a ModelTask as the subtask, and this avoids conflating different behaviours in one class.

* CombinedTask allows tasks to be grouped: it has a listOfSubtasks, and a scheduling attribute that specifies whether they must be run in the order given (sequential) or whether order is immaterial and hence they could be parallelised (parallel). CombinedTask allows you to structure dependencies between tasks that are as complex as you like, by having CombinedTasks as subtasks of another CombinedTask. Personally I think it's also clearer and easier to implement than having either an order attribute or listing task dependencies.

Of course, this does change the Task class hierarchy. But since the L1V1 Task class still exists unchanged in my proposed hierarchy (it's just not at the top), I don't think that's too much of an issue even if we wanted this to go in L1Vx; for L2 I think achieving a clean (and hence more easily comprehensible) design should be the most important concern. A sketch of what this could look like follows.
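Purely as an illustration - the element and attribute names below (modelTask, nestedTask, combinedTask, scheduling, and so on) are just my guesses at a possible syntax, not anything agreed, and the details of ranges and changes are elided:

  <listOfTasks>

    <!-- ModelTask: the L1V1 Task unchanged - it pairs a model with a
         simulation, and is the only class carrying modelReference and
         simulationReference -->
    <modelTask id="toSteadyState" modelReference="model1"
               simulationReference="steadyStateSim" />
    <modelTask id="timecourse" modelReference="model1"
               simulationReference="timecourseSim" />

    <!-- NestedTask: ranges, changes, and a single subtask run repeatedly -->
    <nestedTask id="scanK1">
      <listOfRanges>
        <uniformRange id="k1_range" start="0.1" end="1.0" numberOfPoints="10" />
      </listOfRanges>
      <listOfChanges>
        <!-- setValue changes referencing k1_range would go here -->
      </listOfChanges>
      <subTask task="timecourse" />
    </nestedTask>

    <!-- CombinedTask: groups tasks; 'sequential' means run in the given
         order, 'parallel' means order is immaterial -->
    <combinedTask id="experiment" scheduling="sequential">
      <listOfSubtasks>
        <subTask task="toSteadyState" />
        <subTask task="scanK1" />
      </listOfSubtasks>
    </combinedTask>

  </listOfTasks>

The sequential CombinedTask here is essentially the "get the model to steady state first, then scan from that point" pattern I come back to under resetModel below.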
Now to more low-level details.

* The 'range' attribute on repeatedTask seems to be redundant and potentially confusing, especially as there can be multiple ranges. What's the behaviour if there are multiple unrelated ranges with different numbers of values? Presumably we should enforce that all ranges have the same number of values, and that they are all iterated over simultaneously, hence just providing different values for the benefit of changes. Since individual changes can specify the range(s) of interest, "blessing" one range doesn't seem necessary.

* There are several methods presented for referring to a range value from a setValue change. I prefer the way shown in the examples on page 5 of Frank's proposal, where a variable element is used to link the range id to a variable id for use in the math block. This seems most in line with existing SED-ML behaviour, and also allows a change to reference multiple ranges cleanly, which the approach of giving a range attribute on the setValue element doesn't.

* The same comment applies to the index attribute on a functionalRange: again you could be explicit about it by defining a variable. I can see the benefit of keeping the short-hand method for common cases, but suggest that it is explicitly defined as being a short-hand.

* With resetModel, is the intention that it resets to the initial conditions of the model, or to the model's state as it existed at the start of the task? I think from page 4 you envisage the former, but the latter seems more flexible to me (and not much harder to implement). It would allow you, for instance, to have a sequence of tasks in which the first gets the model to steady state, and the second does a repeated simulation from that point.

* I don't particularly like having a step attribute on oneStep - I'd much rather have a simulation class on which I can specify the desired end point to reach.

  o The main reason for this is that in moderately long-running simulations, the way in which you calculate time points can noticeably affect the results you get: if you use repeated addition of a time-step, accumulated floating point error can even mean you don't end up at the expected overall end time exactly, and have to take a very small or slightly larger final step. If instead you compute each output time point directly as n*dt, the points are computed more accurately and there are no issues with the final step (provided of course that dt divides the overall interval).

  o Another advantage of specifying the end point rather than the step is that you don't have the "step" information duplicated in both the range of the task and the step of the simulation. The task's range naturally provides the value up to which you want to solve in all the cases I can think of.

  o Specifying the end point based on the range value does require, however, that you can parameterise your simulation and/or task classes, and that setValue in a NestedTask can address these parameters as well as those in the model. For the latter point, we could extend the target attribute to allow values such as '#id' as well as XPath expressions; '#id' is not valid XPath, so there's no potential for confusion. Whereas an XPath expression would address a model variable/parameter, '#id' would select the protocol parameter with matching id. Specifying what parameters are available would require some more work, possibly building on Richard's proposal <http://sourceforge.net/tracker/?func=detail&aid=3391892&group_id=293618&atid=2532228>. There's a sketch of what I mean just after this list.

* I also have a question on which I'd like input from others with more experience: the oneStep proposal specifies that it "defines a simulation step of a deterministic simulation." Is there any reason why you can't run a stochastic simulation up to a desired end point? There is also a related use case that might be worth considering here: what if you're simulating a cell-cycle model and wish to run up until the model indicates cell division should occur? In this case you don't know what the ultimate end point will be, and you'll almost certainly want to resolve it to greater precision than a repeatedTask range or timecourseSimulation output granularity.
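To illustrate the parameterisation and '#id' points above (the parameter element on oneStep, the '#stopTime' target form, and all the other names are invented for this sketch, not agreed syntax):

  <!-- A oneStep-style simulation whose end point is a protocol parameter
       rather than a fixed step attribute -->
  <oneStep id="stepSim">
    <parameter id="stopTime" value="0" />
  </oneStep>

  <modelTask id="advance" modelReference="model1"
             simulationReference="stepSim" />

  <nestedTask id="course">
    <listOfRanges>
      <!-- each output point is a multiple of dt, not a repeated sum -->
      <uniformRange id="outputTimes" start="0" end="100" numberOfPoints="1000" />
    </listOfRanges>
    <listOfChanges>
      <!-- '#stopTime' is not valid XPath, so it unambiguously selects the
           protocol parameter rather than a model variable; the range
           attribute is used here as the short-hand form -->
      <setValue target="#stopTime" range="outputTimes" />
    </listOfChanges>
    <subTask task="advance" />
  </nestedTask>

Each iteration then just advances the model to the next value of outputTimes, so there is no separate step attribute that has to be kept consistent with the task's range.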
Best wishes,
Jonathan

On 10/06/2012 11:43, Frank T. Bergmann wrote:
> Hello together,
>
> I have modified my original proposal and added the suggestions made by
> Nicolas and the discussions at HARMONY 2012. The key changes:
>
> - instead of a NestedSimulation class a RepeatedTask is used
> - the SteadyState simulation class is back
>
> The proposal can be found here:
>
> http://identifiers.org/combine.specifications/sed-ml.proposal.nested-simulations.FB.version-3
>
> And an implementation is available with the SED-ML Web Tools:
>
> http://sysbioapps.dyndns.org/SED-ML_Web_Tools/
>
> where all examples are available to be run.
>
> Points for discussion:
> --------------------------
> - Stuart brought up the idea of having subTasks not specifying an order
>   attribute but instead describing the tasks they depend on, the idea being
>   that this would allow for a potential parallel execution of tasks.
>
> - Should we have further task subclasses or not?
>
> I look forward to your comments.
>
> Cheers
> Frank